https://bugzilla.wikimedia.org/show_bug.cgi?id=52728

--- Comment #1 from Tim Starling <[email protected]> ---
The query would have been of the form:

SELECT user_name, user_editcount, MAX(rev_timestamp) as lastedit
FROM user, revision
WHERE user_id = rev_user
GROUP BY user_name

MAX() is optimised in various contexts, but I guess this isn't one of them.
EXPLAIN indicates that this query scans the entire revision table, which would
explain the long query times. This ungrouped query is efficient:

SELECT MAX(rev_timestamp) 
FROM revision,user 
WHERE rev_user=user_id AND user_name='Tim Starling';

So I guess it is the grouping that stops it from hitting a special-case
optimisation of MAX(). The traditional way to retrieve a row from the end of a
range, which does not rely on MAX() optimisations, is with ORDER BY and LIMIT:

SELECT rev_timestamp
FROM revision,user 
WHERE rev_user=user_id AND user_name='Tim Starling'
ORDER BY rev_timestamp DESC
LIMIT 1

But even if that was done, it would still be extremely inefficient and would
not work. All the special page wants to do is display statistics about
translators listed on e.g. [[Portal:Fr/translators]] for French translators,
and there are no such translator lists on Commons, so actually it is trying to
calculate statistics about nobody in order to display an empty table. In order
to display that empty table, it is fetching edit count statistics for all 3
million users on the wiki. Even if it managed to get the queries done, the
memcached set would fail due to the value size being larger than 1MB.

Obviously, it should do a single ungrouped query like the one above for each
translator, not for every user on the wiki.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to