https://bugzilla.wikimedia.org/show_bug.cgi?id=27112
Summary: select of revisions for stub history files does not
explicitly order revisions
Product: XML Snapshots
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: Normal
Component: General
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected]
Blocks: 27110
In trunk, the query run by dumpFrom() in Export.php (used for generating stub
history files) is:
SELECT * FROM `page` INNER JOIN `revision` ON ((page_id=rev_page)) WHERE
page_id >= 1157 AND page_id < 1158 ORDER BY page_id ASC;
The ORDER BY covers only page_id, so the revisions within a page are not
explicitly ordered. As a result their order changes from one dump to another.
Example:
el.wiktionary dumps, page name υγεία, page id 1157: revid 1432 (timestamp
2005-02-27T15:34:30Z) appears either first among the revisions listed in the
stubs-meta-history file, because it has the earliest timestamp, or 4th,
because that is its position when revisions are sorted by revid.
The smallest revid for that page is actually 1153, with timestamp
2005-02-27T15:34:45Z.
In fact the order seems to vary arbitrarily depending on when the query is
run:
elwiktionary-20100401-stub-meta-history.xml.gz -- revid 1432 is first
elwiktionary-20100505-stub-meta-history.xml.gz -- revid 1432 is 4th, 1153 is
first
elwiktionary-20110123-stub-meta-history.xml.gz -- revid 1432 is first
Need to go through the code and make sure every such query has an explicit
order for revisions.
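
A minimal sketch of what an explicit ordering might look like, using Python's
sqlite3 module with an in-memory stand-in for the `page` and `revision` tables.
The schema is cut down to the columns needed here, the function name
stub_revisions is hypothetical, and revids 1200 and 1300 (with their
timestamps) are invented for illustration; only 1153 and 1432 come from the
example above. Using rev_id as the secondary sort key is an assumption
(rev_timestamp would be the other obvious candidate), and this is not the
actual Export.php change.

```python
import sqlite3

def stub_revisions(conn, start, end):
    """Return (page_id, rev_id) rows with an explicit per-page revision order."""
    query = (
        "SELECT page_id, rev_id FROM page "
        "INNER JOIN revision ON page_id = rev_page "
        "WHERE page_id >= ? AND page_id < ? "
        # The secondary key pins down the order of revisions within a page.
        "ORDER BY page_id ASC, rev_id ASC"
    )
    return conn.execute(query, (start, end)).fetchall()

# In-memory stand-in for the wiki database (simplified, assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.execute("INSERT INTO page VALUES (1157)")
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (1432, 1157, "20050227153430"),  # from the report
        (1300, 1157, "20050301000000"),  # hypothetical filler revision
        (1153, 1157, "20050227153445"),  # from the report
        (1200, 1157, "20050228000000"),  # hypothetical filler revision
    ],
)

rows = stub_revisions(conn, 1157, 1158)
print(rows)
```

With the secondary key in place, repeated runs return the revisions in the
same order regardless of how the rows happen to be stored or fetched.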
Also... need to find out why the bigger revid has the earlier timestamp (since
in theory revids are assigned in increasing order as revisions are created).
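
One possible starting point for that investigation is to list, per page, the
pairs of revisions where the larger revid carries the earlier timestamp. The
sketch below again uses an in-memory sqlite3 stand-in with a cut-down
`revision` table; the function name find_inversions is hypothetical,
timestamps are assumed to be stored as sortable 14-digit strings, and only the
two revisions quoted in the example are loaded.

```python
import sqlite3

def find_inversions(conn, page_id):
    """Return (lower_rev_id, higher_rev_id) pairs whose timestamps are reversed."""
    query = (
        "SELECT a.rev_id, b.rev_id FROM revision AS a "
        "INNER JOIN revision AS b ON a.rev_page = b.rev_page "
        "WHERE a.rev_page = ? "
        "AND a.rev_id < b.rev_id "
        # The lower revid has the later timestamp: an inversion.
        "AND a.rev_timestamp > b.rev_timestamp "
        "ORDER BY a.rev_id ASC, b.rev_id ASC"
    )
    return conn.execute(query, (page_id,)).fetchall()

# In-memory stand-in holding only the two revisions quoted in the report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (1153, 1157, "20050227153445"),
        (1432, 1157, "20050227153430"),
    ],
)

inversions = find_inversions(conn, 1157)
print(inversions)
```

The self-join is quadratic in the number of revisions per page, so on a real
wiki it would probably need to be restricted to a few pages at a time.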
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l