Felipe Ortega wrote:
> Hello, all.
>
> For (yet) unknown reasons, the latest complete dump files
> (pages-meta-history.xml) in some languages are flawed. Certain revision
> items are missing info about rev_user. Even though there are only 3 or 4
> of that kind, this is enough to mess up either the parsing process or the
> later SQL load into the DB.
>
> So far, the last 3 dumps of DE Wikipedia and 20090603 from FR Wikipedia
> have presented this error.
>
> I have updated both WikiXRay parsers:
> http://meta.wikimedia.org/wiki/WikiXRay_parser
> http://meta.wikimedia.org/wiki/WikiXRay_parser_research
>
> They now probe whether the parsed revision item is complete or not, before
> creating the SQL. If it's flawed, it's omitted and logged into an error
> file for later inspection.
>
> Regards,
>
> Felipe.

They're an effect of revdelete, not a fault in the dump. You can see that
the affected items carry a "deleted" attribute in place of the usual
content, which is why rev_user comes out empty.

An example is available in the bug for pywikipediabot:
http://sourceforge.net/tracker/index.php?func=detail&aid=2790339&group_id=93107&atid=603138
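
For illustration, here is a minimal sketch of the kind of completeness check
described in the quoted message. It is not the WikiXRay code. The sample XML
is made up, the function name split_revisions is hypothetical, and the XML
namespace that real dumps declare on the root element is left out to keep
the example short. It assumes a revdeleted user shows up as a contributor
element with deleted="deleted" and no username or ip child.

```python
import xml.etree.ElementTree as ET

# Made-up sample page; real dumps declare an XML namespace, omitted here.
SAMPLE = """<page>
  <title>Example</title>
  <revision>
    <id>1</id>
    <contributor><username>Alice</username><id>42</id></contributor>
  </revision>
  <revision>
    <id>2</id>
    <contributor deleted="deleted" />
  </revision>
  <revision>
    <id>3</id>
    <contributor><ip>192.0.2.1</ip></contributor>
  </revision>
</page>"""


def split_revisions(page_xml):
    """Return (complete, flawed): complete holds (rev_id, user) pairs,
    flawed holds the ids of revisions with no usable contributor."""
    complete, flawed = [], []
    page = ET.fromstring(page_xml)
    for rev in page.findall("revision"):
        rev_id = rev.findtext("id")
        contrib = rev.find("contributor")
        # Assumption: revdelete leaves <contributor deleted="deleted" />.
        if contrib is None or contrib.get("deleted") == "deleted":
            flawed.append(rev_id)
            continue
        # Registered users have <username>, anonymous edits have <ip>.
        user = contrib.findtext("username") or contrib.findtext("ip")
        if user is None:
            flawed.append(rev_id)
            continue
        complete.append((rev_id, user))
    return complete, flawed


if __name__ == "__main__":
    ok, bad = split_revisions(SAMPLE)
    print("complete:", ok)
    print("flawed (log for later inspection):", bad)
```

Checking the attribute explicitly, rather than only testing for a missing
user name, lets the error log tell revdeleted revisions apart from items
that are broken for some other reason.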
