Felipe Ortega wrote:
> Hello, all.
> 
> For (as yet) unknown reasons, the latest complete dump files
> (pages-meta-history.xml) in some languages are flawed. Certain revision items
> are missing info about rev_user. Even though there are only 3 or 4 of that
> kind, this is enough to break either the parsing process or the later SQL
> load into the DB.
> 
> So far, the last 3 dumps of DE Wikipedia and the 20090603 dump of FR
> Wikipedia have shown this error.
> 
> I have updated both WikiXRay parsers:
> http://meta.wikimedia.org/wiki/WikiXRay_parser
> http://meta.wikimedia.org/wiki/WikiXRay_parser_research
> 
> They now check whether each parsed revision item is complete before
> creating the SQL. If it's flawed, it's omitted and logged to an error file
> for later inspection.
> 
> Regards,
> 
> Felipe.

They're an effect of revdelete: the user info for those revisions has been
hidden, so it isn't exported. You can see that the affected items carry a
"deleted" parameter instead.
An example is available in the bug for pywikipediabot:
http://sourceforge.net/tracker/index.php?func=detail&aid=2790339&group_id=93107&atid=603138
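
For anyone writing their own parser, here is a minimal sketch of that kind of check. It is not the WikiXRay code. It assumes the hidden user shows up as a contributor element with a deleted attribute and no username or ip child. The sample XML and the split_revisions name are made up for illustration, and real dumps also declare an XML namespace that this sample leaves out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: revision 2 mimics a contributor hidden by revdelete.
SAMPLE = """<page>
  <title>Example</title>
  <revision>
    <id>1</id>
    <contributor><username>Alice</username><id>42</id></contributor>
  </revision>
  <revision>
    <id>2</id>
    <contributor deleted="deleted" />
  </revision>
  <revision>
    <id>3</id>
    <contributor><ip>192.0.2.1</ip></contributor>
  </revision>
</page>"""


def split_revisions(xml_text):
    """Return (complete, skipped).

    complete: (rev_id, user) tuples for revisions that carry user info.
    skipped:  ids of revisions whose user info is missing or hidden.
    """
    complete, skipped = [], []
    for rev in ET.fromstring(xml_text).iter("revision"):
        rev_id = int(rev.findtext("id"))
        contributor = rev.find("contributor")
        # Assumption: a hidden user is marked with a "deleted" attribute.
        hidden = contributor is None or contributor.get("deleted") is not None
        user = None
        if not hidden:
            user = contributor.findtext("username") or contributor.findtext("ip")
        if user is None:
            skipped.append(rev_id)  # would go to the error log, not into SQL
        else:
            complete.append((rev_id, user))
    return complete, skipped


complete, skipped = split_revisions(SAMPLE)
print(complete)
print(skipped)
```

Skipping those revisions keeps the SQL load from failing. Whether to drop them or to store them with a NULL rev_user depends on what the analysis needs.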


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
