OK, thank you guys. Now the reasons are clear :-). In any case, this forced the parser improvement, so it's welcome anyway ;).
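
For anyone following the thread, here is a rough sketch of the kind of completeness check described in the quoted message below. It is not the actual WikiXRay parser code. The function name split_revisions, the sample XML and the "revdeleted"/"missing" labels are made up for illustration, and it assumes a revdeleted user shows up as a contributor element carrying a deleted attribute, as Platonides points out.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical two-revision sample; the second one mimics a revdeleted user.
SAMPLE = """<mediawiki>
  <page>
    <title>Example</title>
    <revision>
      <id>1</id>
      <timestamp>2009-06-01T10:00:00Z</timestamp>
      <contributor><username>Alice</username><id>7</id></contributor>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2009-06-02T11:00:00Z</timestamp>
      <contributor deleted="deleted" />
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}revision'."""
    return tag.rsplit("}", 1)[-1]


def split_revisions(xml_source):
    """Return (complete, flawed) lists of revisions from a dump stream.

    complete holds (rev_id, user) pairs that are safe to turn into SQL;
    flawed holds (rev_id, reason) pairs to write to an error file.
    """
    complete, flawed = [], []
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if local(elem.tag) != "revision":
            continue
        # Index only the direct children of <revision>.
        fields = {local(child.tag): child for child in elem}
        rev_node = fields.get("id")
        rev_id = rev_node.text if rev_node is not None else None
        contributor = fields.get("contributor")
        user = None
        if contributor is not None:
            names = {local(c.tag): c.text for c in contributor}
            user = names.get("username") or names.get("ip")
        if user:
            complete.append((rev_id, user))
        else:
            hidden = contributor is not None and contributor.get("deleted") is not None
            flawed.append((rev_id, "revdeleted" if hidden else "missing"))
        elem.clear()  # free memory; history dumps are very large
    return complete, flawed


if __name__ == "__main__":
    ok, bad = split_revisions(io.BytesIO(SAMPLE.encode("utf-8")))
    print("complete:", ok)
    print("flawed:", bad)
```

Keeping the flawed revisions in a separate list, instead of raising an error, lets the SQL load go on while the few odd items are saved for later inspection.
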
Best,
F.

--- On Mon, 15/6/09, Platonides <[email protected]> wrote:

> From: Platonides <[email protected]>
> Subject: Re: [Wikitech-l] Fixing problem with complete dumps in WikiXRay
> To: [email protected]
> Date: Monday, 15 June 2009, 10:44
>
> Felipe Ortega wrote:
> > Hello, all.
> >
> > For (yet) unknown reasons, the latest complete dump files
> > (pages-meta-history.xml) in some languages are flawed.
> > Certain revision items are missing info about rev_user. Even
> > though there are only 3 or 4 of that kind, this is enough to
> > mess up either the parsing process or the later SQL load
> > into the DB.
> >
> > So far, the last 3 dumps of DE Wikipedia and 20090603
> > from FR Wikipedia have presented this error.
> >
> > I have updated both WikiXRay parsers:
> > http://meta.wikimedia.org/wiki/WikiXRay_parser
> > http://meta.wikimedia.org/wiki/WikiXRay_parser_research
> >
> > They now probe whether the parsed revision item is
> > complete or not, before creating the SQL. If it's flawed,
> > it is omitted and logged into an error file for later
> > inspection.
> >
> > Regards,
> >
> > Felipe.
>
> They're an effect of revdelete.
> You can see how they have a parameter deleted.
> An example is available in the bug for pywikipediabot:
> http://sourceforge.net/tracker/index.php?func=detail&aid=2790339&group_id=93107&atid=603138
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
