OK, thank you guys. Now the reasons are clear :-). In any case, this forced an
improvement to the parser, so it's welcome anyway ;).

Best,

F.

--- On Mon, 15/6/09, Platonides <[email protected]> wrote:

> From: Platonides <[email protected]>
> Subject: Re: [Wikitech-l] Fixing problem with complete dumps in WikiXRay
> To: [email protected]
> Date: Monday, 15 June 2009 10:44
> Felipe Ortega wrote:
> > Hello, all.
> > 
> > For (yet) unknown reasons, the latest complete dump files
> > (pages-meta-history.xml) in some languages are flawed. Certain
> > revision items are missing info about rev_user. Even though there
> > are only 3 or 4 of that kind, this is enough to mess up either the
> > parsing process or the later SQL load into the DB.
> > 
> > So far, the last 3 dumps of DE Wikipedia and the 20090603 dump
> > of FR Wikipedia have presented this error.
> > 
> > I have updated both WikiXRay parsers:
> > http://meta.wikimedia.org/wiki/WikiXRay_parser
> > http://meta.wikimedia.org/wiki/WikiXRay_parser_research
> > 
> > They now probe whether the parsed revision item is complete or
> > not before creating the SQL. If it's flawed, it's omitted and
> > logged to an error file for later inspection.
> > 
> > Regards,
> > 
> > Felipe.
> 
> They're an effect of revdelete.
> You can see that the affected items carry a "deleted" parameter.
> An example is available in the bug for pywikipediabot:
> http://sourceforge.net/tracker/index.php?func=detail&aid=2790339&group_id=93107&atid=603138
> 
> 
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> 
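
For anyone who wants to try the same kind of check outside WikiXRay, here is a
minimal sketch of the idea discussed in this thread: test each revision for
contributor info before building any SQL, and set the flawed ones aside for
an error log. It is not the actual WikiXRay parser code. The function name
split_revisions and the sample XML are made up for illustration, the sample
omits the XML namespace that real dump files declare, and the
deleted="deleted" attribute is an assumption about how revdeleted items look
in the dump.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample. Real pages-meta-history.xml files
# declare an XML namespace and carry many more fields per revision.
SAMPLE = """<mediawiki>
  <page>
    <title>Example</title>
    <revision>
      <id>1</id>
      <timestamp>2009-06-01T10:00:00Z</timestamp>
      <contributor><username>Alice</username><id>42</id></contributor>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2009-06-02T11:00:00Z</timestamp>
      <contributor deleted="deleted" />
    </revision>
    <revision>
      <id>3</id>
      <timestamp>2009-06-03T12:00:00Z</timestamp>
      <contributor><ip>192.0.2.1</ip></contributor>
    </revision>
  </page>
</mediawiki>"""


def split_revisions(stream):
    """Return (complete, flawed) revisions read from a binary XML stream.

    complete holds (rev_id, user) tuples that are safe to turn into SQL;
    flawed holds the ids of revisions lacking contributor info, which a
    caller would write to an error file instead.
    """
    complete, flawed = [], []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "revision":
            continue
        rev_id = elem.findtext("id")
        contributor = elem.find("contributor")
        user = None
        if contributor is not None:
            # Registered users have <username>, anonymous edits have <ip>.
            user = contributor.findtext("username") or contributor.findtext("ip")
        if user is None:
            flawed.append(rev_id)
        else:
            complete.append((rev_id, user))
        elem.clear()  # free memory; full-history dumps are very large
    return complete, flawed


complete, flawed = split_revisions(io.BytesIO(SAMPLE.encode("utf-8")))
```

With the sample above, revision 2 ends up in flawed and the other two in
complete. Streaming with iterparse rather than loading the whole tree is a
deliberate choice here, since complete dumps do not fit comfortably in memory.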


      

