--- On Sun, 8 Mar 2009, O. O. <olson...@yahoo.com> wrote:

> I thought that pages-articles.xml.bz2 (i.e. the XML dump) contains
> the templates – but I did not find a way to install them separately.
> 

No, it only contains a dump of the current version of each article (involving 
the page, revision and text tables in the DB).
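If you want to verify that yourself, a rough check is to look for template pages in the dump. The filename below is just an example, and note that on non-English wikis the prefix is localized (e.g. "Predefinição:" on pt.wikipedia), so adjust the pattern:

```shell
# Rough check: does this dump ship any template pages (namespace 10)?
# Template titles start with "Template:" (or the local equivalent).
# DUMP is an assumed filename -- substitute your own download.
DUMP=ptwiki-latest-pages-articles.xml.bz2
bzcat "$DUMP" | grep -c '<title>Template:'
```

A count of zero suggests the templates simply are not in that dump.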

> 
> Another thing I noticed (with the Portuguese Wiki, which is a much
> smaller dump than the English Wiki) is that the number of pages
> imported by importDump.php and MWDumper differ, i.e. importDump.php
> imported many more pages than MWDumper. That is why I would have
> preferred to do this using importDump.php.
> 

On download.wikimedia.org/your_lang_here you can check how many pages were 
supposed to be included in each dump.
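As a rough cross-check (the filename and DB credentials below are placeholders), you can count the <page> elements in the dump and compare that with the page table after the import:

```shell
# Each article in a Wikimedia XML dump is one <page>...</page> block,
# with the opening tag on its own line, so counting matching lines
# gives the number of pages. DUMP is a placeholder filename.
DUMP=ptwiki-latest-pages-articles.xml.bz2
bzcat "$DUMP" | grep -c '<page>'

# Then compare with what actually got imported (placeholder credentials):
mysql -u wikiuser -p wikidb -e 'SELECT COUNT(*) FROM page;'
```

If the two numbers disagree, the importer dropped pages somewhere.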

You also have other parsers you may want to check (in my experience, my parser 
was slightly faster than mwdumper):
http://meta.wikimedia.org/wiki/WikiXRay_Python_parser

> 
> Also, in a previous post you mentioned taking care of the
> “secondary link tables”. How do I do that? Does “secondary links”
> refer to language links, external links, template links, image
> links, category links, page links, or something else?
> 

On the same downloads page you have a list of additional dumps in SQL 
format (compressed with gzip). I guess you may also want to import them 
(of course, you don't need a parser for those; they can be loaded 
directly into the DB).
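For example, loading the link-table dumps is just a matter of piping the decompressed SQL into MySQL. The filenames, user and database name below are placeholders; substitute whatever your wiki actually uses:

```shell
# The secondary link tables ship as gzipped SQL, so no XML parser is
# needed -- decompress each one and feed it straight to MySQL.
# All names here are placeholders for illustration.
for f in ptwiki-latest-pagelinks.sql.gz \
         ptwiki-latest-categorylinks.sql.gz \
         ptwiki-latest-templatelinks.sql.gz; do
  gunzip -c "$f" | mysql -u wikiuser -p wikidb
done
```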

Best,

F.

> Thanks for your patience
> 
> O.O.
> 
> 

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
