I am still able to import the dumps using the old MWDumper (modified to fix
the contributor), and xml2sql also works and is quite fast. I think
importDump.php continues after it breaks.

bilal
--
Verily, with hardship comes ease.


On Thu, Feb 4, 2010 at 9:24 PM, Chad <[email protected]> wrote:

> On Thu, Feb 4, 2010 at 9:12 PM, Eric Sun <[email protected]> wrote:
> > Hi,
> >
> > I saw this thread back in October where someone was having trouble
> > importing the English Wikipedia XML dump:
> > http://lists.wikimedia.org/pipermail/wikitech-l/2009-October/045594.html
> > The thread back in October seemed to end without resolution, and the
> > tools still seem to be broken, so has anyone found a solution in the
> > meantime?
> >
> > I'm using mediawiki-1.15.1 and attempting to import
> > enwiki-20100130-pages-articles.xml.bz2.
> >
> > None of these options seem to work:
> > 1) importDump.php
> > fails by spewing "Warning: xml_parse(): Unable to call handler in_()
> > in ./includes/Import.php on line 437" repeatedly
> >
> > 2) xml2sql (http://meta.wikimedia.org/wiki/Xml2sql):
> > Fails with error:
> > xml2sql: parsing aborted at line 33 pos 16.
> > due to the new <redirect> tag introduced in the new dumps?
> >
> > 3) mwdumper (http://www.mediawiki.org/wiki/MWDumper):
> > Current XML is schema v0.4, but the documentation says that it's for 0.3
> >
> > 4) mwimport (http://meta.wikimedia.org/wiki/Data_dumps/mwimport):
> > Fails immediately:
> > siteinfo: untested generator 'MediaWiki 1.16alpha-wmf', expect trouble ahead
> > page: expected closing tag in line 35
> >
> > Any tips?
> > Thanks!
> > Eric
> >
> > _______________________________________________
> > Wikitech-l mailing list
> > [email protected]
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
>
> Most of these errors are caused by the new(ish) <redirect /> tag
> within <page> elements. 0.4 is the correct version of the schema,
> but unfortunately the schema was updated and dumps were
> produced using it before the changes made it into a release.
>
> 1.15.1 cannot import pages with <redirect />; we should probably
> backport that. We should also rewrite the importers so they do not
> barf terribly when they encounter an unknown element.
>
> -Chad
>
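
Since the failures above are attributed to the <redirect /> element inside
<page>, one possible workaround is to strip that element from the dump before
handing it to an older importer. The following is only a minimal sketch and
has not been run against the actual enwiki dump. It assumes that each
<redirect /> element sits alone on its own line and that dropping it is
harmless to the importer; the function name strip_redirect_tags and the script
layout are hypothetical.

```python
import bz2
import re
import sys

# Assumption: every <redirect /> element occupies a line by itself.
REDIRECT_LINE = re.compile(r"^\s*<redirect\b[^>]*/>\s*$")


def strip_redirect_tags(lines):
    """Yield each line except those consisting solely of a <redirect /> element."""
    for line in lines:
        if not REDIRECT_LINE.match(line):
            yield line


def main(path):
    # Stream the compressed dump line by line so it never has to fit in memory.
    with bz2.open(path, "rt", encoding="utf-8") as dump:
        for line in strip_redirect_tags(dump):
            sys.stdout.write(line)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

The filtered XML is written to standard output, so it can be redirected to a
file or piped to whichever importer is being used. A line-based filter is
chosen here only because it is simple and streams well; it will miss a
<redirect /> element that shares a line with other markup.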