I stripped out the <redirect /> tags and imported enwiki using xml2sql,
but none of the templates rendered correctly. For example, navigating
to /The_Matrix results in a page with lots of raw MediaWiki source like:

{{#if: |This {{#ifeq:||article|page}} is about . }}For {{#if:the
series|the series|other uses}}, see {{#if:The Matrix (franchise)|The
Matrix (franchise){{#ifeq:the setting|and| and {{#if:Matrix (fictional
universe)|Matrix (fictional

Any idea whether this is a known problem with xml2sql, or did something
get corrupted during my import?
I haven't yet tried importDump.php because it seems to be extremely
slow (it can only import a few pages per second).

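For reference, here is a minimal sketch of the kind of stripping step
described above. It is not the script from the Talk:Xml2sql page, just an
illustration in Python. It assumes the dump writes each <redirect /> tag on
a line of its own, and the function and file names are hypothetical.

```python
import bz2
import re
import sys

# Matches a line that holds nothing but a self-closing <redirect /> element.
# Assumption: the dump writes the tag on a line of its own.
REDIRECT_LINE = re.compile(r"^\s*<redirect\b[^>]*/>\s*$")


def strip_redirects(lines):
    """Yield every line except those consisting solely of a <redirect /> tag."""
    for line in lines:
        if not REDIRECT_LINE.match(line):
            yield line


def main(src_path, dst_path):
    # Stream through the compressed dump so it is never held fully in memory.
    with bz2.open(src_path, "rt", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(strip_redirects(src))


# Only run when called as a script with exactly two arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved under a hypothetical name such as strip_redirects.py, it would be
run as "python strip_redirects.py enwiki-20100130-pages-articles.xml.bz2
enwiki-noredirect.xml", and the uncompressed output could then be fed to
xml2sql. Only whole-line tags are dropped, so page text is left untouched.
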
Eric

On Fri, Feb 5, 2010 at 1:13 AM, Andrew Krizhanovsky
<[email protected]> wrote:
> Yes, it was safe in my case (import of Russian and English Wiktionary).
> See http://meta.wikimedia.org/wiki/Talk:Xml2sql
> for an example script or shell command to strip out the <redirect /> tags.
>
> -- Andrew.
>
> On Fri, Feb 5, 2010 at 6:38 AM, Eric Sun <[email protected]> wrote:
>> Would it be safe to strip out the <redirect /> tags from the xml and
>> reimport, or will that cause other problems?
>>
>> Thanks,
>> Eric
>>
>> On Thu, Feb 4, 2010 at 6:24 PM, Chad <[email protected]> wrote:
>>
>>> On Thu, Feb 4, 2010 at 9:12 PM, Eric Sun <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > I saw this thread back in October where someone was having trouble
>>> > importing the English Wikipedia XML dump:
>>> > http://lists.wikimedia.org/pipermail/wikitech-l/2009-October/045594.html
>>> > The thread back in October seemed to end without resolution, and the
>>> > tools still seem to be broken, so has anyone found a solution in the
>>> > meantime?
>>> >
>>> > I'm using mediawiki-1.15.1 and attempting to import
>>> > enwiki-20100130-pages-articles.xml.bz2.
>>> >
>>> > None of these options seem to work:
>>> > 1) importDump.php
>>> > fails by spewing "Warning: xml_parse(): Unable to call handler in_()
>>> > in ./includes/Import.php on line 437" repeatedly
>>> >
>>> > 2) xml2sql (http://meta.wikimedia.org/wiki/Xml2sql):
>>> > Fails with error:
>>> > xml2sql: parsing aborted at line 33 pos 16.
>>> > due to the new <redirect> tag introduced in the new dumps?
>>> >
>>> > 3) mwdumper (http://www.mediawiki.org/wiki/MWDumper):
>>> > The current XML is schema v0.4, but the documentation says MWDumper is for 0.3
>>> >
>>> > 4) mwimport (http://meta.wikimedia.org/wiki/Data_dumps/mwimport):
>>> > Fails immediately:
>>> > siteinfo: untested generator 'MediaWiki 1.16alpha-wmf', expect trouble ahead
>>> > page: expected closing tag in line 35
>>> >
>>> > Any tips?
>>> > Thanks!
>>> > Eric
>>> >
>>> > _______________________________________________
>>> > Wikitech-l mailing list
>>> > [email protected]
>>> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>> >
>>>
>>> Most of these errors are caused by the new(ish) <redirect /> tag
>>> within <page> elements. 0.4 is the correct version of the schema,
>>> but unfortunately the schema was updated and dumps were
>>> produced using it before the changes made it into a release.
>>>
>>> 1.15.1 cannot import pages with <redirect />; we should probably
>>> backport that support. We should also rewrite the importers so they
>>> do not barf terribly when they encounter an unknown element.
>>>
>>> -Chad
