Sorry, now correctly cross-posted.
Emmanuel

-------- Original Message --------
Subject:        WMF XML dump title case problem
Date:   Sun, 26 Jun 2011 17:07:19 +0200
From:   Emmanuel Engelhart <[email protected]>
To:     Mailing list for Wikimedia CH <[email protected]>, 
[email protected]



Hi

Titles should be stored in the "page" table with the first letter uppercased:
http://en.wikipedia.org/wiki/Wikipedia:Naming_conventions_%28technical_restrictions%29#Lower_case_first_letter

Unfortunately, it seems that some XML dumps (and consequently the SQL
generated from them by mwdumper) contain titles whose first letter is
lowercase.

For example:
$ wget http://download.wikimedia.org/mywiktionary/20110617/mywiktionary-20110617-pages-articles.xml.bz2
$ bzip2 -d -c mywiktionary-20110617-pages-articles.xml.bz2 | grep "<title>" | grep tationery | more
<title>stationery</title>
<title>stationery shop</title>
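
To list every affected title rather than grepping for one known word, something like the following might work. It is only a rough sketch: lowercase_titles is a made-up helper name, it assumes each <title> element sits on its own line (as the grep above also does), and it only checks for an ASCII lowercase first character, so it ignores namespace prefixes and non-Latin scripts.

```shell
# Hypothetical helper: read dump XML on stdin and print the titles
# whose first character is an ASCII lowercase letter.
# Assumes one <title>...</title> element per line.
lowercase_titles() {
    # Keep only the text between <title> and </title> ...
    sed -n 's:.*<title>\([^<]*\)</title>.*:\1:p' |
        # ... then keep the titles that start with a lowercase letter.
        grep '^[[:lower:]]'
}

# Possible usage with the dump downloaded above:
#   bzip2 -d -c mywiktionary-20110617-pages-articles.xml.bz2 | lowercase_titles | more
```

Piping the result through "wc -l" instead of "more" would give a count of affected titles.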

Is that a bug?

Regards
Emmanuel


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
