https://bugzilla.wikimedia.org/show_bug.cgi?id=23264

           Summary: Dumps twisted in several languages
           Product: Wikimedia
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: Normal
         Component: Downloads
        AssignedTo: [email protected]
        ReportedBy: [email protected]


Many dumps are corrupted in many languages.

While the dumps are syntactically correct, titles do not correspond to content.

e.g. "A mír na Zemi!" in the Czech wiki has the text of "Singapore" in the
dump. I've found this across all the languages, though it does not seem to
affect all articles (cswiki dump as of 20100411).
If you need more examples, I can provide them.



  <page>
    <title>A mír na Zemi!</title>
    <id>70749</id>
    <revision>
      <id>5178497</id>
      <timestamp>2010-04-03T22:56:32Z</timestamp>
      <contributor>
        <username>Chalupa</username>
        <id>3656</id>
      </contributor>
      <comment>obrázek z commons</comment>
      <text xml:space="preserve">{{Infobox stát|
    genitiv = Singapuru
  | úřední název = Republic of Singapore&lt;br /&gt;新加坡共和国&lt;br /&gt;Republik Singapura&lt;br /&gt;சிங்கப்பூர் குடியரசு
  | vlajka = Flag of Singapore.svg
  | článek o vlajce = Singapurská vlajka
  | znak =
  | mapa umístění = LocationSingapore.png
...
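
To look for more pages like the one excerpted above, one option is to stream
through a dump and print each title next to the start of its text, so that
mismatches can be spotted by eye. The following is only a minimal sketch: it
assumes the dump follows the <page>/<title>/<id>/<revision>/<text> layout
shown in the excerpt, and the names iter_pages, local and SAMPLE are
hypothetical helpers made up for this illustration, not part of any dump
tooling.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, page_id, text) for every <page> in a dump stream.

    Assumes the layout seen in the excerpt: <title> and <id> are direct
    children of <page>, and <text> sits inside <revision>.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = page_id = text = None
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                page_id = child.text  # page id, not the revision id
            elif name == "revision":
                for sub in child:
                    if local(sub.tag) == "text":
                        text = sub.text or ""
        yield title, page_id, text
        elem.clear()  # drop the parsed page to keep memory use low


# Tiny stand-in for a dump, modelled on the excerpt above (hypothetical data).
SAMPLE = """<mediawiki><page><title>A mír na Zemi!</title><id>70749</id>
<revision><id>5178497</id><text xml:space="preserve">{{Infobox stát|
genitiv = Singapuru</text></revision></page></mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE.encode("utf-8"))))
for title, page_id, text in pages:
    # Show the title beside the first line of the text for manual comparison.
    print(page_id, title, "->", text.splitlines()[0])
```

For a real dump, a file object opened with bz2.open(path, "rb") could be
passed to iter_pages instead of the in-memory sample. Deciding whether a
title and its text actually belong together is left to the reader here; the
sketch only puts the two side by side.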

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
