I second this!
By the way: what is the status of the problem with the missing history
dumps? (The latest one available is from November 2014.)
Lukas
On Thu, 26.02.2015 at 14:52, Markus Kroetzsch wrote:
> Hi,
>
> It's that time of the year again when I am sending a reminder that we
> still have broken JSON in the dump files ;-). As usual, the problem is
> that empty maps {} are serialized wrongly as empty lists []. I am not
> sure if there is any open bug that tracks this, so I am sending an
> email. There was one, but it was closed [1].
>
> As you know (I had sent an email a while ago), there are some remaining
> problems of this kind in the JSON dump, and also in the live exported
> JSON, e.g.,
>
> https://www.wikidata.org/wiki/Special:EntityData/Q4383128.json
> (uses [] as a value for snaks: this item has a reference with an empty
> list of snaks, which is an error by itself)
>
> However, the situation is considerably worse in the XML dumps, which
> have seen less usage since we have JSON, but as it turns out are still
> preferred by some users. Surprisingly (to me), the JSON content in the
> XML dumps is still not the same as in the JSON dumps. A large part of
> the records in the XML dump is broken because of the map-vs-list issue.
>
> For example, the latest dump of current revisions [2] has countless
> instances of the problem. The first is in the item Q3261 (empty list for
> claims), but you can easily find more by grepping for things like
>
> "claims":[]
>
> It seems that all empty maps are serialized wrongly in this dump
> (aliases, descriptions, claims, ...). In contrast, the site's export
> simply omits the key of empty maps entirely, see
>
> https://www.wikidata.org/wiki/Special:EntityData/Q3261.json
>
> The JSON in the JSON dumps matches the site's export in this respect.
>
> Cheers,
>
> Markus
>
>
> [1] https://github.com/wmde/WikibaseDataModelSerialization/issues/77
> [2] http://dumps.wikimedia.org/wikidatawiki/20150207/wikidatawiki-20150207-pages-meta-current.xml.bz2
>
>
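Until the serialization is fixed, consumers of the affected dumps could
work around the map-vs-list issue on their side. Below is a minimal
sketch of such a workaround, assuming the keys that should hold maps are
known in advance. The names MAP_KEYS and normalize_empty_maps are
hypothetical and not part of any Wikibase library. Of the keys listed,
only "aliases", "descriptions", "claims" and "snaks" are named in the
message above; the others are an assumption to check against the data
model. The sample record is made up for illustration.

```python
import json

# Keys whose values are expected to be JSON objects (maps).
# "aliases", "descriptions", "claims" and "snaks" come from the message above;
# "labels", "sitelinks" and "qualifiers" are an assumption.
MAP_KEYS = {
    "labels", "descriptions", "aliases", "claims",
    "sitelinks", "qualifiers", "snaks",
}


def normalize_empty_maps(node):
    """Return a copy of node in which [] under a key in MAP_KEYS becomes {}."""
    if isinstance(node, dict):
        return {
            key: {} if key in MAP_KEYS and value == [] else normalize_empty_maps(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize_empty_maps(item) for item in node]
    return node


# Made-up record showing the broken serialization (not copied from a dump).
broken = json.loads('{"id": "Q0", "aliases": [], "claims": []}')
fixed = normalize_empty_maps(broken)
print(json.dumps(fixed, sort_keys=True))
```

Lists under other keys (for example "snaks-order") are left untouched,
so only the keys in MAP_KEYS are rewritten.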
_______________________________________________ Wikidata-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-l
