https://bugzilla.wikimedia.org/show_bug.cgi?id=42396
Nemo <[email protected]> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |[email protected]
           See Also|        |https://bugzilla.wikimedia.org/show_bug.cgi?id=37459,
                   |        |https://bugzilla.wikimedia.org/show_bug.cgi?id=37754

--- Comment #1 from Nemo <[email protected]> ---
This bug is probably too general to be useful (perhaps transform into a tracking bug?), but as we have another equally general report let me copy it here:

----
Small update: I went through the language list at
https://github.com/mkroetzsch/wda/blob/master/includes/epTurtleFileWriter.py#L472
and added a number of TODOs to the most obvious problematic cases. Typical problems are:

* Malformed language codes ('tokipona')
* Correctly formed language codes without any official meaning (e.g., 'cbk-zam')
* Correctly formed codes with the wrong meaning (e.g., 'sr-ec': Serbian from Ecuador?!)
* Language codes with redundant information (e.g., 'kk-cyrl' should be the same as 'kk' according to IANA, but we have both)
* Use of macrolanguages instead of languages (e.g., "zh" is not "Mandarin" but just "Chinese"; I guess we mean Mandarin; less sure about Kurdish ...)
* Language codes with incomplete information (e.g., "sr" should be "sr-Cyrl" or "sr-Latn", both of which already exist; same for "zh" and "zh-Hans"/"zh-Hant", but also for "zh-HK" [is this simplified or traditional?]).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
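
As a rough illustration of the problem classes listed in the comment, the sketch below splits a code into subtags using a deliberately simplified shape (language, optional extlang, script, region, variants). It is an assumption-laden toy, not the check used in epTurtleFileWriter.py: the name `parse_tag` is hypothetical, only 2-3 letter primary language subtags are accepted (which is why 'tokipona' is rejected here), and nothing is looked up in the IANA registry, so it can only show how a code would be read, not whether that reading is meaningful.

```python
import re

# Simplified tag shape, lowercased input assumed:
#   language[-extlang][-script][-region][-variant...]
# Extensions, private-use subtags and longer language subtags are
# intentionally left out to keep the sketch short.
_TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<extlang>[a-z]{3}))?"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$"
)


def parse_tag(code):
    """Return the subtags of a code that fits the simplified shape, else None."""
    match = _TAG.match(code.lower())
    if match is None:
        return None
    # Keep only the subtags that are actually present.
    return {name: value for name, value in match.groupdict().items() if value}


if __name__ == "__main__":
    for code in ["tokipona", "cbk-zam", "sr-ec", "kk-cyrl", "zh-HK"]:
        print(code, "->", parse_tag(code))
```

Read this way, 'sr-ec' splits into language 'sr' plus a two-letter region subtag, which is how it ends up meaning "Serbian from Ecuador", and 'kk-cyrl' carries an explicit script subtag on top of 'kk'. Deciding whether a subtag is registered, redundant, or a macrolanguage would additionally require the IANA Language Subtag Registry, which this sketch does not consult.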
