Dear Nemo,

    While I wait for a more complete response, I am not sure I understand
what your last "No" (as in "No, we definitely can't") means. To clarify,
take the CLDR supplemental Language-Territory Information chart as an
example:
http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_information.html

    One can suggest an additional data point by submitting sourced numbers
for a geo-linguistic population, like this:
http://unicode.org/cldr/trac/newticket?&description=%3Cterritory%2c%20speaker%20population%20in%20territory%2c%20and%20references%3E&summary=Add%20territory%20to%20Traditional%20Chinese%20(zh_Hant)

    Wikipedia articles and Wikidata pages already contain many attempts to
provide more up-to-date and better-sourced data points. I see potential in
exchanging such data and curating it in Wikidata as a more detailed and
dynamic source than the CLDR.

    These data points would also help in curating traffic data. For one,
geo-linguistic population figures can be used to normalize traffic data for
further analysis (for example, geographic normalization by speaker
population). For another, they provide important reference data for the
development strategies and policies of the Wikipedia projects.
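
    As a minimal sketch of the kind of normalization I have in mind, the
snippet below divides page views by speaker population for each
(language, territory) pair. The function name, the dictionary layout, and
all numbers are hypothetical placeholders of mine, not real CLDR or
Wikimedia figures.

```python
# Hypothetical placeholder figures for illustration only;
# these are NOT real CLDR or Wikimedia numbers.
speakers = {
    ("zh_Hant", "TW"): 20_000_000,
    ("zh_Hant", "HK"): 6_000_000,
}
page_views = {
    ("zh_Hant", "TW"): 50_000_000,
    ("zh_Hant", "HK"): 12_000_000,
    ("zh_Hant", "MO"): 1_000_000,  # no population entry, so it is skipped
}


def views_per_speaker(page_views, speakers):
    """Return page views per speaker for each (language, territory) pair.

    Pairs with a missing or zero speaker population are left out,
    because no meaningful ratio can be computed for them.
    """
    result = {}
    for key, views in page_views.items():
        population = speakers.get(key)
        if population:
            result[key] = views / population
    return result


normalized = views_per_speaker(page_views, speakers)
for (language, territory), rate in sorted(normalized.items()):
    print(f"{language}-{territory}: {rate:.2f} views per speaker")
```

    With sourced population figures in place of the placeholders, the same
ratio would let traffic from territories of very different sizes be
compared on a per-speaker basis.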

Best,
han-teng liao





2014-05-18 16:23 GMT+08:00 Federico Leva (Nemo) <nemow...@gmail.com>:

> Thanks for your suggestions. Just some quick pointers below.
>
> h, 18/05/2014 08:26:
>
>> (I-A). Tabulate the data points in absolute numbers first, not
>> percentage numbers [...]
>>
>> (I-B). Include all language versions for the *editing traffic* report as
>> well. [...]
>>
>> (I-C). Provide static data objects in more accessible format (i.e. csv
>> and/or json). [...]
>>
>> (II-A).  Putting viewing traffic and editing traffic report on the same
>> page. [...]
>>
>> (II-B).  Organizing and archiving the traffic reports for historical
>> comparison. [...]
>>
>> (II-C). Provide dynamic data objects in more accessible format (i.e. csv
>> and/or json).
>>
>
> At least the first four are "just" changes in the WikiStats reports
> formatting; personally, I encourage you to submit patches:
> <https://git.wikimedia.org/summary/analytics%2Fwikistats.git> (should be
> the "squids" directory, but there is some ongoing refactoring of the repos).
>
> On archives and "history rewriting"/reports regeneration, see also
> https://bugzilla.wikimedia.org/show_bug.cgi?id=46198
>
>> [...] (III-B).  Smaller (i.e. more specific) geographic aggregate units.
>>
>> The country (geographic) information is often based on geo-IP databases,
>> and sometimes provincial and city-level data would be available.
>>
>
> http://lists.wikimedia.org/pipermail/wikitech-l/2014-April/075964.html
>
>  [...]
>>
>>
>> (I know that the Unicode Common Locale Data Repository (CLDR Version 25
>> <http://cldr.unicode.org/index/downloads/cldr-25>) provides
>> “language-territory”
>> <http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_information.html>
>> or “territory-language”
>> <http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html>
>> unit-based charts, but I believe that the Wikimedia projects can use and
>> build one better.)  [...]
>>
>
> No, we definitely can't, not alone. I've asked for help, please
> contribute:
> <https://www.mediawiki.org/wiki/Universal_Language_Selector/FAQ#How_does_Universal_Language_Selector_determine_which_languages_I_may_understand>.
>
>
> Nemo
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>