I had a go at this, more or less, at
http://wikinewsreporter.wordpress.com/2014/06/30/determining-the-relative-quality-of-one-wikipedia-project-to-another-one-approach-with-english-spanish-catalan-galician-argonese-and-euskera-wikipedias/
using both internal and external criteria for determining quality.
(External here means what is considered good work on the topic under
outside, non-Wikipedia-specific definitions of quality.)

Sincerely,
Laura Hale


On Tue, Jul 8, 2014 at 12:06 PM, Han-Teng Liao (OII) <
han-teng.l...@oii.ox.ac.uk> wrote:

> Thanks Jane for the comments and suggestions.
>
> Correct me if I misread your comments/suggestions, Jane.
>
> (1) Did you suggest measurements that are observable *inside*
> Wikipedia/Wikimedia websites?
> (2) If so, does it mean that your suggestion of measuring the current
> state of a language version as "a combination of the state of its content
> and community" describes only the *internal* state of that version?
> (3) When you said "zero-state", did you mean the state where the number
> of articles in a given language version is zero?
>
> Your suggestions appear to me to deal with measuring the current state
> of a language version. The use of "zero-state" suggests that every
> language version starts on equal footing on the Wikipedia platform.
>
> However, my call for help focuses on the current state of things
> external to the Wikipedia platform. In this context, the term *baseline*
> suggests some languages are already *more equal* than others because of
> the availability of language users and content out there. Since Wikipedia
> depends on reliable published secondary sources, some languages are
> *expected* to be more developed than others. What I want to do is to
> come up with such *expectation values* so that researchers and community
> members can see which language versions perform better/worse than
> expected, in comparison to other languages.
>
> While I can agree that, on the Wikipedia platform, all languages may
> stand on equal ground when they start from zero, it is my contention that
> some languages are already *more equal* than others.
>
> In other words, I want to construct sensible baselines *against which* the
> development of language versions can be better understood. Such baselines
> should thus capture external factors that are likely to condition the
> development. Normalizing development metrics against such baselines can
> then control for these external factors, to see which language versions
> underperform even when the external availability of content and users is
> not an issue. It can also help to see which language versions outperform
> even when the external conditions are not that great.
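
One way the normalization described above could be sketched, assuming
(and this is only an assumption, not something proposed in the thread)
that the baseline is a log-log least-squares fit of a development metric
against a single external indicator. The function names and the toy
numbers are hypothetical and purely illustrative.

```python
import math

def fit_log_baseline(external, observed):
    """Fit log(observed) = a + b * log(external) by ordinary least squares."""
    xs = [math.log(v) for v in external]
    ys = [math.log(v) for v in observed]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def performance_ratio(a, b, external_value, observed_value):
    """Observed value divided by the value the fitted baseline expects."""
    expected = math.exp(a + b * math.log(external_value))
    return observed_value / expected

# Made-up numbers, NOT real data: (external indicator, article count).
toy = {
    "lang_a": (1_000_000, 10_000),
    "lang_b": (10_000_000, 100_000),
    "lang_c": (100_000_000, 400_000),
    "lang_d": (5_000_000, 120_000),
}
a, b = fit_log_baseline(
    [users for users, _ in toy.values()],
    [articles for _, articles in toy.values()],
)
# Ratio above 1: more developed than the baseline expects; below 1: less.
ratios = {
    lang: performance_ratio(a, b, users, articles)
    for lang, (users, articles) in toy.items()
}
```

A ratio is only meaningful relative to the other languages in the fit,
and a single indicator is almost certainly too crude; several external
indicators would call for a multiple regression instead.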
>
> Hence, I really appreciate your suggestions as potential indicators of the
> (internal) development state of a language version of Wikipedia, but they
> do not appear to capture factors that are external to Wikipedia.
>
> Best,
>
> 2014-07-08 10:09 GMT+01:00 Jane Darnell <jane...@gmail.com>:
>
>> Well as I see it, the state of any language version is a combination of
>> the state of its content and community. Going back to the zero-state, in
>> order to have permission to start a language version, there must be a "list
>> of 10,000 important topics" that has to be registered somewhere (sorry, no
>> idea where). This list for the English Wikipedia includes an entry for the
>> singer Michael Jackson, one of the many articles that get lots and lots of
>> page hits daily. Perhaps this is the case for all other languages in the
>> world (I have no idea), but I would assume one measurement going forward
>> from the zero-state would be the number of changes over time involving this
>> list in the specific language, such as
>> 1) The list itself (do these topics ever change?)
>> 2) The average number of edits and page views of those pages in the
>> specific language
>> 3) The average number of blue links per page on those pages in the
>> specific language
>> 4) The average number of editors *ever* contributing per page on those
>> pages in the specific language
>> 5) The average number of active editors contributing per page on those
>> pages in the specific language
>> ...
>>
>> Other important measurements could be the number of active editors over
>> all, the number of edits appearing in the recent changes list per
>> day/month/year, the number of pages created or deleted per day/month/year...
>>
>>
>> On Tue, Jul 8, 2014 at 9:27 AM, Han-Teng Liao (OII) <
>> han-teng.l...@oii.ox.ac.uk> wrote:
>>
>>> Dear all,
>>>
>>>      Your suggestions are needed on ways to construct some sensible
>>> baselines, most likely based on data sets *external* to Wikipedia
>>> projects, for the *expected* development of Wikipedia language versions.
>>>
>>>       Such baselines should ideally indicate, given the availability of
>>> language users and content (some numbers based on external data sets),
>>> how many articles/active users a certain language version should be
>>> expected to have.
>>>
>>>       As previous research has suggested that Wikipedia activities need
>>> mutually-reinforcing cycles of participation, content, and readership, it
>>> is expected that the development of a Wikipedia language version is
>>> conditioned by the availability of (digitally) literate users and (possibly
>>> digitized) content/sources.
>>>
>>>      So the assumption is:
>>>
>>> Wikipedia Activities = Some function of (available users and content)
>>>
>>>       For example, the major non-English writing languages in the world
>>> such as Arabic, Chinese, Spanish, etc., may have different numbers of
>>> Internet users and digital content. These numbers indicate the basis on
>>> which a Wikipedia language version can develop.
>>>
>>>       One practical use of this baseline measurement is to better
>>> categorize/curate activities across Wikipedia language versions. We can
>>> then better come up with expected values of Wikipedia development, and thus
>>> categorize language versions accordingly based on the *external conditions*
>>> of available/potential users and content.
>>>
>>>       Another use of this baseline measurement is to better compare the
>>> development of different language versions. It should help answer questions
>>> such as whether the Korean language version is *underdeveloped* on
>>> the Wikipedia platform when compared with a language version that enjoys
>>> a similar number of available/potential users and content.
>>>
>>>      The closest existing external baseline data is probably the number
>>> of language speakers. My hunch is that it is not good enough at taking into
>>> account the available/potential users and content, especially the
>>> digitally-ready ones.
>>>
>>>       So I welcome you to add to the following list any external
>>> indicators (and possibly data sources) that may help to construct such a
>>> baseline.
>>>
>>> ==Indicators==
>>>  * Internet users for each language (probably an approximate measurement
>>> based on CLDR Territory-Language information and ITU internet penetration
>>> rates).
>>>
>>> * Number of books published annually in different languages (suggested
>>> data sources? Does ISBN have a database or stat report on published
>>> languages?)
>>>
>>> * Number of web pages returned by major search engines on the queries of
>>> "Wikipedia" in different languages, excluding results from Wikimedia
>>> projects.
>>>
>>> * Number of scholarly publications across languages (suggested data
>>> sources?)
>>>
>>> * Number of major newspaper publications across languages (suggested
>>> data sources?)
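
The first indicator in the list above (Internet users per language) could
be approximated roughly as follows. This is a minimal sketch under a
strong assumption: that Internet penetration is the same for every
language group within a territory. The record layout and field names are
hypothetical, and the figures are placeholders, not real CLDR or ITU
data.

```python
from collections import defaultdict

def internet_users_by_language(territories):
    """Sum population * internet penetration * language share per language.

    Assumes uniform penetration across language groups inside a
    territory, which is a simplification.
    """
    totals = defaultdict(float)
    for territory in territories:
        online = territory["population"] * territory["internet_penetration"]
        for lang, share in territory["language_shares"].items():
            totals[lang] += online * share
    return dict(totals)

# Placeholder figures for illustration only. Shares within a territory
# may sum to more than 1 because people can use several languages.
toy_territories = [
    {
        "population": 1_000_000,
        "internet_penetration": 0.5,
        "language_shares": {"xx": 0.9, "yy": 0.2},
    },
    {
        "population": 2_000_000,
        "internet_penetration": 0.25,
        "language_shares": {"yy": 1.0},
    },
]
estimates = internet_users_by_language(toy_territories)
```

Because multilingual users are counted once per language, the
per-language totals should not be added up into a global user count.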
>>>
>>>
>>>     Please share your thoughts!
>>>
>>> --
>>> han-teng liao
>>>
>>> "[O]nce the Imperial Institute of France and the Royal Society of London
>>> begin to work together on a new encyclopaedia, it will take less than a
>>> year to achieve a lasting peace between France and England." - Henri
>>> Saint-Simon (1810)
>>>
>>> "A common ideology based on this Permanent World Encyclopaedia is a
>>> possible means, to some it seems the only means, of dissolving human
>>> conflict into unity." - H.G. Wells (1937)
>>>
>>> _______________________________________________
>>> Wiki-research-l mailing list
>>> Wiki-research-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>>
>>>


-- 
twitter: purplepopple