Hoi,
At the WMF language committee, the question of whether a language is viable
for a Wikimedia project is a practical one. It is also very much a political
one.
One vitally important difference from your approach is that we distinguish
between a first project in a language and a subsequent project. In the
latest iteration of our approach we do not consider Wikidata a first
project. The relevance is that adding a language to Wikidata does not
require localisation of MediaWiki or an Incubator stage.
When the question is what it takes for a new project to work, the simple
answer is "a few good men". A few projects are alive and well while relying
on no more than three people.
Because we do not focus on Wikipedia, a Wikisource can become the first
project. When this is what those "few good men" want, it is their party.
You may imagine that we thought about the likely success factors for a new
project, and we came up with ideas similar to yours. The problem is that it
does not help: you can estimate the likelihood of success, but that does
not guarantee it.
What we certainly do not consider is the number of data sources. Sourcing
is very much a luxury in starting projects. Insisting on sourcing at all
will kill most initiatives immediately. What is important is that people
start writing and reading in their language. With a Wikipedia that gets
active participation and readership, there will be a move towards a more
consistent orthography. In the end, those who write decide.
Wikidata was given its exception because it represents the lowest level of
participation with the most effect. Add one label to an item that is used a
lot (e.g. human, male, female) and it can be used thousands of times. It is
also an obvious place to re-use dictionary information to make an impact.
Thanks,
GerardM
On 8 July 2014 09:27, Han-Teng Liao (OII) <[email protected]>
wrote:
> Dear all,
>
> Your suggestions are needed on the ways in which one can construct
> some sensible baselines, most likely based on data sets *external* to
> Wikipedia projects, of *expected* Wikipedia language versions development.
>
> Such baselines should ideally indicate, given the availability of
> language users and content (some numbers based on external data sets), a
> certain language version should have expected number of articles/active
> users.
>
> As previous research has suggested that Wikipedia activities need
> mutually-reinforcing cycles of participation, content, and readership, it
> is expected that the development of a Wikipedia language version is
> conditioned by the availability of (digitally) literate users and (possibly
> digitized) content/sources.
>
> So the assumption is:
>
> Wikipedia Activities = Some function of (available users and content)
>
> For example, the major non-English writing languages in the world
> such as Arabic, Chinese, Spanish, etc., may have different numbers of
> Internet users and digital content. These numbers indicate the basis on
> which a Wikipedia language version can develop.
>
> One practical use of this baseline measurement is to better
> categorize/curate activities across Wikipedia language versions. We can
> then better come up with expected values of Wikipedia development, and thus
> categorize language versions accordingly based on the *external conditions*
> of available/potential users and content.
>
> Another use of this baseline measurement is to better compare the
> development of different language versions. It should help answer questions
> such as (1) whether Korean language version is *underdeveloped* on
> Wikipedia platforms when compared with a language version that enjoys
> similar number of available/potential users and content.
>
> The current similar external baseline data is probably the number of
> language speakers. My hunch is that it is not good enough at taking into
> account the available/potential users and content, especially the
> digitally-ready ones.
>
> So I welcome you to add to the following list any external
> indicators (and possibly data sources) that may help to construct such a
> baseline.
>
> ==Indicators==
> * Internet users for each language (probably an approximate measurement
> based on CLDR Territory-Language information and ITU internet penetration
> rates).
>
> * Number of books published annually in different languages (suggested
> data sources? Does ISBN have a database or stat report on published
> languages?)
>
> * Number of web pages returned by major search engines on the queries of
> "Wikipedia" in different languages, excluding results from Wikimedia
> projects.
>
> * Number of scholarly publications across languages (suggested data
> sources?)
>
> * Number of major newspaper publications across languages (suggested data
> sources?)
>
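For the first indicator in the list above, a rough calculation could look
like the Python sketch below. It is only an illustration under stated
assumptions: the territory codes, language codes and all numbers are made-up
placeholders (not real CLDR or ITU figures), the function name is
hypothetical, and it assumes that Internet penetration is the same for every
language group within a territory, which will often be wrong.

```python
from collections import defaultdict

# Placeholder figures for illustration only, NOT real CLDR or ITU data.
# "language_shares" mimics CLDR territory-language information (fraction of
# the territory's population using each language); "internet_penetration"
# mimics an ITU-style rate (fraction of the population online).
TERRITORIES = {
    "AA": {
        "population": 1_000_000,
        "internet_penetration": 0.50,
        "language_shares": {"xx": 0.8, "yy": 0.2},
    },
    "BB": {
        "population": 2_000_000,
        "internet_penetration": 0.25,
        "language_shares": {"xx": 0.1, "yy": 0.9},
    },
}


def internet_users_by_language(territories):
    """Estimate Internet users per language across all territories.

    Assumes penetration is uniform across language groups within a
    territory, so each language gets its population share of the
    territory's online population.
    """
    totals = defaultdict(float)
    for info in territories.values():
        online = info["population"] * info["internet_penetration"]
        for lang, share in info["language_shares"].items():
            totals[lang] += online * share
    return dict(totals)


estimates = internet_users_by_language(TERRITORIES)
for lang, users in sorted(estimates.items()):
    print(f"{lang}: about {users:,.0f} estimated Internet users")
```

People who use more than one language are counted once per language here, so
the per-language totals can add up to more than the online population.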
>
> Please share your thoughts!
>
> --
> han-teng liao
>
> "[O]nce the Imperial Institute of France and the Royal Society of London
> begin to work together on a new encyclopaedia, it will take less than a
> year to achieve a lasting peace between France and England." - Henri
> Saint-Simon (1810)
>
> "A common ideology based on this Permanent World Encyclopaedia is a
> possible means, to some it seems the only means, of dissolving human
> conflict into unity." - H.G. Wells (1937)
>
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>