Here is some bad news and some good news...

The bad news is that I've finally realized why I needed a separate
wiki for data. It's because of Ethnologue's restrictive ToS [1]. In
other words, I could just say to myself: Welcome back to the wonderful
world of licenses!

So, I've created a private wiki with some of the data. Anyone willing
to join me in the "data analysis" work is welcome; I'll create accounts
on that wiki. That said, I urge all relevant persons to contact me
privately with their preferred username. (To be more precise, this is
related to languages, chapters, WMF and its funds.) I also need one or
more people willing to code in Python.

The good news is that I've realized that I did a good job in coding,
with a number of relevant categorizations; which triggers more bad news,
because I'll need some time to get familiar with my code again.

The data about the number of languages not represented on Wikimedia projects:
* 23 languages with more than 10 million speakers
* 230 languages with more than one million speakers
* 866 languages with more than 100 thousand speakers
* 1831 languages with more than 10 thousand speakers

The largest language with a project in Incubator has 38 million speakers.
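
For anyone thinking of helping with the Python side, here is a minimal sketch of how counts like the ones above could be produced. It is not my actual code: the `Language` record, its field names, the `count_unrepresented` helper and the sample rows are all hypothetical, and I'm assuming the counts are cumulative (each threshold includes the languages above the higher thresholds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    # Hypothetical record layout; the real data set may look different.
    code: str          # placeholder language code, not a real ISO 639 entry
    speakers: int      # estimated number of speakers
    has_project: bool  # True if any Wikimedia project exists in this language


# Speaker thresholds used in the list above.
THRESHOLDS = (10_000_000, 1_000_000, 100_000, 10_000)


def count_unrepresented(languages, thresholds=THRESHOLDS):
    """Map each threshold to the number of languages without a project
    that have more speakers than that threshold (cumulative counts)."""
    missing = [lang for lang in languages if not lang.has_project]
    return {t: sum(1 for lang in missing if lang.speakers > t) for t in thresholds}


# Made-up sample rows, only to show the shape of the input.
sample = [
    Language("aaa", 12_000_000, False),
    Language("bbb", 2_500_000, False),
    Language("ccc", 150_000, True),
    Language("ddd", 40_000, False),
    Language("eee", 5_000, False),
]

result = count_unrepresented(sample)
for threshold, count in result.items():
    print(f"{count} languages with more than {threshold:,} speakers")
```

Filtering out the represented languages first keeps the per-threshold loop simple; with a few thousand rows there is no need for anything smarter.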


On Sat, Apr 26, 2014 at 2:11 PM, Seb35 wrote:
> Hei,
> As a supporter of language diversity, I'm a bit sad about this thread because
> some people find we should not engage in language revitalisation because:
> 1/ it's not explicitly in our scope (and I don't fully agree: "sum of
> all knowledge" also includes minority cultures expressed in their
> languages, as shown by Hubert Laska with the "Kneip"),
> 2/ it's too difficult/expensive "to save most languages".
> Although there are obviously great difficulties, I find it shouldn't stop
> us from supporting or partnering with local language institutions,
> particularly if there are interested people or volunteers: we are not
> obliged to select the 3000 most spoken languages and set up partnerships to
> "save" these 3000 languages, but we can support institutions or volunteers
> _interested_ in saving some small language on a case-by-case basis (Rapa
> Nui, Chickasaw, Skolt Sami, Kibushi, whatever) if minimum requirements are
> met (writing system and ISO 639 code for a website, financial resources
> for a project), i.e. crowdsourcing the language preservation between
> Wikimedia, volunteers, speakers, and institutions.
> When multilingualism in cyberspace is discussed by linguists, Wikipedia
> is almost every time shown as *the* most successful example. As
> discussed in this thread, perhaps some projects (Wikisource, Wiktionary,
> Wikidata) are easier to set up in these languages and this could be a
> first step, but these will only preserve these as non-living objects of
> interest, unlike a Wikibook/Wikipedia/Wikinews/Wikiversity
> where speakers could practice the language, invent neologisms and
> terminology, create corpora for linguists, and show the language to other
> interested people in the world (I'm sure there are some).
> As an example in France, Wikimédia France has quite good relationships
> with the DGLFLF (Delegation for the French language and languages of
> France), and this institution counts 75 languages in France, two thirds of
> which are overseas [1]. The DGLFLF contributed resources on some small
> languages and multilingualism on Wikibooks [2] and Commons [3].
> [1] (fr)
> [2] (fr)
> [3] (fr)(mul)
> ~ Seb35
> 20.04.2014 05:46:47 (CEST), Milos Rancic wrote:
>> There are ~6000 languages in the world and around 3000 of them have
>> more than 10,000 speakers.
>> That approximation has some issues, but they are compensated by the
>> ambiguity of the opposition. Ethnologue is not the best place to find
>> precise data about the languages and it could count as languages just
>> close varieties of one language, but it also doesn't count some other
>> languages. Not all of the languages with 10,000 or more speakers have
>> a positive attitude toward their languages, but there are languages with
>> a smaller number of speakers with a very positive attitude toward their
>> own language.
>> So, that number is what we could count as the realistic "final" number
>> of the language editions of Wikimedia projects. At the moment, we have
>> fewer than 300 language editions.
>> * * *
>> There is the question: Why should we do that? The answer is clear to
>> me: Because we can.
>> Yes, there are maybe more specific organizations which could do that,
>> but it's not about expertise, but about ability. Fortunately, we don't
>> need to search for historical examples for comparisons; the Internet
>> is good enough.
>> I still remember an infographic from the time when all of us thought that
>> Flickr was the place for images. It turned out that the biggest
>> repository of images was actually Facebook, which had a hundred times
>> more of them than Twitpic in second place, which, in turn, had
>> a hundred times more images than Flickr.
>> In other words, the purpose of something and the general perception of its
>> purpose are not enough for doing a good job. Likewise, comparisons
>> between mismanaged internet projects and mismanaged traditional
>> scientific and educational organizations are numerous.
>> At this point in time Wikimedia has all the necessary capacities -- and even
>> the will to take on that job. So, we should start doing that, finally :)
>> * * *
>> There is also the question: How can we do that? In short, because of
>> Wikipedia.
>> I announced the Microgrants project of Wikimedia Serbia yesterday. To be
>> honest, we have very low expectations. When I said to Filip that I
>> want to have 10 active community members after the project, he said
>> that I am overambitious. Yes, I am.
>> But ten hours later I got the first response and I was very
>> positively surprised by a lot of things. The most relevant for this
>> story is that a person from a city in Serbia proper is very
>> enthusiastic about Wikipedia and contributing to it (and organizing
>> contributors in the area). I hadn't heard that for years! (Maybe I was
>> just too pessimistic because of my obsession with statistics.)
>> Keeping in mind her position (she said that she was always complaining
>> about the lack of material on Serbian Wikipedia, although at this point in
>> time it's the encyclopedia in Serbian with the most relevant content)
>> and her enthusiasm, I am completely sure that many speakers of many
>> small languages dream from time to time of having Wikipedia in
>> their native language.
>> Like in the case of a Serbian from the fifth or sixth largest city in
>> Serbia, I am sure that they just don't know how to do that. So, it's
>> up to us to reach them.
>> English Wikipedia has some influence on the contemporary English language
>> ("citation needed", let's say). It has more influence on languages
>> with a smaller number of speakers, like Serbian (the Cyrillic/Latin
>> cultural war in Serbia was over at the moment when Serbian Wikipedia
>> implemented its transliteration engine; it's no issue now, while it was
>> an issue up to the mid-2000s).
>> But that's about well developed languages in the cultural sense. What
>> about the less developed ones? While I don't have an example of the
>> effects (anyone, please?), counting the amount of written
>> material in some languages, Wikipedia will become (or already has become)
>> the biggest book, sometimes the biggest library in that language; in
>> some cases Wikipedia will create the majority of texts written in
>> a particular language!
>> While we think about Wikipedia as a valuable resource for learning about
>> a wide range of topics, the significance of Wikipedia for those peoples
>> would be much higher. If we do the job, there will be many monuments
>> to Wikipedia all over the world, because Wikipedia would preserve many
>> cultures, not just the languages.
>> * * *
>> There is the question "How?", at the end. There are numerous
>> possible ways and there have also been some attempts to do that, but we
>> have to create a plan for how to do that systematically, well, according
>> to our principles and goals and according to reality.
>> What we know from our previous experiences:
>> * The number of editors has declined and, at the moment, without a
>> miracle (or hard work, but I assume most of our movement is used
>> to miracles, not to hard work), the trend will continue. Contrary to
>> that, the number of readers has increased. Unfortunately, in this case a
>> miracle is not necessary for that trend to end.
>> * If we count languages with relevant statistics for editors per
>> million, the top ones belong either to highly motivated
>> communities (Hebrew), or to rich countries with a harsh climate,
>> which makes writing on Wikipedia good fun (Estonian, Icelandic,
>> Norwegian, Finnish), or to a community which belongs to both
>> categories (Scots Gaelic). And it's around 100 users per million.
>> If a community has 100,000 speakers, it would mean that the
>> community would have 10 editors with 5 or more edits per month. In the
>> case of languages with 10,000 speakers, it would mean 1 editor
>> with 5 or more edits per month. That won't work.
>> I'd say that Scots Gaelic could be a good test (Wikimedia UK help
>> needed!). It's a language with ~70k speakers and if it's possible
>> to achieve 100 active editors per month, we could say that it could
>> somehow work in other cases, as well.
>> * Besides preserving languages and cultural heritage, we want to have
>> useful information on those Wikipedias. That's a tough job for many
>> communities because of various issues: from the lack of reasonable
>> internet access to the inherent cultural biases.
>> But we have some tools -- Wikidata as the most important one -- to
>> create a lot of useful content.
>> But the entry barrier is very high. Editors have to know how to use
>> computers well, as well as to think quite formally. That's a serious
>> obstacle in areas without well developed educational systems.
>> * The good news is that we have chapters in three countries with a lot of
>> languages: India, Indonesia and Australia (though it's about very
>> small languages in Australia; then again, Australia is much richer). So,
>> we have organizational potential.
>> * There are, of course, a lot of other issues. Many of them, actually.
>> But if we don't start, we won't do anything.
>> * * *
>> As you can see, I wrote this not as a kind of plan, but as a set
>> of open questions. I'd like your input (first here, then on Meta):
>> What do you think? How can we start working on it? What do you think
>> would be the most efficient way? Ways? Any other ideas?
>> I'd call on you to give wings to your imagination. To be able to solve
>> that, we need bold ideas. On the other side, I'd appreciate people
>> with more organizational skills giving their input, as well.
> _______________________________________________
> Wikimedia-l mailing list
