Great question!

The high-level answer is: many more than most folks assume. The main challenge in calculating this number is missing interlanguage links. In 2010, for instance, we found that the English Wikipedia covered only about 41% of the concepts in the Japanese Wikipedia, with a missing interlanguage link rate of only 2% (as determined by two bilingual coders going through 150 sample articles). The equivalent number for Italian was 65%, with an 8% missing interlanguage link rate.
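
To make the arithmetic concrete, here is a rough sketch of how a missing-link rate could be folded into a coverage figure. It rests on an assumption that is not spelled out above: that the missing-link rate is the share of apparently unlinked articles that do in fact have an English counterpart. The function name is made up for illustration, and this is not necessarily how the 2010 figures were computed.

```python
def adjusted_coverage(observed_coverage, missing_link_rate):
    """Estimate coverage after accounting for missing interlanguage links.

    ASSUMPTION (illustrative only): missing_link_rate is the fraction of
    articles with no interlanguage link that nonetheless have a counterpart
    in the other edition. Both arguments are fractions between 0 and 1.
    """
    if not 0.0 <= observed_coverage <= 1.0:
        raise ValueError("observed_coverage must be between 0 and 1")
    if not 0.0 <= missing_link_rate <= 1.0:
        raise ValueError("missing_link_rate must be between 0 and 1")
    # Articles that look uncovered, judged by interlanguage links alone.
    apparently_uncovered = 1.0 - observed_coverage
    # Add back the share of those that are covered but lack a link.
    return observed_coverage + apparently_uncovered * missing_link_rate


# Figures quoted above: Japanese (41%, 2%) and Italian (65%, 8%).
japanese = adjusted_coverage(0.41, 0.02)
italian = adjusted_coverage(0.65, 0.08)
print(f"Japanese: {japanese:.1%}, Italian: {italian:.1%}")
```

Under that assumption the correction is small in both cases, which is why the missing links do not change the overall picture much.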

We have more detailed and up-to-date results (that are also more complicated), but this gets across the general idea.

There is also the matter of article-level diversity (e.g. what gets covered about a given concept in different language editions), but this is an issue for another day.

On 4/1/2012 2:21 PM, emijrp wrote:
Hi Brent. How many articles exist in other Wikipedias and don't have an English translation at English Wikipedia? Any estimate?

2012/3/31 Brent Hecht <[email protected] <mailto:[email protected]>>

    Hello Wikidata Folks,

    My name is Brent Hecht, and I've done a great deal of research on
    the differences and similarities between the language editions. I
    was really excited to hear about the Wikidata project moving
    forward, and I think some of my research might be of assistance.
    I'd enjoy being able to help the community make this important
    transition.

    In particular, my experience navigating interlanguage link
    conflicts might be able to help in Phase 1. Please let me know if
    there's anything I can do over the short term or long term!

    Some of my relevant papers:

    [1] Bao, P., Hecht, B., Carton, S., Quaderi, M., Horn, M. and
    Gergle, D. 2012. Omnipedia: Bridging the Wikipedia Language Gap.
    CHI '12: 30th International Conference on Human Factors in
    Computing Systems (2012).
    [2] Hecht, B. and Gergle, D. 2010. The Tower of Babel Meets Web
    2.0: User-Generated Content and Its Applications in a Multilingual
    Context. CHI '10: 28th International Conference on Human Factors
    in Computing Systems (Atlanta, GA, 2010), 291--300.
    [3] Hecht, B. and Gergle, D. 2009. Measuring Self-Focus Bias in
    Community-Maintained Knowledge Repositories. Communities and
    Technologies 2009: 4th International Conference on Communities and
    Technologies (State College, PA, 2009), 11--19.

    - Brent

    Brent Hecht
    Ph.D. Candidate in Computer Science
    CollabLab: The Collaborative Technology Laboratory
    Northwestern University
    w:http://www.brenthecht.com  <http://www.brenthecht.com/>
    e:[email protected]  <mailto:[email protected]>

    _______________________________________________
    Wikidata-l mailing list
    [email protected] <mailto:[email protected]>
    https://lists.wikimedia.org/mailman/listinfo/wikidata-l



