Perhaps you could just imitate the QRpedia model, which tells the reader "this article is not available in your default language" and serves up links to the languages it *is* available in. After all, presence on Wikidata means presence on *at least one Wikipedia*, if I'm not mistaken.
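For illustration, here is a minimal sketch of that fallback in Python, assuming the public wbgetentities API; the requests dependency, the function name and the crude sitelink filter are my own choices for the example, not anything that exists today:

import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def available_languages(title, source_site="enwiki"):
    # Ask Wikidata for the item linked to this title and return the
    # Wikipedia languages that already have an article on it.
    params = {
        "action": "wbgetentities",
        "sites": source_site,
        "titles": title,
        "props": "sitelinks",
        "format": "json",
    }
    data = requests.get(WIKIDATA_API, params=params, timeout=10).json()
    links = {}
    for entity in data.get("entities", {}).values():
        for dbname, link in entity.get("sitelinks", {}).items():
            # crude filter: "dewiki" -> "de"; it also matches non-Wikipedia
            # sites such as "commonswiki", so a real version needs a proper list
            if dbname.endswith("wiki"):
                links[dbname[:-4]] = link["title"]
    return links

# available_languages("Douglas Adams") might return
# {"en": "Douglas Adams", "de": "Douglas Adams", "fr": "Douglas Adams", ...}
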
2013/4/25, Erik Moeller <e...@wikimedia.org>:
> Millions of Wikidata stubs invade small Wikipedias .. Volapük
> Wikipedia now best curated source on asteroids .. new editors flood
> small wikis .. Google spokesperson: "This is out of control. We will
> shut it down."
>
> Denny suggested:
>
>>> II) develop a feature that blends into Wikipedia's search if an article
>>> about a topic does not exist yet, but we have data on Wikidata about
>>> that topic
>
> Andrew Gray responded:
>
>> I think this would be amazing. A software hook that says "we know X
>> article does not exist yet, but it is matched to Y topic on Wikidata"
>> and pulls out core information, along with a set of localised
>> descriptions... we gain all the benefit of having stub articles
>> (scope, coverage) without the problems of a small community having to
>> curate a million pages. It's not the same as hand-written content, but
>> it's immeasurably better than no content, or even an attempt at
>> machine-translating free text.
>>
>> XXX is [a species of: fish] [in the: Y family]. It [is found in: Laos,
>> Vietnam]. It [grows to: 20 cm]. (pictures)
>
> This seems very doable. Is it desirable?
>
> For many languages, it would allow hundreds of thousands of
> pseudo-stubs (not real articles stored in the DB, but generated from
> Wikidata) to be served to readers and crawlers that would otherwise
> not exist in that language.
>
> Looking back 10 years, User:Ram-Man was one of the first to generate
> thousands of en.wp articles from, in this case, US census data. It was
> controversial at the time and it stuck. Other Wikipedias have since
> then either allowed or prohibited bot-creation of articles on a
> project-by-project basis. It tends to lead to frustration when folks
> compare article counts and see artificial inflation by bot-created
> content.
>
> Does anyone know if the impact of bot-creation on (new) editor
> behavior has been studied? I do know that many of the Rambot articles
> were expanded over time, and I suspect many wouldn't have been if they
> hadn't turned up in search engines in the first place. On the flip
> side, a large "surface area" of content being indexed by search
> engines will likely also attract a fair bit of drive-by vandalism that
> may not be detected because those pages aren't watched.
>
> A model like the proposed one might offer a solution to a lot of these
> challenges. How I imagine it could work:
>
> * Templates could be defined for different Wikidata entities. We could
> make it possible to let users add links from items in Wikidata to
> Wikipedia articles that don't exist yet. (Currently this is
> prohibited.) If such a link is added, _and_ a relevant template is
> defined for the Wikidata entity type (perhaps through an entity
> type->template mapping), WP will render an article using that
> template, pulling structured info from Wikidata.
>
> * A lot of the grammatical rules would be defined in the template
> using checks against the Wikidata result. Depending on the complexity
> of grammatical variations beyond basics such as singular/plural this
> might require Lua scripting.
>
> * The article is served as a normal HTTP 200 result, cached, and
> indexed by search engines. In WP itself, links to the article might
> have some special affordance that suggests that they're neither
> ordinary red links nor existing articles.
>
> * When a user tries to edit the article, wikitext (or visual edit
> mode) is generated, allowing the user to expand or add to the
> automatically generated prose and headings. Such edits are tagged so
> they can more easily be monitored (they could also be gated by default
> if the vandalism rate is too high).
>
> * We'd need to decide whether we want these pages to show up in
> searches on WP itself.
>
> Advantages:
>
> * These pages wouldn't inflate page counts, but they would offer
> useful information to readers and be higher quality than machine
> translation.
>
> * They could serve as powerful lures for new editors in languages that
> are currently underrepresented on the web.
>
> Disadvantages/concerns:
>
> * Depending on implementation, I continue to have some concern about
> {{#property}} references ending up in article text (as opposed to
> templates); these concerns are consistent with the ones expressed in
> the en.wp RFC [1]. This might be mitigated if Visual Editor offers a
> super-intuitive in-place editing method. {{#property}} references in
> text could also be converted to their plain text representation the
> moment a page is edited by a human being (which would have its own set
> of challenges, of course).
>
> * How massive would these sets of auto-generated articles get? I
> suspect the technical complexity of setting up the templates and
> adding the links in Wikidata itself would act as a bit of a barrier to
> entry. But vast pseudo-article sets in tiny languages could pose
> operational challenges without adding a lot of value.
>
> * Would search engines penalize WP for such auto-generated content?
>
> Overall, I think it's an area where experimentation is merited, as it
> could not only expand information in languages that are
> underrepresented on the web, but also act as a force multiplier for
> new editor entrypoints. It also seems that a proof-of-concept for
> experimentation in a limited context should be very doable.
>
> Erik
>
> [1]
> https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_2#Use_of_Wikidata_in_article_text
> --
> Erik Möller
> VP of Engineering and Product Development, Wikimedia Foundation
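
To make Andrew's bracketed example and the entity type -> template idea in the quoted mail a little more concrete, here is a rough Python sketch of the template-fill step. The template wording, the slot names and the assumption that Wikidata values have already been resolved to labels in the target language are all invented for illustration; an actual implementation would presumably be a Lua module maintained on-wiki.

TEMPLATES = {
    # keyed by a coarse entity type (e.g. derived from "instance of");
    # values are per-language sentence templates
    "taxon": {
        "en": "{label} is a species of {group} in the family {family}. "
              "It is found in {countries}. It grows to {length}.",
    },
}

def join_list(values):
    # very naive list agreement: "Laos" / "Laos and Vietnam" / "A, B and C"
    if len(values) == 1:
        return values[0]
    return ", ".join(values[:-1]) + " and " + values[-1]

def render_stub(entity_type, lang, resolved):
    # 'resolved' maps slot names to values already pulled from Wikidata
    # and resolved to labels in the target language
    slots = {name: (join_list(value) if isinstance(value, list) else value)
             for name, value in resolved.items()}
    return TEMPLATES[entity_type][lang].format(**slots)

print(render_stub("taxon", "en", {
    "label": "XXX",
    "group": "fish",
    "family": "Y",
    "countries": ["Laos", "Vietnam"],
    "length": "20 cm",
}))
# XXX is a species of fish in the family Y. It is found in Laos and
# Vietnam. It grows to 20 cm.

The naive join_list is where the per-language grammar rules from the second bullet (singular/plural and beyond) would live, next to the template itself.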
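And a sketch of the "convert {{#property}} references to plain text the moment a human edits the page" idea from the concerns list. The function name, the regular expression and the property ID in the example are placeholders; a real version would have to go through the parser rather than a regex, and would need to handle the non-simple forms of the parser function.

import re

PROPERTY_REF = re.compile(r"\{\{#property:(P\d+)\}\}")

def freeze_properties(wikitext, resolve):
    # Replace every {{#property:Pnnn}} with its currently rendered value so
    # that the first human edit starts from plain prose rather than magic words.
    return PROPERTY_REF.sub(lambda match: resolve(match.group(1)), wikitext)

# freeze_properties("It grows to {{#property:P1234}}.", lambda pid: "20 cm")
# returns "It grows to 20 cm."  (the property ID and resolver are dummies)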