On 22 September 2013 15:54, Amir Ladsgroup <[email protected]> wrote:
> Hello,
> Persian Wikipedia is one of the largest wikis by number of categories,
> but people rarely think of adding interwiki links to categories (they
> assume interwiki is just for articles), so we have a huge number of
> categories without any interwiki links (30K out of 170K before I wrote
> my engine), which is really bad. I wrote some code to improve this, but
> it wasn't enough, so I wrote an engine that takes two databases:
> 1- a list of categories without interwiki
> 2- a list of categories with an interwiki link to a certain language
>    (e.g. English), together with the target interwiki
> My bot then analyzes them and "guesses" the correct interwiki for each
> category in the first list, based on the naming patterns found in the
> second database, and produces a report. After running this code on
> fa.wp there was a very large report [1] and we started to sort things
> out (merging duplicates [2], deleting extra ones, adding the correct
> iw). Now fewer than 25K categories lack interwiki links (and the number
> keeps dropping). We did the same for the template namespace [3] and
> interwikified more than 10K templates after that.
>
> Because this engine doesn't use any language-specific analysis, it can
> be run on any language and get interwiki links from any language (we
> plan to run it on Persian Wikipedia again, this time using Dutch and
> German as the source of interwiki links).
>
> So here is my question: Is there a similar situation on your wiki? Do
> you want to run this code on your wiki too? Do you have any suggestions?
>
> [1]: https://fa.wikipedia.org/w/index.php?title=کاربر:Ladsgroup/ردهها&oldid=10959457
> [2]: One of the benefits of running this engine is that we can find
> duplicates
> [3]: https://fa.wikipedia.org/wiki/%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:Ladsgroup/%D8%A7%D9%84%DA%AF%D9%88%D9%87%D8%A7%DB%8C
>
> Best
> ---

Hi Amir - I have a different question. Why is it in the interests of
Fawiki to use the same categorization system as any other project?
I ask this because I know that almost every Wikipedia has variations in
the way that it categorizes articles and other pages, and there is not
really a cross-wiki standard - nor would I expect one. Categorization is
more or less in the same realm as defining notability, determining
neutral point of view, and Manuals of Style: while philosophically we
are very similar across all the Wikipedias, each project has a slightly
different way of addressing these situations.

I'd suggest that the issue isn't really a technical problem, it's more a
cultural one. That is, Wikipedia community cultures have developed
categorization systems slightly differently, so it is unlikely that any
one will be a perfect match for another.

Risker/Anne
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
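
For readers trying to picture the engine described in the quoted message, here is a rough sketch of one way such pattern-based guessing could work. It is not Amir's actual code, which is not shown in this thread. The function names (learn_templates, guess_links), the prefix/suffix template representation, and the Spanish/English sample titles are all assumptions made up for illustration. The idea: wherever one linked title contains another linked title, treat the surrounding text as a naming pattern, then apply those patterns to the unlinked titles.

```python
from collections import Counter


def learn_templates(linked):
    """Learn naming patterns from already-linked categories (hypothetical helper).

    `linked` maps a local title to its interwiki target. Whenever one linked
    title contains another linked title exactly once, and the same holds for
    their targets, the surrounding text is stored as a pattern:
    (local_prefix, local_suffix, target_prefix, target_suffix).
    """
    templates = Counter()
    for atom, atom_target in linked.items():
        for title, target in linked.items():
            if title == atom:
                continue
            # Require exactly one occurrence on each side so the slot is unambiguous.
            if title.count(atom) != 1 or target.count(atom_target) != 1:
                continue
            l_pre, l_suf = title.split(atom)
            t_pre, t_suf = target.split(atom_target)
            templates[(l_pre, l_suf, t_pre, t_suf)] += 1
    return templates


def guess_links(unlinked, linked, templates, min_support=1):
    """Guess a target title for each unlinked title that fits a learned pattern."""
    guesses = {}
    for title in unlinked:
        candidates = Counter()
        for (l_pre, l_suf, t_pre, t_suf), support in templates.items():
            if support < min_support:
                continue
            if len(title) <= len(l_pre) + len(l_suf):
                continue
            if not (title.startswith(l_pre) and title.endswith(l_suf)):
                continue
            # The text between prefix and suffix must itself be a linked title.
            slot = title[len(l_pre):len(title) - len(l_suf)]
            if slot in linked:
                candidates[t_pre + linked[slot] + t_suf] += support
        if candidates:
            guesses[title] = candidates.most_common(1)[0][0]
    return guesses


# Illustrative data only: a Spanish-titled wiki linking to English.
linked = {
    "Francia": "France",
    "Italia": "Italy",
    "Historia de Francia": "History of France",
}
unlinked = ["Historia de Italia", "Cocina de Italia"]

templates = learn_templates(linked)
print(guess_links(unlinked, linked, templates))
# {'Historia de Italia': 'History of Italy'}
```

Nothing here depends on the languages involved, since it only compares strings, which matches the claim that the engine uses no language-specific analysis. "Cocina de Italia" gets no guess because no linked title follows that pattern. A guess is only a candidate for a human-reviewed report: a title that fits the pattern may still name a category the other wiki organizes differently or does not have at all.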
