I just recently started to play with interwiki.py (Pywikipedia bot 
framework) for propagating interwiki links.  My interest comes 
from organizing the category tree, so I'm focusing on interwiki 
links between categories.  Interwiki bots normally run in 
autonomous mode, but this means they give up on complicated cases.

If I run this script under manual supervision, without the 
"-autonomous" option, it stops and asks me how to resolve each 
conflict.  This happens very often.  I have now (manually) 
sorted out the interwiki links between all languages of 
Category:Knowledge, which was intertwined with Category:Science, 
and Category:Austrian writers, which was mixed up with 
Category:Austrian literature.  Such mistakes happen easily, of 
course.  Who can spot errors in all these languages?

Many languages had interwiki links from their category for 
Austrian writers to the Japanese category for Austrian literature.  
I'm not sure exactly when or where this error originated.  But on 
June 19, 2007, the English and Spanish Wikipedias' interwiki links 
to Japanese changed from Austrian novelists to Austrian 
literature, i.e. from one error to another.  Ten days later, this 
link was copied to the Dutch Wikipedia.  The error was corrected on 
en.wikipedia on October 1, 2007, but remained in other languages.
Yes, that's 15 months ago.

The circular interwiki link structure from en:Category:Austrian 
writers to es:Categoría:Escritores de Austria to ja:... and back 
to en:Category:Austrian literature is exactly the kind of conflict 
that makes interwiki.py give up when it runs in autonomous mode.
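
To make the conflict concrete, here is a minimal sketch of the 
idea.  It is not interwiki.py's actual code, only an illustration 
under my own assumptions: the link table is toy data, the Japanese 
title is a placeholder (I have not filled in the real one), and the 
function names are made up.  It follows the links from one page 
and reports every language that ends up with more than one page.

```python
from collections import defaultdict, deque

# Hypothetical toy data: page -> pages its interwiki links point at.
# "ja:<Austrian literature>" is a placeholder, not the real title.
LINKS = {
    "en:Category:Austrian writers": ["es:Categoría:Escritores de Austria"],
    "es:Categoría:Escritores de Austria": [
        "en:Category:Austrian writers",
        "ja:<Austrian literature>",
    ],
    "ja:<Austrian literature>": ["en:Category:Austrian literature"],
    "en:Category:Austrian literature": ["ja:<Austrian literature>"],
}


def reachable(start, links):
    """Return every page reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def find_conflicts(start, links):
    """Return {language: pages} for languages with more than one page."""
    by_lang = defaultdict(set)
    for page in reachable(start, links):
        lang = page.split(":", 1)[0]  # language prefix before first colon
        by_lang[lang].add(page)
    return {lang: sorted(pages)
            for lang, pages in by_lang.items() if len(pages) > 1}


conflicts = find_conflicts("en:Category:Austrian writers", LINKS)
print(conflicts)
```

With this toy data the only conflicting language is "en", because 
both English categories are reachable from the same starting page.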

Thus, corrections (as on October 1) do not propagate.  Instead, a 
report about the conflict is written to a log file, but apparently 
nobody has fixed this problem in the last 15 months.  This 
conflict also blocked new interwiki links from propagating.

After I cleared up the mess, 21 new interwiki links were added to 
the category on the Russian Wikipedia (one where I have a bot 
flag).  That means 21 languages of Wikipedia had created 
categories (or announced them to the interwiki system) for 
Austrian writers in the last 15 months, and they all added their 
interwiki link to the English Wikipedia.  But these additions did 
not propagate because of the conflict.

So, my question:

Has anybody mapped exactly how many such interwiki conflicts we 
have?  Or how many interwiki sets do we have without conflicts? 
Could/should someone make a list of current conflicts and try to 
rank them by importance, so we can get started in fixing them?
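
As a rough illustration of what such a survey could look like, 
here is a sketch that groups pages into interwiki sets, counts the 
sets with and without conflicts, and ranks the conflicting ones.  
Everything here is an assumption on my part: the sample data is 
invented, the function names are hypothetical, and "importance" is 
simply approximated by the number of pages in the set.

```python
from collections import defaultdict


def interwiki_sets(links):
    """Group pages into sets connected by links (direction ignored)."""
    neighbours = defaultdict(set)
    for page, targets in links.items():
        neighbours[page]  # make sure pages without links still appear
        for target in targets:
            neighbours[page].add(target)
            neighbours[target].add(page)
    seen, sets = set(), []
    for page in list(neighbours):
        if page in seen:
            continue
        component, stack = set(), [page]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        sets.append(component)
    return sets


def has_conflict(component):
    """A set conflicts if some language appears more than once."""
    langs = [page.split(":", 1)[0] for page in component]
    return len(langs) != len(set(langs))


def rank_conflicts(links):
    """Conflicting sets, largest first (size as a stand-in for importance)."""
    conflicting = [c for c in interwiki_sets(links) if has_conflict(c)]
    return sorted(conflicting, key=len, reverse=True)


# Invented sample data, not taken from any real wiki.
SAMPLE = {
    "en:A": ["de:A", "fr:A"],
    "de:A": ["en:A"],
    "en:B": ["de:B"],
    "de:B": ["en:B2"],
    "sv:C": [],
}

all_sets = interwiki_sets(SAMPLE)
ranked = rank_conflicts(SAMPLE)
print(len(all_sets), "sets,", len(ranked), "with conflicts")
```

For real data, the input could presumably be built from each 
wiki's langlinks table, but I have not tried that.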

In the longer term, we need to redesign the interwiki links into a 
centralized system that can be maintained.  I think the way to do 
this is to use Wikimedia Commons.  Instead of copying all the 
interwiki links to every language of Wikipedia, it should be 
enough to add {{commons|Category:Writers from Austria}}, and the 
rest should happen automatically.



-- 
  Lars Aronsson ([email protected])
  Aronsson Datateknik - http://aronsson.se
