I have read a lot of technical detail in this thread about how extremely 
complex collation and sorting are.
I readily admit that I don't understand all of the database dependencies, 
collation subtleties, or whatever other limiting factors may play a role 
here. Perhaps I shouldn't participate in a technical discussion that I 
don't fully understand, but take me for a wiki user who spends many hours 
adding defaultsort statements to articles and doesn't understand why the 
software cannot do it by itself. Perhaps you can shed some more light on 
it for a dummy like me.

Here is how, in my simple mind, I think it could be solved (I'm sure my 
thoughts are too simple, but I want to understand why and in what way 
they are too simple). As an example I take the German language:

Take the pagename and make it uppercase (lowercase would work too, but 
uppercase seems better, as the first letter will show up in the 
category). str_replace "Ä" with "A", "Ö" with "O", "Ü" with "U" and "ß" 
with "SS". Also str_replace the other Latin characters with diacritics 
with their counterparts without diacritics. That is our sortkey. By my 
estimate, this very simple procedure should reduce the number of 
necessary defaultsorts (except for articles about persons) by about 90% 
in the German Wikipedia.
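
To make the steps above concrete, here is a minimal sketch in Python 
(not MediaWiki's PHP, and not how the software actually builds its 
sortkeys). The function name make_sortkey is made up for illustration. 
It also assumes that "counterpart without diacritic" can be approximated 
by Unicode decomposition, which leaves letters such as "Ø" or "Ł" 
untouched because they have no decomposed form.

```python
import unicodedata


def make_sortkey(pagename: str) -> str:
    """Hypothetical helper: derive a sortkey from a page name."""
    # Step 1: uppercase. Python's str.upper() already turns "ß" into "SS".
    key = pagename.upper()
    # Step 2: the explicit German replacements from the proposal.
    # The capital sharp s (U+1E9E) is not changed by upper(), so map it too.
    for src, dst in (("Ä", "A"), ("Ö", "O"), ("Ü", "U"), ("\u1e9e", "SS")):
        key = key.replace(src, dst)
    # Step 3: strip remaining diacritics by decomposing each character
    # (NFD) and dropping the combining marks. This is an assumption about
    # what "counterpart without diacritic" means.
    decomposed = unicodedata.normalize("NFD", key)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


titles = ["Zebra", "Äpfel", "Apfel", "Öl"]
# Plain code point order puts the umlauts after "Z".
print(sorted(titles))
# Sorting by the derived key interleaves them with the base letters.
print(sorted(titles, key=make_sortkey))
```

Note that "Äpfel" and "Apfel" get the same key here, so their relative 
order depends on whatever tie-breaker is used.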

Implement these steps directly in the software and it should fix the 
sorting of categories. I have read a lot about uniqueness in this thread, 
but defaultsort isn't unique either.

Of course it only works for languages where the Unicode byte order of 
the basic script corresponds to the sorting order. But a solution 
helping 80% of the languages in 80% of all cases (and with no 
disadvantages for the other 20%) is better than a solution that helps 
100% of all languages in 100% of all cases but does not exist yet, 
isn't it?

Marcus Buck
User:Slomox
