Hi,

In case you no longer have this thread: the background is that the
Malayalam projects want to use, and are using, Unicode 5.1 for five
characters that have composed code points in 5.1 but are decomposed
sequences in 5.0. The equivalences are:

Character     5.0 sequence        5.1
CHILLU NN     0D23, 0D4D, 200D    0D7A
CHILLU N      0D28, 0D4D, 200D    0D7B
CHILLU RR     0D30, 0D4D, 200D    0D7C
CHILLU L      0D32, 0D4D, 200D    0D7D
CHILLU LL     0D33, 0D4D, 200D    0D7E
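
To make the equivalences concrete, here is a minimal Python sketch of the mapping in both directions. The names (CHILLU_50_TO_51, to_51, to_50) are made up for illustration; this is not the server's code, which I have not found, only a restatement of the table above.

```python
# Unicode 5.0 sequence (consonant + virama + ZWJ) -> Unicode 5.1 atomic chillu.
# The data comes straight from the table above; the names are hypothetical.
VIRAMA = "\u0d4d"
ZWJ = "\u200d"

CHILLU_50_TO_51 = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",  # CHILLU NN
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # CHILLU N
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",  # CHILLU RR
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # CHILLU L
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",  # CHILLU LL
}
CHILLU_51_TO_50 = {atomic: seq for seq, atomic in CHILLU_50_TO_51.items()}


def to_51(text):
    """Replace each 5.0 chillu sequence with its 5.1 code point."""
    for seq, atomic in CHILLU_50_TO_51.items():
        text = text.replace(seq, atomic)
    return text


def to_50(text):
    """Replace each 5.1 chillu code point with its 5.0 sequence."""
    for atomic, seq in CHILLU_51_TO_50.items():
        text = text.replace(atomic, seq)
    return text
```

Text that contains none of these characters passes through both functions unchanged.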

Somewhere in the server code, these are "normalized" to 5.1 for the ml
projects. Problem:

http://ml.wiktionary.org/w/index.php?title=%E0%B4%95%E0%B5%81%E0%B4%B1%E0%B5%81%E0%B4%95%E0%B5%8D%E0%B4%95%E0%B5%BB&action=history

What you see happening is Interwicket trying to create the language
links. It adds the correct link(s), to the 5.0 forms on the other
wikts; then on the next scan of the language links tables it removes
the links as invalid, as the 5.1 titles don't exist on the other
wikts. This then repeats. (;-)

The problem is that it can't write the correct link, as the text
normalization "fixes" it.

The other direction isn't a problem: the links are to the 5.0 forms,
and when followed they are normalized to 5.1 in the title lookup, and
the page is found.

I'm not (yet) suggesting a particular solution; there are several
possibilities (from fairly decent to grotesque hackery ...). But would
someone tell me where in the server code this is done? I have not been
able to find it. Then I can understand a bit better, possibly just fix
it in the bot code somehow, or suggest a fix server-side.
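
For what it's worth, a bot-side workaround might look roughly like the Python sketch below. It is not Interwicket's actual code: title_variants, link_target_exists and the page_exists callback are hypothetical names, and it assumes the bot can ask whether a given title exists on the target wikt. It would only stop the add/remove loop by accepting a link whose 5.0 or 5.1 form exists; the link stored on the ml side would still be whatever the server normalizes it to.

```python
VIRAMA = "\u0d4d"
ZWJ = "\u200d"

# (5.0 sequence, 5.1 code point) pairs for the five chillu characters.
CHILLU_PAIRS = [
    (base + VIRAMA + ZWJ, atomic)
    for base, atomic in [
        ("\u0d23", "\u0d7a"),  # CHILLU NN
        ("\u0d28", "\u0d7b"),  # CHILLU N
        ("\u0d30", "\u0d7c"),  # CHILLU RR
        ("\u0d32", "\u0d7d"),  # CHILLU L
        ("\u0d33", "\u0d7e"),  # CHILLU LL
    ]
]


def title_variants(title):
    """Return the title as given plus its all-5.0 and all-5.1 forms."""
    v50 = v51 = title
    for seq, atomic in CHILLU_PAIRS:
        v50 = v50.replace(atomic, seq)
        v51 = v51.replace(seq, atomic)
    return {title, v50, v51}


def link_target_exists(title, page_exists):
    """True if any variant of the title exists on the target wiki.

    page_exists is a hypothetical callback: title -> bool.
    """
    return any(page_exists(v) for v in title_variants(title))
```

With something like this, the 5.1 title read back from the ml language links would be treated as valid when only the 5.0 title exists on the other wikt, so the link would not be removed on the next scan.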

Best Regards,
Robert

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l