https://bugzilla.wikimedia.org/show_bug.cgi?id=36439
--- Comment #11 from [email protected] 2012-07-04 17:30:51 UTC ---

Some results from normalization:

Source     Encoded            Normalized      Comment
Åland      %C3%85land         %C3%85land      precomposed code point for the character (U+00C5)
Åland      A%CC%8Aland        %C3%85land      A followed by combining ring above (U+0041 U+030A)
Ångstrom   %E2%84%ABngstrom   %C3%85ngstrom   the initial letter is the code point for the unit (U+212B)

So it seems our current normalization (form C) rewrites a capital letter A followed by a combining ring above into the single precomposed code point. The code point for the unit is likewise rewritten to the ordinary letter.

"Characters are decomposed and then recomposed by canonical equivalence."

It seems it would only fail in cases with multiple combining characters, but I'm not sure whether that will ever happen.

In my opinion, this works now, case closed.

See also http://en.wikipedia.org/wiki/Unicode_normalization#Normalization
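The three results in the comment above can be reproduced outside MediaWiki with a short Python sketch. This is only an illustration, not the code path MediaWiki uses: it assumes the titles are UTF-8 percent-encoded and that "normalization (C)" means Unicode NFC. The helper name `encode_title` is hypothetical.

```python
import unicodedata
from urllib.parse import quote


def encode_title(title: str, normalize: bool = True) -> str:
    """Percent-encode a title as UTF-8, optionally applying NFC first.

    Hypothetical helper for illustration only.
    """
    if normalize:
        title = unicodedata.normalize("NFC", title)
    # safe="" so that every non-unreserved character is percent-encoded
    return quote(title, safe="")


samples = {
    "precomposed U+00C5": "\u00C5land",
    "A + combining ring U+030A": "A\u030Aland",
    "angstrom sign U+212B": "\u212Bngstrom",
}

for label, text in samples.items():
    raw = encode_title(text, normalize=False)
    nfc = encode_title(text)
    print(f"{label}: {raw} -> {nfc}")

# One way to probe the "multiple combining characters" case: the same
# two marks attached to a base letter in different orders.
dot_below_first = "a\u0323\u0307"
dot_above_first = "a\u0307\u0323"
print(encode_title(dot_below_first) == encode_title(dot_above_first))
```

The last comparison is a quick check for the open question about multiple combining characters; it covers only one pair of marks and is not a general proof.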
