https://bugzilla.wikimedia.org/show_bug.cgi?id=36439

--- Comment #11 from [email protected] 2012-07-04 17:30:51 UTC ---
Some results from normalization:

Source   - encoded          - normalized    - comment
Åland    - %C3%85land       - %C3%85land    - precomposed code point for the character (U+00C5)
Åland    - A%CC%8Aland      - %C3%85land    - A followed by combining ring above (U+030A)
Ångstrom - %E2%84%ABngstrom - %C3%85ngstrom - the initial letter is the code point for a unit (U+212B, angstrom sign)

So it seems that our current normalization (form C) rewrites a capital letter A
followed by a combining ring above into the single precomposed code point.

"Characters are decomposed and then recomposed by canonical equivalence."
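
The table above can be reproduced outside MediaWiki with a short sketch. This is not the code MediaWiki runs; it assumes Python's standard unicodedata and urllib.parse modules behave like our form C normalization for these inputs, and normalize_encoded is a hypothetical helper name.

```python
import unicodedata
from urllib.parse import quote, unquote


def normalize_encoded(encoded: str) -> str:
    """Decode a percent-encoded title, apply NFC, and percent-encode it again."""
    decoded = unquote(encoded)  # bytes are interpreted as UTF-8 by default
    normalized = unicodedata.normalize("NFC", decoded)
    return quote(normalized)  # re-encode the normalized text as UTF-8


# The three encoded inputs from the table above.
for encoded in ("%C3%85land", "A%CC%8Aland", "%E2%84%ABngstrom"):
    print(encoded, "->", normalize_encoded(encoded))
```

All three inputs should come out starting with %C3%85, matching the "normalized" column.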

It seems like it will only fail in cases with multiple combining characters, but
I'm not sure if that will ever happen.
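
One way to probe the multiple-combining-characters case is to feed the same marks in two different orders and compare the results. This is only a sketch using Python's unicodedata module, not our normalizer, and the sample letter (a with dot below and circumflex) is an arbitrary choice of mine.

```python
import unicodedata

# The same base letter and the same two combining marks, in two different orders.
dot_then_circumflex = "a\u0323\u0302"  # a + combining dot below + combining circumflex
circumflex_then_dot = "a\u0302\u0323"  # a + combining circumflex + combining dot below

nfc_a = unicodedata.normalize("NFC", dot_then_circumflex)
nfc_b = unicodedata.normalize("NFC", circumflex_then_dot)

# Report whether both orders agree, and which code points the result contains.
print(nfc_a == nfc_b, [f"U+{ord(c):04X}" for c in nfc_a])
```

If both orders give the same string, then at least this combination is handled; it does not prove anything about every possible sequence.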

In my opinion, this works now, case closed.

See also http://en.wikipedia.org/wiki/Unicode_normalization#Normalization
