https://bugzilla.wikimedia.org/show_bug.cgi?id=53754
Siddhartha Ghai <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
            Summary|Marathi/Devanagari:Backspac |VisualEditor: Devanagari:
                   |e deletes combined          |Backspace deletes combined
                   |character clusters together |character clusters together
                   |with anuswara diacritics    |with diacritics

--- Comment #1 from Siddhartha Ghai <[email protected]> ---
Per Bug 51472#c4, grapheme cluster handling for backspace is to be decided on a per-script basis. So this should be treated as the bug specifically for Devanagari. Also note that I am confirming the bug for Hindi.

To further clarify the original report: Devanagari has various diacritics which can be applied to base Unicode characters. It also has a combining character, halant (viram) ् (U+094D).

Currently, pressing backspace after a grapheme cluster containing one or more base characters with one or more diacritics and/or combining characters deletes the entire grapheme cluster. This is not the desired behaviour.

Pressing delete before a cluster deletes the entire cluster. This is the desired behaviour.

Examples of diacritics:
  ँ (Chandrabindu) U+0901
  ं (Bindu) U+0902
  etc.

Examples of grapheme clusters:
  One base character with one diacritic:
    कं ( क + ं ), कँ ( क + ँ ), कः ( क + ः )
  One base character with multiple diacritics:
    किं ( क + ि + ं )
  Multiple base characters with halant:
    श्र ( श + ् + र ), क्ष ( क + ् + ष ), प्र ( प + ् + र )
  Multiple base characters with halant followed by diacritics:
    श्रिं ( श + ् + र + ि + ं ), क्षि ( क + ् + ष + ि ), प्रे ( प + ् + र + े )

System environment:
  Win7 X64
  Google Chrome 29.0.1547.62 m

Page used for testing: [[:w:hi:User:Siddhartha Ghai/sandbox]]

Expected behaviour:
Only one diacritic (the last one in the grapheme cluster), i.e. one Unicode character, is to be deleted. The rest of the grapheme cluster is to stay intact.
Examples used (not exhaustive):

  Grapheme -> Grapheme after pressing backspace
  कं -> क
  कँ -> क
  कः -> क
  क् -> क
  किं -> कि
  श्र -> श्
  क्ष -> क्
  प्र -> प्
  श्रिं -> श्रि
  क्षि -> क्ष
  प्रे -> प्र

Current behaviour (blank indicates the entire grapheme cluster was removed; these results should be verified on other browser/OS combinations):

  कं ->
  कँ ->
  कः ->
  क् ->
  किं ->
  श्र -> श् (Working correctly)
  क्ष -> क् (Working correctly)
  प्र -> प् (Working correctly)
  श्रिं -> श् (Deletes र + ि + ं , i.e. three Unicode characters instead of one)
  क्षि -> क् (Deletes ष + ि , i.e. two Unicode characters instead of one)
  प्रे -> प् (Deletes र + े , i.e. two Unicode characters instead of one)

Points to note:

Some IMEs may provide non-normalized input for characters, such as फ़ (U+095E) in place of फ (U+092B) + ़ (U+093C), or ढ़ (U+095D) in place of ढ (U+0922) + ़ (U+093C). In such cases, the user may expect that pressing backspace will remove only the diacritic, not the entire grapheme. So VE may have to handle normalization in such cases.

The results seem to indicate that halant is handled only partially correctly. Letter + halant + letter + backspace correctly gives letter + halant. But letter + halant + backspace, instead of giving the letter, deletes the entire grapheme.

The remaining diacritics as of Unicode 3.0 come under Nonspacing mark (Mn) and Spacing combining mark (Mc). (Note: This does not include Devanagari Extended, added in Unicode 6.0, or Vedic Extensions, added in Unicode 6.1.)
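
For illustration only, the expected behaviour and the normalization point above can be sketched in Python. This is not VisualEditor code: `backspace_one_code_point` and `is_combining_mark` are hypothetical names made up for this sketch. It assumes that canonically decomposing (NFD) only the last code point is an acceptable way to treat precomposed letters such as U+095E, and it covers only the main Devanagari block (U+0900 to U+097F).

```python
import unicodedata

# Main Devanagari block only. Devanagari Extended and Vedic Extensions
# are deliberately left out of this sketch (assumption).
DEVANAGARI_FIRST = "\u0900"
DEVANAGARI_LAST = "\u097f"


def is_combining_mark(ch: str) -> bool:
    """Return True for Nonspacing (Mn) and Spacing combining (Mc) marks."""
    return unicodedata.category(ch) in ("Mn", "Mc")


def backspace_one_code_point(text: str) -> str:
    """Hypothetical helper: remove one Unicode code point from the end.

    If the last code point is a precomposed Devanagari letter, it is
    decomposed first, so that only the trailing diacritic is removed.
    """
    if not text:
        return text
    last = text[-1]  # Python strings are indexed by code point
    if DEVANAGARI_FIRST <= last <= DEVANAGARI_LAST:
        # NFD splits precomposed nukta letters: U+095E -> U+092B U+093C.
        # Code points without a decomposition are returned unchanged.
        last = unicodedata.normalize("NFD", last)
    # Drop exactly one code point; everything before it stays intact.
    return text[:-1] + last[:-1]


if __name__ == "__main__":
    samples = [
        "\u0915\u0902",                    # KA + BINDU
        "\u0915\u094d",                    # KA + HALANT
        "\u0936\u094d\u0930\u093f\u0902",  # SHA + HALANT + RA + I + BINDU
        "\u095e",                          # precomposed FA
    ]
    for sample in samples:
        result = backspace_one_code_point(sample)
        before = " ".join(f"U+{ord(c):04X}" for c in sample)
        after = " ".join(f"U+{ord(c):04X}" for c in result)
        print(f"{before} -> {after}")
```

Decomposing only the final code point leaves the rest of the text byte-for-byte unchanged, which seems safer than normalizing the whole document on every keypress. Whether VE should normalize at input time instead is a separate design question.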
