Hi Chris,

On 20 November 2018 03:39:02 GMT+05:30, Chris Koerner <[email protected]> 
wrote:
>
>== Did you know? ==

Thanks for the informative "Did you know?" section. It was an interesting read. :-)


>* Letters are encoded internally by computers as numbers—for example,
>“A” is 65 and “a” is 97.[3] Years ago, programs and even websites
>would use different encodings[4] to represent text, often leading to
>unreadable gibberish on screen. Unicode[5] was intended to be a single
>encoding for most of the world’s writing systems. The most-used parts
>of it fit into a 16-bit representation,[6] which can handle about 65
>thousand characters. But that's not enough for the very large number
>of rare and historical Chinese, Japanese, and Korean (CJK) characters,
>which are represented in 16-bit Unicode using “surrogate pairs”.[7]
>1,024 Unicode characters are set aside to be “high surrogates”—the
>first half of a 32-bit character—and 1,024 characters are set aside to
>be “low surrogates”—the second half. By themselves, the surrogates
>aren’t valid and don’t represent anything, but in pairs they can
>represent over a million additional characters. Since these characters
>are usually rare, software can sometimes treat them incorrectly and split
>them up, which can result in you seeing the Unicode replacement
>character �,[8] which is used when something has gone wrong processing
>Unicode text. (When the character is fine, but you don’t have a font
>to show it, you sometimes get little squares instead. Since the most
>common source of these squares for English speakers is unrepresented
>CJK characters, a slang term for the squares is “tofu”.[9])
>
>[0] https://phabricator.wikimedia.org/T168427
>[1] https://phabricator.wikimedia.org/T209293
>[2] https://phabricator.wikimedia.org/T209156
>[3] https://en.wikipedia.org/wiki/ASCII#Printable_characters
>[4]
>https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings
>[5] https://en.wikipedia.org/wiki/Unicode
>[6] https://en.wikipedia.org/wiki/UTF-16
>[7]
>https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Surrogates
>[8]
>https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
>[9] https://en.wiktionary.org/wiki/tofu#Noun
>
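
For anyone who wants to see the surrogate-pair arithmetic described above in
concrete terms, here is a minimal Python sketch. The function names
(to_surrogate_pair, from_surrogate_pair) are my own, chosen for illustration
and not taken from any library; the constants follow the UTF-16 scheme, where
high surrogates start at 0xD800 and low surrogates at 0xDC00.

```python
HIGH_START = 0xD800  # first of the 1,024 high surrogates
LOW_START = 0xDC00   # first of the 1,024 low surrogates


def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = HIGH_START + (offset >> 10)     # top 10 bits
    low = LOW_START + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# U+20000 is the first character of the CJK Unified Ideographs Extension B block.
high, low = to_surrogate_pair(0x20000)
print(hex(high), hex(low))

# 1,024 high surrogates times 1,024 low surrogates: the "over a million" above.
print(1024 * 1024)
```

If a program cuts a string between the two halves, each half on its own is
not a valid character, which is one way the replacement character ends up on
screen.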


-- 
Sivaraam

Sent from my Android device with K-9 Mail. Please excuse my brevity.

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
