I also always really enjoy these, thanks!

On Wed, 21 Nov 2018 at 04:41, Kaartic Sivaraam <[email protected]> wrote:
> Hi Chris, > > On 20 November 2018 03:39:02 GMT+05:30, Chris Koerner < > [email protected]> wrote: > > > >== Did you know? == > > Thanks for the informative did you know section. It was an interesting > read. :-) > > > >* Letters are encoded internally by computers as numbersâfor example, > >âAâ is 65 and âaâ is 97.[3] Years ago, programs and even websites > >would use different encodings[4] to represent text, often leading to > >unreadable gibberish on screen. Unicode[5] was intended to be a single > >encoding for most of the worldâs writing systems. The most-used parts > >of it fit into a 16-bit representation,[6] which can handle about 65 > >thousand characters. But that's not enough for the very large number > >of rare and historical Chinese, Japanese, and Korean (CJK) characters, > >which are represented in 16-bit Unicode using âsurrogate pairsâ.[7] > >1,024 Unicode characters are set aside to be âhigh surrogatesââthe > >first half of a 32-bit characterâand 1,024 characters are set aside to > >be âlow surrogatesââthe second half. By themselves, the surrogates > >arenât valid and donât represent anything, but in pairs they can > >represent over a million additional characters. Since these characters > >are usually rare, software can sometimes treat them incorrectly split > >them up, which can result in you seeing the Unicode replacement > >character ïżœ,[8] which is used when something has gone wrong processing > >Unicode text. (When the character is fine, but you donât have a font > >to show it, you sometimes get little squares instead. 
Since the most > >common source of these squares for English speakers is unrepresented > >CJK characters, a slang term for the squares is âtofuâ.[9]) > > > >[0] https://phabricator.wikimedia.org/T168427 > >[1] https://phabricator.wikimedia.org/T209293 > >[2] https://phabricator.wikimedia.org/T209156 > >[3] https://en.wikipedia.org/wiki/ASCII#Printable_characters > >[4] > > > https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings > >[5] https://en.wikipedia.org/wiki/Unicode > >[6] https://en.wikipedia.org/wiki/UTF-16 > >[7] > > > https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Surrogates > >[8] > > > https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character > >[9] https://en.wiktionary.org/wiki/tofu#Noun > > > > > -- > Sivaraam > > Sent from my Android device with K-9 Mail. Please excuse my brevity. > > _______________________________________________ > Wikitech-l mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/wikitech-l -- Charlie Kritschmar UX-Design/Research Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin Phone: +49 (0)30 219 158 26-0 http://wikimedia.de Imagine a world, in which every single human being can freely share in the sum of all knowledge. Thatâs our commitment. Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnĂŒtzig anerkannt durch das Finanzamt fĂŒr Körperschaften I Berlin, Steuernummer 27/029/42207. _______________________________________________ Wikitech-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikitech-l
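
For anyone curious how the surrogate-pair arithmetic in the quoted "Did you know?" item works out, here is a minimal Python sketch. It assumes the standard UTF-16 ranges (high surrogates U+D800 to U+DBFF, low surrogates U+DC00 to U+DFFF); the function names are made up for illustration and do not come from any MediaWiki code.

```python
from typing import Tuple

HIGH_START = 0xD800   # first of the 1,024 high surrogates
LOW_START = 0xDC00    # first of the 1,024 low surrogates
SUPPLEMENTARY_BASE = 0x10000  # first code point that needs a pair


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into (high, low) surrogates."""
    if not SUPPLEMENTARY_BASE <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - SUPPLEMENTARY_BASE   # 20-bit value
    high = HIGH_START + (offset >> 10)         # top 10 bits
    low = LOW_START + (offset & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return SUPPLEMENTARY_BASE + ((high - HIGH_START) << 10) + (low - LOW_START)


def decode_utf16_be_lenient(data: bytes) -> str:
    """Decode big-endian UTF-16, substituting U+FFFD for broken input."""
    return data.decode("utf-16-be", errors="replace")


if __name__ == "__main__":
    # U+20000 is a rare CJK ideograph outside the 16-bit range.
    high, low = to_surrogate_pair(0x20000)
    print(hex(high), hex(low))
    # 1,024 high x 1,024 low surrogates: "over a million" combinations.
    print(1024 * 1024)
    # Cutting the text after the high surrogate splits the pair.
    encoded = "A\U00020000".encode("utf-16-be")
    print(repr(decode_utf16_be_lenient(encoded[:4])))
```

The last call drops the low surrogate on purpose, which is the kind of split that leaves a replacement character on screen.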
