I also always really enjoy these, thanks!🐙

On Wed, 21 Nov 2018 at 04:41, Kaartic Sivaraam <
[email protected]> wrote:

> Hi Chris,
>
> On 20 November 2018 03:39:02 GMT+05:30, Chris Koerner <
> [email protected]> wrote:
> >
> >== Did you know? ==
>
> Thanks for the informative did you know section. It was an interesting
> read. :-)
>
>
> >* Letters are encoded internally by computers as numbers—for example,
> >“A” is 65 and “a” is 97.[3] Years ago, programs and even websites
> >would use different encodings[4] to represent text, often leading to
> >unreadable gibberish on screen. Unicode[5] was intended to be a single
> >encoding for most of the world’s writing systems. The most-used parts
> >of it fit into a 16-bit representation,[6] which can handle about 65
> >thousand characters. But that's not enough for the very large number
> >of rare and historical Chinese, Japanese, and Korean (CJK) characters,
> >which are represented in 16-bit Unicode using “surrogate pairs”.[7]
> >1,024 Unicode characters are set aside to be “high surrogates”—the
> >first half of a 32-bit character—and 1,024 characters are set aside to
> >be “low surrogates”—the second half. By themselves, the surrogates
> >aren’t valid and don’t represent anything, but in pairs they can
> >represent over a million additional characters. Since these characters
> >are usually rare, software can sometimes treat them incorrectly split
> >them up, which can result in you seeing the Unicode replacement
> >character ïżœ,[8] which is used when something has gone wrong processing
> >Unicode text. (When the character is fine, but you don’t have a font
> >to show it, you sometimes get little squares instead. Since the most
> >common source of these squares for English speakers is unrepresented
> >CJK characters, a slang term for the squares is “tofu”.[9])
> >
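
A quick illustration of the surrogate-pair arithmetic described above, as a
minimal Python sketch. The helper names (to_surrogate_pair,
from_surrogate_pair) and the sample code point U+20000 are my own choices for
illustration, not anything from the original post or from MediaWiki.

```python
from __future__ import annotations


def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 (high, low) pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000       # a 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits: one of 1,024 high surrogates
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits: one of 1,024 low surrogates
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Letters are stored as numbers: "A" is 65 and "a" is 97.
assert ord("A") == 65 and ord("a") == 97

# 1,024 x 1,024 pairs cover every code point from U+10000 to U+10FFFF.
assert 1024 * 1024 == 0x10FFFF - 0x10000 + 1

# U+20000 is a CJK ideograph that does not fit in 16 bits.
high, low = to_surrogate_pair(0x20000)
print(hex(high), hex(low))

# Python's built-in UTF-16 encoder produces the same two 16-bit units.
encoded = chr(0x20000).encode("utf-16-be")
assert encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")

# Cutting the pair in half leaves a lone high surrogate; a lenient decoder
# substitutes the replacement character U+FFFD for it.
broken = encoded[:2].decode("utf-16-be", errors="replace")
print(repr(broken))
```

The last step mimics software that truncates a string between the two halves
of a pair, which is one way the replacement character ends up on screen.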
> >[0] https://phabricator.wikimedia.org/T168427
> >[1] https://phabricator.wikimedia.org/T209293
> >[2] https://phabricator.wikimedia.org/T209156
> >[3] https://en.wikipedia.org/wiki/ASCII#Printable_characters
> >[4] https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings
> >[5] https://en.wikipedia.org/wiki/Unicode
> >[6] https://en.wikipedia.org/wiki/UTF-16
> >[7] https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Surrogates
> >[8] https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
> >[9] https://en.wiktionary.org/wiki/tofu#Noun
> >
>
>
> --
> Sivaraam
>
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l



-- 
Charlie Kritschmar
UX-Design/Research

Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de

Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment.

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.