s/they/emojis/

On Fri, Oct 15, 2021 at 2:12 PM Jaime Crespo <[email protected]> wrote:

> I don't want to defend MySQL development decisions- in fact PHP made some
> similarly bad ones, but it would be unfair to judge them too harshly with
> the "power of hindsight" [0]- but... /pedantic on
>
> On Thu, Oct 14, 2021 at 7:37 PM Roy Smith <[email protected]> wrote:
>
>> What part of "universal" did they not understand?
>>
>
> ... several years ago, around the end of the last century and the start
> of this one, hardly anyone used UTF-8 [1] and PHP didn't even support
> multi-byte strings. The original spec for UTF-8 called for up to 6 bytes
> [2]. The BMP, however (up to 3 bytes), contained characters for most
> modern languages [3]. Supporting the full 6 bytes would have cost space
> and performance, because at the time MySQL worked much faster with
> fixed-width columns, which reserve the maximum width for every character
> (double the space!). My guess is that someone said "this is probably good
> enough". And would it be too outrageous to think that we may not need as
> many extra characters as stars in our Galaxy, when fewer than 65K were
> practically needed?
>
> 3 things changed after that:
> * Unicode limited UTF-8 to encoding 21 bits in 2003 [4], requiring at
> most 4 bytes- only one more than MySQL's utf8
> * Apple wanted to sell iPhones in Japan, so they were added to Unicode in
> 2010, and went on to become very popular
> * MySQL/InnoDB has been highly optimized for the fast handling of
> variable-length strings
>
> However, you cannot just arbitrarily break backwards compatibility and
> change the meaning of a configuration option- especially with storage
> software that has been continuously supporting incremental upgrades for
> as long as I can remember. What you can do is support the new standard,
> encourage its usage, make it the default, etc.
>
> This is a bit off-topic here (feel free to PM to continue the
> conversation), and just to be clear, I am _not fully justifying the
> decisions_, just giving historical context, but I want to end with some
> lessons relevant to the list:
>
> * It is very difficult to build future-proof applications- PHP, MySQL,
> MediaWiki, they have a long history and we should be gentle when we judge
> them from the future. In my work, which involves backups, even keeping
> data stored unchanged for over 5 years is sometimes challenging,
> because encryption algorithms are found to be weak, or end up being
> unsupported/unavailable after just 2 releases of the operating system!
> * Standards also change; they are not as "universal" as we may want to
> believe (there have been 32 extra Unicode versions since 1991). I expect
> that new collations, not implemented today, will be needed in the future
> too.
> * It is ok to make "mistakes", as long as we learn from them and improve
> upon them :-)
>
> Sorry for the text block.
>
> [0] <url:https://powerlisting.fandom.com/wiki/Hindsight>
> [1] <url:https://commons.wikimedia.org/wiki/File:Utf8webgrowth.svg>
> [2] <url:https://www.rfc-editor.org/rfc/rfc2279>
> [3] <url:https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane>
> [4] <url:https://www.rfc-editor.org/rfc/rfc3629>
>
>
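
For anyone who wants to check the byte counts mentioned in the quoted
message, here is a rough Python sketch. It only uses the standard
str.encode() call; the helper names (utf8_len, fits_in_three_bytes) are
made up for illustration and are not part of MySQL, PHP or MediaWiki. The
3-byte limit is the one the message attributes to MySQL's utf8.

```python
# Sketch: how many bytes UTF-8 needs per character, and whether a string
# would fit under a 3-byte-per-character limit (i.e. stays inside the BMP).
# Helper names are hypothetical, chosen only for this example.

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def utf8_len(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))


def fits_in_three_bytes(text: str) -> bool:
    """True if every character is in the BMP, so needs at most 3 bytes."""
    return all(ord(ch) <= BMP_MAX for ch in text)


if __name__ == "__main__":
    # Latin letter, accented letter, CJK ideograph, emoji (outside the BMP)
    for ch in ["A", "\u00e9", "\u65e5", "\U0001F600"]:
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), "
              f"fits in 3 bytes: {fits_in_three_bytes(ch)}")
```

The first three characters encode to 1, 2 and 3 bytes; the emoji needs 4,
which is why it does not fit under a 3-byte limit.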

-- 
Jaime Crespo
<http://wikimedia.org>
_______________________________________________
Wikitech-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
