https://bugzilla.wikimedia.org/show_bug.cgi?id=13721


Brion Vibber <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]




--- Comment #5 from Brion Vibber <[email protected]>  2009-10-20 00:09:12 UTC ---
Ahhhh ok, I think I see the base issue: if a 2-byte or 3-byte char is cut off
at the 255-byte boundary when stored, the leftover bytes are an invalid
sequence. The XML dump outputter runs UTF-8 validation and turns the bad
sequence into a valid U+FFFD, which is 3 bytes of UTF-8, so the value ends up
over the 255-byte limit again.
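
The byte arithmetic can be reproduced with a short sketch. This is only an
illustration in Python, not the dump outputter's actual code; the sample value
and the variable names are made up for the example.

```python
# Hypothetical stored value: 254 ASCII bytes plus one 2-byte character,
# i.e. 256 bytes of UTF-8 in total.
value = "a" * 254 + "é"
raw = value.encode("utf-8")

# Naive byte-level truncation at 255 bytes cuts the 2-byte character in half,
# leaving a dangling lead byte at the end.
stored = raw[:255]

# A validating decoder replaces the dangling byte with U+FFFD.
repaired = stored.decode("utf-8", errors="replace")

# U+FFFD takes 3 bytes in UTF-8, so the repaired value exceeds 255 bytes.
repaired_len = len(repaired.encode("utf-8"))
print(len(stored), repaired_len)  # prints "255 257"
```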

Yeah, this should be fixed in our DB and MediaWiki should be smarter about
truncation, but in the meantime it should be easy to make mwdumper smarter
about this too.
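
One way "smarter truncation" could look, as a rough sketch only: cut at the
byte limit, then drop any incomplete trailing sequence instead of letting it
turn into U+FFFD later. The helper name `truncate_utf8` is hypothetical, and
this is Python for illustration, not MediaWiki's or mwdumper's code.

```python
def truncate_utf8(text: str, limit: int = 255) -> str:
    """Truncate text so its UTF-8 encoding fits in `limit` bytes.

    Hypothetical helper: it assumes `text` is a valid string, so the only
    invalid bytes after slicing are a character cut off at the very end.
    """
    raw = text.encode("utf-8")
    if len(raw) <= limit:
        return text
    # errors="ignore" silently drops the incomplete trailing sequence,
    # so the result ends on a whole character.
    return raw[:limit].decode("utf-8", errors="ignore")


# A 2-byte character straddling the limit is dropped entirely.
short = truncate_utf8("a" * 254 + "é")
print(len(short.encode("utf-8")))  # prints "254"
```

The result can be shorter than the limit by up to the length of the cut
character minus one byte, which is the price of never storing a partial
character.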


-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
