--- Comment #8 from Platonides <platoni...@gmail.com> 2010-02-12 23:48:54 UTC
(In reply to comment #7)
> >Java internally uses UTF-16
> yes it does, but I think the file is interpreted as utf-8, otherwise it
> wouldn't be able to make sense of it at all, as utf-8 and utf-16 look fairly
> different for your average english text (I'm under the impression that utf-16
> is not compatible with ASCII thus nothing would work at all if it was using
Right. But the character could be overflowing Java's 16-bit char, or there could be some other failure.
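To illustrate both points, here is a minimal Java sketch. It is not taken from the code under discussion: the class name Utf16Demo, its helper methods, and the sample code point U+1D11E are my own assumptions, chosen only for demonstration. It shows that an ASCII character such as & takes one byte in UTF-8 but two in UTF-16, and that a character outside the Basic Multilingual Plane does not fit in a single 16-bit char and is stored as a surrogate pair.

```java
import java.nio.charset.StandardCharsets;

public class Utf16Demo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies as UTF-16 (big-endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Build a String from a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // U+0026 is plain ASCII: one byte in UTF-8, two in UTF-16.
        String amp = "&";
        System.out.println(utf8Length(amp));   // 1
        System.out.println(utf16Length(amp));  // 2

        // U+1D11E is an arbitrary example of a character above U+FFFF.
        String clef = fromCodePoint(0x1D11E);
        System.out.println(clef.length());                           // 2 (chars)
        System.out.println(clef.codePointCount(0, clef.length()));   // 1 (code point)
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
    }
}
```

Code that walks such a string one char at a time sees two surrogate halves rather than one character, which is one way a 16-bit assumption can go wrong. Whether that is what happens here is only a guess.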
> >I don't see why it is reading a U+26 (100110).
> The entity references that come after the problematic Unicode character are
> where the U+26 (&) comes from.
Interesting. Saving from Firefox produced a literal " in the output.
> I'm thinking this is a bug with the underlying java libraries, as opposed to
I also think so.