--- Comment #8 from Platonides <> 2010-02-12 23:48:54 UTC 
(In reply to comment #7)
> >Java internally uses UTF-16
> Yes it does, but I think the file is interpreted as UTF-8, otherwise it
> wouldn't be able to make sense of it at all, as UTF-8 and UTF-16 look fairly
> different for your average English text (I'm under the impression that UTF-16
> is not compatible with ASCII, thus nothing would work at all if it was using
> UTF-16).

Right. But the character could be overflowing 16 bits (a single UTF-16 code
unit), or there could be some other failure.
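
To illustrate the 16-bit concern, here is a minimal sketch of how a character outside the Basic Multilingual Plane behaves once Java decodes it from UTF-8. It assumes the problematic character is such a supplementary character, which this thread does not confirm. U+10400 is picked only as an example, and the class and method names are hypothetical, not part of mwdumper.

```java
import java.nio.charset.StandardCharsets;

public class SurrogateDemo {
    // Decode UTF-8 bytes into a Java String, which stores text as UTF-16 code units.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+10400 encoded in UTF-8 (example character, not the one from the dump).
        byte[] utf8 = {(byte) 0xF0, (byte) 0x90, (byte) 0x90, (byte) 0x80};
        String s = decodeUtf8(utf8);

        // One code point, but two 16-bit chars (a surrogate pair).
        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points:       " + s.codePointCount(0, s.length()));
        System.out.println("code point:        U+"
                + Integer.toHexString(s.codePointAt(0)).toUpperCase());
        System.out.println("first char is high surrogate: "
                + Character.isHighSurrogate(s.charAt(0)));
    }
}
```

Code that walks such a string one char at a time, without checking for surrogates, sees two separate 16-bit values instead of one character. That is one way a parser could lose its place in the input.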

> >I don't see why it is reading a U+0026 (100110).
> The entity references that come after the problematic Unicode character are
> where the U+0026 (&) comes from.
Interesting. Saving from Firefox produced a literal " in the output.

> I'm thinking this is a bug with the underlying Java libraries, as opposed to
> mwdumper
I also think so.
