I wonder whether the HTML actually declares itself
as UTF-8. If it contains a lot of multi-byte UTF-8
characters but declares some other 8-bit encoding,
then I think you'd get the situation you describe.

Can you show us the original HTML?
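
In the meantime, here is a rough sketch of the check I have in mind,
in Python rather than anything libxml2-specific. The helper names
(declared_charset, looks_like_utf8, encoding_mismatch) are made up for
illustration, and the regex only looks for a charset in a <meta> tag
near the top of the file, so treat it as an approximation rather than
what the parser really does.

```python
import re

# Hypothetical helpers for illustration only; not part of libxml2.
# Matches charset=... inside a <meta> tag, quoted or unquoted.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def declared_charset(raw: bytes):
    """Return the charset named in a <meta> tag, lowercased, or None."""
    # Assumption: the declaration sits within the first 1024 bytes.
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def looks_like_utf8(raw: bytes) -> bool:
    """True if the bytes contain non-ASCII data that decodes as UTF-8."""
    if all(b < 0x80 for b in raw):
        return False  # pure ASCII tells us nothing either way
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def encoding_mismatch(raw: bytes) -> bool:
    """True if the content looks like UTF-8 but declares something else."""
    declared = declared_charset(raw)
    return looks_like_utf8(raw) and declared not in (None, "utf-8", "utf8")
```

If encoding_mismatch() returns True for the raw bytes of your file,
that would fit the theory: each multi-byte sequence would be read as
two or more separate 8-bit characters.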

--
Sebastian Rahtz
Information Manager, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

I only ask of God
that I not be indifferent to the future
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
