On Tue, May 03, 2005 at 10:37:07PM -0700, Abraham Nelson wrote:
> The code for the file page.html is taken from the site
> www.teleworking.gr

> What happens is that the file is parsed properly, but
> when trying to dump it an error occurs:
> 
>    output conversion failed due to conv error
>    Bytes: 0xCE 0xCE 0xCE 0xCE
>    I/O error : encoder error
> 
> I find this error odd, since I've specified the same
> output encoding as what the tree is.

paphio:~/XML -> xmllint --html http://www.teleworking.gr/
http://www.teleworking.gr/:97: HTML parser error : Unexpected end tag : style
        css += '</style>\n'
                        ^
http://www.teleworking.gr/:130: HTML parser error : Unexpected end tag : a
idth="80"><img border="0" src="images/home_space.gif" width="80" height="1"></a>
              ^
output conversion failed due to conv error
Bytes: 0xCE 0xCE 0xCE 0xCE
I/O error : encoder error
http://www.teleworking.gr/:19: element script: error : String is not UTF-8
  
There is a serious parsing failure in the script content. That appears to be
the source of the error: the output of the command pasted above stops in the
middle of the script itself, and the last message reports that the script
element's content is not UTF-8.
The input HTML page is too broken with respect to the HTML spec to be parsed
correctly, so it cannot be serialized correctly either.
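
To illustrate the "String is not UTF-8" message, here is a minimal Python
sketch (not libxml2 code) showing that the four bytes reported in the error
cannot be valid UTF-8: 0xCE is the lead byte of a two-byte sequence and must
be followed by a continuation byte in the range 0x80-0xBF, not by another
0xCE. The ISO-8859-7 decoding is only an assumption about what encoding the
page really uses, and the helper name is_valid_utf8 is made up for this
example.

```python
# Bytes reported in the "output conversion failed" message.
reported = bytes([0xCE, 0xCE, 0xCE, 0xCE])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# 0xCE followed by 0xCE is a lead byte without a continuation byte.
utf8_ok = is_valid_utf8(reported)

# Assumption: ISO-8859-7 is only a guess at the page's actual encoding.
as_greek = reported.decode("iso8859_7")

print("valid UTF-8:", utf8_ok)
print("as ISO-8859-7:", as_greek)
```

If the guess is right, the script content holds Greek text in a legacy
single-byte encoding that ended up in the tree without being converted,
which would explain why the serializer cannot convert it on output.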

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
