On Wed, Feb 23, 2005 at 09:26:09AM +1100, Malcolm Tredinnick wrote:
> On Tue, 2005-02-22 at 23:26 +0200, Bar Gam wrote:
> > Hello
>
> Hi :)
>
> > If I try to parse a document encoded in iso-8859-8 - should it be
> > converted to UTF-8, or is it supported and handled by the parser on
> > the fly? If the content should be converted (and deconverted) - what
> > method should be used in this case?
>
> Providing the document encoding is correctly specified and providing you
> have Iconv support compiled in, the conversion to UTF-8 will be done for
> you automatically as libxml2 parses the document.
>
> If the document encoding is not specified in the xml declaration at the
> top of the file (<?xml encoding="..."?>), there is a way to pass it in
> directly when using the libxml API -- this is needed because of the way
> HTTP documents have their encoding specified, for example. But I cannot
> remember the exact call off the top of my head.
>
> You can see if you have Iconv support available by looking at the output
> of 'xmllint --version'.

If iconv is not present, all the iso-8859-[1-15] encodings are compiled
into the library by default, so barring a very unusual setup the
conversion will be supported. In any case, if the encoding is not
supported, then per the spec it is a fatal error: the parser fails
immediately and delivers no data.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
[EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
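
For the manual route raised in the original question (convert to UTF-8
before parsing, then convert back afterwards), a minimal sketch might
look like the following. It uses only Python's standard codecs, not
libxml2 or iconv, and the helper names to_utf8 and from_utf8 and the
sample bytes are illustrative assumptions. As the thread says, when the
encoding is declared correctly the parser does this step for you, so
this is only relevant if you choose to transcode by hand.

```python
# Illustrative helper names; not part of libxml2 or any library API.
SOURCE_ENCODING = "iso-8859-8"


def to_utf8(raw: bytes, source_encoding: str = SOURCE_ENCODING) -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def from_utf8(raw: bytes, target_encoding: str = SOURCE_ENCODING) -> bytes:
    """Convert UTF-8 bytes back to the target encoding.

    Raises UnicodeEncodeError if a character has no mapping in the target.
    """
    return raw.decode("utf-8").encode(target_encoding)


if __name__ == "__main__":
    # The Hebrew word "shalom" as ISO-8859-8 bytes (one byte per letter).
    shalom_iso = b"\xf9\xec\xe5\xed"
    shalom_utf8 = to_utf8(shalom_iso)
    # Each Hebrew letter takes two bytes in UTF-8.
    print(len(shalom_iso), len(shalom_utf8))
    # The round trip should give back the original bytes.
    print(from_utf8(shalom_utf8) == shalom_iso)
```

Converting back can fail where converting forward cannot: UTF-8 can
represent every character, while iso-8859-8 cannot, so anything added to
the document outside that repertoire makes the reverse step raise an
error.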
