On Wed, Jan 03, 2007 at 11:53:55AM +0200, Jean Jordaan wrote:
> Hi there
>
> I'd like to find the encoding of an XML document, as detected by
> libxml2, using the Python bindings. From lxml, I can get it like this:
>
> >>> et
> <etree._ElementTree object at 0xb7cc992c>
> >>> et.docinfo.encoding
> 'windows-1252'
>
> According to the lxml API docs, lxml gets this information from libxml2 (see
> http://codespeak.net/lxml/api.html#parsers )
>
> How do I get at it without depending on lxml? The only way I've been
> able to find is using debugDumpDocumentHead, which just prints to
> stdout.
>
> >>> dh = xml.debugDumpDocumentHead(xml)
> DOCUMENT
> version=1.0
> encoding=windows-1252
> standalone=true

Hum, it's a string attached to the xmlDoc. It's available directly in C,
but there is no specific API to extract it. As a result, the autogenerated
bindings don't seem to have a way to extract the information.

Could you file a Bugzilla entry asking for that functionality? The simplest
fix is probably to provide a custom accessor function, specifically at the
Python binding level.

Daniel

-- 
Daniel Veillard, Red Hat Virtualization group http://redhat.com/virtualization/
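
Until such an accessor exists, one possible stopgap is to read the encoding named in the XML declaration with the standard library's expat parser. This is only a sketch and swaps in a different technique: it does not use libxml2 at all, and it reports the encoding the document declares, which may differ from what libxml2 would detect (for example from a byte order mark). The function name declared_encoding is made up for illustration.

```python
import xml.parsers.expat


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent.

    Hypothetical helper: uses expat from the standard library, not libxml2.
    """
    found = {}

    def on_xml_decl(version, encoding, standalone):
        # expat calls this when it parses the <?xml ... ?> declaration;
        # encoding is None when the declaration does not name one.
        found["encoding"] = encoding

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        # Only the declaration matters here, so errors in the rest of
        # the document are ignored.
        pass
    return found.get("encoding")


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="windows-1252" standalone="yes"?><root/>'
    print(declared_encoding(sample))
```

A document with no XML declaration yields None, so callers would need their own default in that case.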
