On Sat, Jan 27, 2007 at 12:34:21AM +0000, Mike Kneller wrote:
> I am not sure if I have located a bug or not....
>
> Using Python (2.4) and libxml2 2.6.22
>
> When I load a document containing an entity, if I attempt to read
> the value of a node containing an entity, I get the text content and
> the entity disappears.
> In the following example, when looking at root.content I would expect
> to see '©2007', instead all I get is '2007'.
>
> I was advised on the #XML IRC channel to construct a simple test
> case, so here it is:
>
On IRC you said the entity was defined in the internal subset; it's not.

> File 1: test.xml
>
> <?xml version="1.0"?>
> <!DOCTYPE content [
> <!ENTITY % HTMLlat1 PUBLIC
> "-//W3C//ENTITIES Latin 1 for XHTML//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
> %HTMLlat1;
> ]>
> <content>
> <p>&copy;2007</p>
> </content>

libxml2 doesn't load the external subset by default.

> File 2: testcase.py
>
> import libxml2
> sourcedoc = libxml2.parseFile( 'test.xml' )
> root = sourcedoc.getRootElement()
> print root.serialize()
> print root.content

So the <p> inside your content element has 2 children: an entity reference
to copy, whose content is unknown, and the text node with "2007".

>
> Reading the source for libxml2.py, I find the following:
> def getContent(self):
>     """Read the value of a node, this can be either the text
>        carried directly by this node if it's a TEXT node or the
>        aggregate string of the values carried by this node
>        child's (TEXT and ENTITY_REF). Entity references are
>        substituted. """
>     ret = libxml2mod.xmlNodeGetContent(self._o)
>     return ret
>
>
> Which in my (admittedly limited) understanding I would have thought
> would return the translated entity as well as the text when I examine
> root.content.
>
> Is this a bug, or am I doing something wrong?

You are not asking to load the external subset. Use readFile and pass the
XML_PARSE_DTDLOAD option. It should work even with the ancient version
2.6.22, but please upgrade first in case of problems.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
                     | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
