On Wed, Dec 09, 2009 at 08:36:59AM -0800, Aaron Patterson wrote:
> On Wed, Dec 9, 2009 at 7:54 AM, Daniel Veillard <[email protected]> wrote:
> > On Sat, Dec 05, 2009 at 11:03:26AM -0800, Aaron Patterson wrote:
> >> Hey everyone,
> >>
> >> It looks like sometimes there is unexpected behavior when parsing with
> >> XML_PARSE_NOBLANKS. It seems that sometimes blank nodes will get
> >> included in the resulting tree. I don't think this is expected
> >
> > If libxml2 detected a non-blank text node at the same level it
> > will keep all further text nodes, assuming a mixed content element.
>
> Understood. Thank you!

This tends to surprise people, but since blank node elimination without
having read the DTD is a pure heuristic, the parser tries to be as safe
as possible (though it cannot go back and revisit nodes it has already
parsed). XML_PARSE_NOBLANKS is a deviation from the normal parsing
behaviour, so in general I suggest avoiding it and simply ignoring the
nodes you know are purely formatting; the parser can't guess that 100%.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine  http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
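
To illustrate the suggestion of ignoring formatting-only nodes yourself rather than relying on XML_PARSE_NOBLANKS, here is a minimal sketch. It uses Python's standard xml.dom.minidom instead of libxml2, purely to keep the example self-contained; the helper names is_blank_text and significant_children are hypothetical and not part of any library. In libxml2's C API, xmlIsBlankNode() can play a similar role when walking the tree.

```python
from xml.dom import minidom


def is_blank_text(node):
    """Return True for a text node that contains only whitespace."""
    return node.nodeType == node.TEXT_NODE and node.data.strip() == ""


def significant_children(element):
    """Return the children of `element`, skipping whitespace-only text nodes.

    The caller decides which elements hold pure formatting whitespace;
    do not use this on mixed-content elements where whitespace matters.
    """
    return [child for child in element.childNodes if not is_blank_text(child)]


# Parse normally, keeping every node, then filter while walking the tree.
doc = minidom.parseString("<root>\n  <a>1</a>\n  <b>2</b>\n</root>")
root = doc.documentElement
names = [child.nodeName for child in significant_children(root)]
print(names)
```

The point of the design is that the application, which knows its own document structure, makes the decision about which whitespace is insignificant, instead of asking the parser to guess.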
