On Mon, Mar 31, 2008 at 09:45:49AM +0100, Julien Chaffraix wrote:
> Hi everyone,
>
> I have an application that has to parse a "content" ( ::= (element |
> CharData | Reference | CDSect | PI | Comment)* as specified in the
> libxml documentation).
> Currently we are using xmlParseBalancedChunkMemory to parse it but it
> has induced code duplication (mainly due to the fact that we cannot
> tune the behavior with a xmlParserCtxt).
> I am trying to find a replacement for that API that should match the
> behaviour of xmlParseBalancedChunkMemory (we do not provide xmlDocPtr
> and xmlNodePtr as we build the representation ourselves using SAX2
> callbacks).
> Looking at the documentation, I found 3 candidates:
> - xmlParseBalancedChunkMemoryRecover
> - xmlParseInNodeContext
> - xmlParseContext
>
> First, have I found all the candidates? (I am quite new to libxml so
> it is likely that I have missed some)

That looks right to me.

> Then, is there a way to choose between them so that I have a behavior
> as close to xmlParseBalancedChunkMemory's as possible by providing a
> well-crafted xmlParserCtxt (a pointer about which type to use / how to
> initialize it would also be appreciated)?

The problem is that what you are trying to do is not specified in the
spec as normal XML parsing: all the spec defines is how to parse a
document, not a subset. Since the spec is basically there for
interoperability, there is a good reason to try to enforce this, and I
consider that normal except maybe for applications like editors.

The fact that you use SAX makes your request look a bit suspicious,
actually: your application seems to be trying to do something which is
not interoperable, and not surprisingly it's harder to do with the
existing APIs...

The only other thing I can think of would be for you to set up a
complete parser context and call xmlParseContent(), then do the parser
cleanup at the end. It's really low level and requires more knowledge of
the parser internals, but I guess that's the price to pay for an a
priori non-conformant behaviour.

There are many things which are contextual when parsing an XML fragment,
and you will have to recreate that context or you won't parse things
properly (e.g. namespaces).

Daniel

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
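
To make the namespace remark concrete, here is a minimal, library-independent sketch in C of why a fragment cannot be interpreted without its surrounding context. The type `ns_binding` and the function `resolve_prefix` are hypothetical names made up for this illustration. They are not libxml2 APIs, and this is not how libxml2 tracks namespaces internally. The sketch only shows that a prefix used inside a fragment resolves against declarations made by ancestors that lie outside the fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration only; not part of libxml2. */
typedef struct {
    const char *prefix; /* "" stands for the default namespace */
    const char *uri;    /* namespace name bound to that prefix */
} ns_binding;

/*
 * Resolve `prefix` against the in-scope bindings in `scope`, which are
 * ordered from outermost to innermost declaration. The search runs
 * backwards so that an inner declaration shadows an outer one.
 * Returns the bound URI, or NULL when the prefix is not in scope.
 */
const char *resolve_prefix(const ns_binding *scope, size_t n,
                           const char *prefix)
{
    assert(prefix != NULL);
    while (n > 0) {
        n--;
        if (strcmp(scope[n].prefix, prefix) == 0)
            return scope[n].uri;
    }
    return NULL;
}
```

For a fragment such as `<svg:rect/>` that carries no `xmlns:svg` attribute of its own, calling `resolve_prefix(NULL, 0, "svg")` returns NULL, because nothing is in scope when the fragment is taken in isolation. The same call succeeds once the caller passes an array holding the bindings that the enclosing elements would have declared. That array is the kind of context the caller has to rebuild by hand before parsing a bare fragment.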
