On Mon, Mar 29, 2010 at 10:25:30AM -0400, Ethan Tira-Thompson wrote:
> Hi Daniel, thanks for the feedback.
> 
> > [1]     document       ::=       prolog  element  Misc*
> > ...
> >  NEVER STACK XML DOCUMENTS
> 
> This is an unfortunate design decision.  I'm not going to close
> and reopen a network connection for each of a series of short and
> frequently-sent XML documents just so the parser can verify the eof,

  It's always a bit problematic to assume one is the first to ever
design a protocol. I assume you've heard of XMPP, aka Jabber; they
solved this 10 years ago. Send everything as one document, chunk by
chunk, and close the top element when closing the connection.
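
  A minimal sketch of that single-document approach, under stated
assumptions: it uses Python's standard xml.etree.ElementTree pull parser
(expat-based, not libxml2) purely to illustrate the idea, and the
function name stanzas and the <stream>/<msg> element names are
hypothetical. Each direct child of the still-open root element is handed
over as soon as its end tag arrives, wherever the chunk boundaries fall.

```python
import itertools
import xml.etree.ElementTree as ET


def stanzas(chunks):
    """Yield each direct child of the root element once it is complete.

    `chunks` is any iterable of str or bytes pieces of ONE long XML
    document, e.g. whatever successive reads on a connection returned.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    # The trailing None is a sentinel: once the input is exhausted we
    # close the parser and drain whatever events are still queued.
    for chunk in itertools.chain(chunks, [None]):
        if chunk is None:
            parser.close()
        else:
            parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a direct child of the root just closed
                    yield elem


if __name__ == "__main__":
    pieces = ["<stream><msg id='1'>he", "llo</msg><msg id", "='2'/>", "</stream>"]
    for stanza in stanzas(pieces):
        print(stanza.tag, stanza.get("id"), stanza.text)
```

The root element is only closed by the last chunk, so the connection can
stay open for as many messages as needed.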

> it's pedantic.  The parser knows the document (or at least the root
> node) has ended,

  It knows the top element has ended, but various things may still
follow it in the same document, like comments or PIs (the Misc* in
production [1] above).
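
  To illustrate, here is a small sketch using Python's standard
xml.etree.ElementTree (expat-based, not libxml2); the helper name
is_well_formed and the sample strings are made up for the example. A
comment and a PI after the root element still belong to the same
document, while a second root element does not.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Misc* after the root element: still one document.
trailing_misc = "<a/><!-- still the same document --><?app flush?>"
# Two root elements: a stacked document, rejected by the parser.
stacked = "<a/><b/>"

if __name__ == "__main__":
    print(is_well_formed(trailing_misc))
    print(is_well_formed(stacked))
```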

> and by default it makes sense to complain if there's
> extra characters afterward, but there should be a way to tell libxml
> to ignore it.  I'm not trying to claim my entire stream is a valid XML
> document,

  Stop using the wrong term: in an XML context, valid means that the
document passes DTD validation. You mean well-formed here ...

> I only claim each of the documents in the stream is valid,
> and it would be nice to have better support for the situation.

  Since you're in research, I would suggest you read the 2 specifications
governing this:
    XML-1.0 spec http://www.w3.org/TR/REC-xml/
    XMPP  http://tools.ietf.org/html/rfc3920 especially section 4

> The direct workaround, to make the IO read callback duplicate the
> parsing functionality of looking for the close tag for the root node,
> is error prone.  libxml is already doing this, I shouldn't have to
> reimplement this functionality myself.

  That's wrong; it would mean mixing layers. Either carry the length
of each document as part of your protocol, or provide a marker which is
not a legal XML char, or do it the Jabber way. But you will have
to tell the parser where the XML document(s) end.
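
  A rough sketch of the first option (carrying the length in the
protocol), again in Python with the standard library rather than
libxml2. The wire format is an assumption made up for the example: a
4-byte big-endian length before each document. The names frame,
read_exact and read_documents are hypothetical too. The framing layer
decides where each document ends, and the parser only ever sees one
complete document.

```python
import io
import struct
import xml.etree.ElementTree as ET

# Assumed wire format: 4-byte unsigned big-endian length, then the document.
HEADER = struct.Struct("!I")


def frame(document: bytes) -> bytes:
    """Prefix one serialized XML document with its length."""
    return HEADER.pack(len(document)) + document


def read_exact(stream, size):
    """Read exactly `size` bytes; raise EOFError if the stream ends early."""
    data = b""
    while len(data) < size:
        part = stream.read(size - len(data))
        if not part:
            raise EOFError("stream ended inside a frame")
        data += part
    return data


def read_documents(stream):
    """Yield the parsed root element of each framed document in `stream`."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream between two frames
        if len(header) < HEADER.size:
            header += read_exact(stream, HEADER.size - len(header))
        (length,) = HEADER.unpack(header)
        yield ET.fromstring(read_exact(stream, length))


if __name__ == "__main__":
    wire = frame(b"<ping seq='1'/>") + frame(b"<pong seq='1'>ok</pong>")
    for root in read_documents(io.BytesIO(wire)):
        print(root.tag, root.get("seq"), root.text)
```

With a real connection, any binary file-like object would do in place of
io.BytesIO, for instance what socket.makefile("rb") returns.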

> If nothing else, since you already have the save-as-fragment
> functionality, it's odd you don't also have the load-as-fragment... this
> situation also arises if you know you have some XML embedded in something
> else (maybe more XML, maybe not) and you just want to parse just that
> chunk for efficiency.

  See the other mail; libxml2 does provide such routines.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/