On Mon, Mar 29, 2010 at 09:21:56PM -0400, Ethan Tira-Thompson wrote:
> Thanks for all the information, I'll try to collate things :)
> > Failure to do so would just make the parser non-conformant to the XML-1.0 
> > specification.
> 
> Are you sure about this?  Like I said, I'm not aware of the
> specification that it must be an error if more data follows the document.
> The spec does define that this extra data is not part of the document,
> but AFAIK not what you should do with/about it.  It would better serve
> interoperability to simply ignore it and let the user decide if it's an
> issue, probably issuing a warning by default.  But I'm no expert on the
> spec, it would be educational if you could point me to the section.

  You have things backward; read the spec:

http://www.w3.org/TR/REC-xml/#sec-documents

"Each XML document has both a logical and a physical structure. 
Physically, the document is composed of units called entities. An
entity may refer to other entities to cause their inclusion in the
document."

An entity is basically a file. In your case there is only one entity
as you are not loading any external entity.

Now comes the definition of Well-Formed XML Documents

http://www.w3.org/TR/REC-xml/#sec-well-formed

"
[Definition: A textual object is a well-formed XML document if:] 

1. Taken as a whole, it matches the production labeled document.

2. It meets all the well-formedness constraints given in
this specification.

3. Each of the parsed entities which is
referenced directly or indirectly within the
document is well-formed.

[1]     document       ::=       prolog  element  Misc*
"

So the definition works like this: you give the processor a textual
object, and it tells you whether that object is well-formed.

In that case you feed it the entity content, and the processor parses
it. If it finds a second root element, you get a fatal error and the
*whole* is not a well-formed document.
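
As a rough illustration of that rule (not libxml2 itself: this sketch
assumes Python's standard-library xml.etree.ElementTree parser, and the
helper name is_well_formed is made up for the example), a second root
element makes the whole input fail, while trailing comments and
whitespace are accepted because they match Misc*:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the whole text parses as one XML document.

    Hypothetical helper for illustration; it relies on the standard
    library parser raising ParseError on any well-formedness error.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# prolog (empty here), one root element, nothing after it
single_root = "<a><b/></a>"

# A second root element after the first: does not match
# document ::= prolog element Misc*
two_roots = "<a><b/></a><c/>"

# Comments and whitespace after the root are Misc, so this is allowed
trailing_misc = "<a/><!-- trailing comment -->\n"

# Character data after the root is not Misc, so this is rejected
trailing_text = "<a/>extra data"

if __name__ == "__main__":
    for sample in (single_root, two_roots, trailing_misc, trailing_text):
        print(repr(sample), is_well_formed(sample))
```

The parser does not return the first element and ignore the rest; the
input is judged as one unit.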

You can come up with all sorts of theories about how the processor could
just ignore things or stop at a given point, but that is not how the spec
says an XML processor must be implemented. You will note the

  "taken as a whole"

clearly indicating that it is absolutely forbidden to stop applying the
rules at some point.
You feed the XML parser what the entity (or entities) contain and it
returns a result. If there is an error in the middle or at the end, it
invalidates the whole document.
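
The same point can be sketched with incremental feeding. This again
assumes Python's standard-library xml.etree.ElementTree.XMLParser rather
than libxml2, and parse_chunks is a hypothetical helper: even when the
early chunks parse cleanly, an error in a later chunk or at the end
means no document is returned at all.

```python
import xml.etree.ElementTree as ET


def parse_chunks(chunks):
    """Feed the entity content piece by piece.

    Returns the root element only if the input, taken as a whole, is
    well-formed; returns None if an error occurs at any point.
    """
    parser = ET.XMLParser()
    try:
        for chunk in chunks:
            parser.feed(chunk)
        # close() signals end of input, so truncated documents fail here
        return parser.close()
    except ET.ParseError:
        return None


if __name__ == "__main__":
    # Well-formed as a whole
    print(parse_chunks(["<a>", "<b/>", "</a>"]))
    # Error at the end: second root element
    print(parse_chunks(["<a><b/></a>", "<c/>"]))
    # Error in the middle: mismatched end tag
    print(parse_chunks(["<a><b>", "</a>"]))
    # Error at end of input: root element never closed
    print(parse_chunks(["<a><b/>"]))
```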

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/