Re: [Sax-devel] Re: Xerces1,2 proposals

Glenn Marcy Tue, 27 Nov 2001 13:16:42 -0800

Here are some of the issues I see, after only a quick glance, with the
sax2-r2pre2 changes.  Most are fairly minor, but a few concern me.

Issue 1:

  The JavaDoc for DTDHandler has an @link to
  org.xml.sax.ext.LexicalHandler.
  This would seem to imply that there would be a broken link if there
  were a core-only SAX2 package with docs.  Usually extensions reference
  the thing they extend, not vice-versa.

Issue 2:

In DTDHandler#notationDecl, the comments were changed from:

<      * <p>It is up to the application to record the notation for later
<      * reference, if necessary.</p>
<      *
<      * <p>At least one of publicId and systemId must be non-null.
<      * If a system identifier is present, and it is a URL, the SAX
<      * parser must resolve it fully before passing it to the
<      * application through this event.</p>

to:

>      * <p>It is up to the application to record the notation for later
>      * reference, if necessary;
>      * notations may appear as attribute values and in unparsed entity
>      * declarations, and are sometime used with processing instruction
>      * target names.
>      * When a system identifier is present, applications are responsible
>      * for knowing if it is used as a URL, and absolutizing it against
>      * the appropriate URI when appropriate.
>      * That base URI is available from {@link Locator#getSystemId} during
>      * this callback, assuming the parser provides a Locator.</p>
>      *
>      * <p>At least one of publicId and systemId must be non-null. </p>

Issue 2a:

The behavior of the SAX Parser w.r.t. processing the systemId has been
changed to something completely different.  While I appreciate that the
change is toward a behavior that I consider more appropriate, the fact
remains that this is not what the docs, i.e. spec, used to say.
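To make the new wording concrete, here is a minimal sketch of what an
application would now have to do for itself.  The class and method names
are hypothetical, not part of SAX, and it assumes that both identifiers
are syntactically valid URIs.

```java
import java.net.URI;

// Hypothetical helper, not part of SAX: shows the absolutizing step that
// the revised notationDecl wording leaves to the application.
class NotationSystemIdResolver {

    // baseSystemId is assumed to come from Locator#getSystemId();
    // systemId is the value passed to DTDHandler#notationDecl.
    static String absolutize(String baseSystemId, String systemId) {
        if (systemId == null) {
            return null;            // notation was declared with a public ID only
        }
        if (baseSystemId == null) {
            return systemId;        // no base known, nothing to resolve against
        }
        // URI.create throws IllegalArgumentException for strings that are
        // not valid URIs (for example, ones containing spaces).
        return URI.create(baseSystemId).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        System.out.println(
            absolutize("http://example.org/dtds/doc.dtd", "notations/gif.txt"));
    }
}
```

Under the old wording the parser performed this step before the callback;
under the new wording the application does, and only when it knows that
the identifier is being used as a URL.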

Issue 2b:

ContentHandler#setDocumentLocator says:

>      * <p>Note that the locator will return correct information only
>      * during the invocation of the events in this interface.  The
>      * application should not attempt to use it at any other time.</p>

The addition above to DTDHandler#notationDecl of:

>      * That base URI is available from {@link Locator#getSystemId} during
>      * this callback, assuming the parser provides a Locator.</p>

contradicts the long-standing, i.e. back to DocumentHandler, restriction
that a Locator is only valid during DocumentHandler/ContentHandler
callbacks.
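For anyone who wants to see what a particular parser does here, the
following is a rough sketch that records whatever the Locator returns
during notationDecl.  The class names, the sample document and the system
identifiers are made up for illustration, and it assumes a JAXP SAX parser
is available.  Whether the recorded value means anything is exactly the
point in dispute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical probe: consults the Locator from inside a DTDHandler callback.
class LocatorInNotationDecl {

    static final class Recorder extends DefaultHandler {
        private Locator locator;                       // may remain null
        final List<String> notationNames = new ArrayList<>();
        final List<String> baseIdsSeen = new ArrayList<>();

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void notationDecl(String name, String publicId, String systemId) {
            notationNames.add(name);
            // The old text only promised a valid Locator during
            // ContentHandler events, so treat this value as unverified.
            baseIdsSeen.add(locator == null ? null : locator.getSystemId());
        }
    }

    static Recorder run() {
        String xml = "<!DOCTYPE doc [\n"
                + "<!NOTATION gif SYSTEM \"http://example.org/gif-viewer\">\n"
                + "<!ELEMENT doc EMPTY>\n"
                + "]>\n"
                + "<doc/>";
        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId("http://example.org/docs/sample.xml");
        Recorder recorder = new Recorder();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, recorder);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return recorder;
    }

    public static void main(String[] args) {
        Recorder r = run();
        System.out.println(r.notationNames + " " + r.baseIdsSeen);
    }
}
```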

Issue 3:

The InputSource JavaDoc was changed from:

<  * <p>The SAX parser will use the InputSource object to determine how
<  * to read XML input.  If there is a character stream available, the
<  * parser will read that stream directly; if not, the parser will use
<  * a byte stream, if available; if neither a character stream nor a
<  * byte stream is available, the parser will attempt to open a URI
<  * connection to the resource identified by the system
<  * identifier.</p>
<  *
<  * <p>An InputSource object belongs to the application: the SAX parser
<  * shall never modify it in any way (it may modify a copy if
<  * necessary).</p>

to:

>  * <p>The SAX parser will use the InputSource object to determine how
>  * to read XML input.  If there is a character stream available, the
>  * parser will read that stream directly, ignoring any text encoding
>  * declaration found in that stream as well as any encoding specified
>  * in the InputSource.  If there is no character stream, but there is
>  * a byte stream, the parser will use that byte stream, using the
>  * encoding specified in the InputSource or else (if no encoding is
>  * specified) autodetecting the character encoding using an algorithm
>  * such as the one in the XML specification.  If neither a character
>  * stream nor a
>  * byte stream is available, the parser will attempt to open a URI
>  * connection to the resource identified by the system
>  * identifier.</p>
>  *
>  * <p>An InputSource object belongs to the application: the SAX parser
>  * shall never modify it in any way (it may modify a copy if
>  * necessary).  However, standard processing of both byte and
>  * character streams is to close them on as part of end-of-parse cleanup,
>  * so applications should not attempt to re-use such streams after they
>  * have been handed to a parser.  </p>

Issue 3a:

The phrase "ignoring ... any encoding specified in the InputSource." is
not relevant.  It gives the false impression that the encoding should
not be set on the InputSource when a character stream, i.e. a Reader,
is used.  Setting it is in fact the correct way for the application to
pass the "actualEncoding" property to the parser, since the application
might be passing a byte stream wrapped by an InputStreamReader to the
parser as a character stream, in which case the actual encoding is
otherwise not available.
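A small sketch of the usage I have in mind follows.  The class and method
names are hypothetical; the InputSource and InputStreamReader calls are
the standard ones.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

// Hypothetical helper: the application decodes the bytes itself and still
// records which encoding it used on the InputSource.
class ReaderWithEncodingHint {

    static InputSource wrap(byte[] bytes, Charset charset) {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset);
        InputSource source = new InputSource(reader);
        // The parser does not need this to decode the Reader, but without
        // it the actual encoding is not available to the parser at all.
        source.setEncoding(charset.name());
        return source;
    }

    public static void main(String[] args) {
        byte[] doc = "<doc/>".getBytes(StandardCharsets.ISO_8859_1);
        InputSource source = wrap(doc, StandardCharsets.ISO_8859_1);
        System.out.println(source.getEncoding());
    }
}
```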

Issue 3b:

The comment about "standard processing of both byte and character streams"
is nothing of the sort.  Assigning some sort of legitimacy to such behavior
is ridiculous at this point.  We have had occasional bug reports for years
when someone finds the Xerces parser closing their streams and they have
always been accepted as bugs.  To add a comment allowing for a behavior
that has always been implied to be disallowed is a significant change.
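Until this is settled, an application that must keep its stream open can
protect itself whatever a given parser does.  The following is only a
sketch of that workaround, with hypothetical class names, and assumes a
JAXP SAX parser is available.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: shield an application-owned stream from close().
class CloseShieldDemo {

    // Forwards reads to the wrapped stream but swallows close().
    static final class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // intentionally empty: the application closes the real stream
        }
    }

    // Stand-in for the application's stream; records whether close() reached it.
    static final class TrackingInputStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInputStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Parses xml through the shield and reports whether the underlying
    // stream saw a close() call.
    static boolean underlyingClosedAfterParse(String xml) {
        TrackingInputStream tracked =
                new TrackingInputStream(xml.getBytes(StandardCharsets.UTF_8));
        InputSource source = new InputSource(new NonClosingInputStream(tracked));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(source, new DefaultHandler());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return tracked.closed;
    }

    public static void main(String[] args) {
        System.out.println(underlyingClosedAfterParse("<doc/>"));
    }
}
```

The shield works the same way whether or not the parser calls close(),
which is why the application should not have to depend on either
behavior.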

Regards,
Glenn



                                                                                       
                   
David Brownell <david-b@pacbell.net>
11/26/2001 08:25 PM
Please respond to xerces-j-dev

To:      [EMAIL PROTECTED]
cc:      [EMAIL PROTECTED]
Subject: Re: [Sax-devel] Re: Xerces1,2 proposals



Right, [EMAIL PROTECTED] is where such issues should
be raised.  So far I've not heard a peep from Xerces folk, and it was
on my list to find out why that's so.

The current stuff is "sax2 r2pre3", and as Edwin said this is intended
to be "no semantic changes".  It's more recent than the "r2pre1" code
(with changes) that Xerces now bundles, and includes fixes for bugs
that folk have reported -- some of which concern the documentation, in
most cases clarifications that were needed.

Note that there are also some JUNIT tests I sent out a few weeks
back, testing against the behavior that's long been documented.
Xerces didn't do as well there as I'd have hoped.  I put those tests
onto the http://xmlconf.sourceforge.net/ download site since they're
GPL'd, not public domain.  The tests are still out for comment, but
the main issues I noticed with Xerces (1.4.3) are clear bugs.

- Dave

p.s. I managed to dig up archives of xerces-j-dev and was glad to
    notice that the "incompatible change" was for schema defaulting,
    not the SAX support!


----- Original Message -----
From: "Edwin Goei" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, November 26, 2001 4:33 PM
Subject: [Sax-devel] Re: Xerces1,2 proposals: one w/ potential incompatible change


> Arnaud Le Hors wrote:
> >
> > Although I don't have the specific details handy, some of the changes
> > introduced in the version of SAX available on sourceforge are
> > controversial.  Not only some bugs in the code have been fixed, the
> > javadoc which serves as the "spec" has been changed in ways that
> > change the semantics of the methods. I would therefore stay away from
> > embracing this beta release for now, until these issues get addressed.
> > I have no pb patching known bugs in the code though.
>
> My understanding is that the javadoc changes in the SAX code you are
> referring to are clarifications and not intended to be semantic changes to
> the SAX spec.  I would encourage those who have issues with the changes
> to discuss this on the [EMAIL PROTECTED] mailing list.  See
> http://saxproject.org for more info.  Also, I believe that sax2r2 is
> supposed to be a bugfix release of SAX 2.0 and not one that includes any
> new features.  There is a separate effort for a new feature release of
> SAX.
>
> -Edwin
>
> _______________________________________________
> sax-devel mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/sax-devel


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





