----- Original Message -----
From: "Andy Clark" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 02, 2001 11:42 PM
Subject: [Xerces2] Design Decisions (LONG)
> This list is eerily quiet these days with regard to Xerces2
> design and development. Hmmm... Well, I'm going to raise a
> few more design points and then make some unilateral
> decisions unless there's some discussion or objections.
>
> [1] DTD Handler Interfaces
>
> We've had some discussion but no resolution on what is to
> become of the XMLDTDHandler and XMLDTDContentModelHandler
> interfaces. Obviously the right balance of information and
> usefulness is beyond our reach.
>
> I'm now leaning towards Glenn's earlier suggestion that we
> can provide the DTD information needed by DTD editor writers
> (arguably a very small percentage of the parser user base)
> via the SAX xml-string property. I'm not suggesting that
> we *will* implement this before rolling out Xerces2, merely
> that we *can* support communicating more information in
> the future through this (or a similar) mechanism.
>
> I'm still torn between 1 and 2 interfaces so I'm gonna
> stay with 2 separate interfaces. However, I would make
> the following changes to the XMLDTDContentModelHandler
> interface, in an attempt to provide a better callback
> interleaving with start/endEntity.
>
> public interface XMLDTDContentModelHandler {
>
>     public static final short OCCURS_ZERO_OR_MORE = 0;
>     public static final short OCCURS_ZERO_OR_ONE = 1;
>     public static final short OCCURS_ONE_OR_MORE = 2;
>
>     public static final short SEPARATOR_CHOICE = 3;
>     public static final short SEPARATOR_SEQUENCE = 4;
>
>     public void startContentModel(String elementName) // *
>         throws XNIException;
>
>     public void any() throws XNIException; // +
>     public void empty() throws XNIException; // +
>
>     public void startGroup() throws XNIException; // rename
>     public void pcdata() throws XNIException; // +
>     public void element(String name) throws XNIException; // rename
>     public void occurs(short occurs) throws XNIException; // rename
>     public void separator(short separator) throws XNIException; // rename
>     public void endGroup() throws XNIException; // rename
>
>     public void endContentModel() throws XNIException;
>
> } // interface XMLDTDContentModelHandler
-0 I still think that editor writing is out of the scope of the parser,
but I may be the only one who thinks this.
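
To make the proposed callbacks concrete, here is a minimal sketch of a
handler that rebuilds an element declaration from the events. It assumes
the scanner calls occurs() after the endGroup() of the group it modifies,
which the proposal does not spell out. ContentModelPrinter and demo() are
hypothetical names, and the methods drop the XNIException clause so that
the example compiles on its own.

```java
public class ContentModelPrinter {

    // Constants copied from the proposed XMLDTDContentModelHandler.
    public static final short OCCURS_ZERO_OR_MORE = 0;
    public static final short OCCURS_ZERO_OR_ONE = 1;
    public static final short OCCURS_ONE_OR_MORE = 2;
    public static final short SEPARATOR_CHOICE = 3;
    public static final short SEPARATOR_SEQUENCE = 4;

    private final StringBuilder buf = new StringBuilder();

    public void startContentModel(String elementName) {
        buf.setLength(0);
        buf.append("<!ELEMENT ").append(elementName).append(' ');
    }

    public void any() { buf.append("ANY"); }
    public void empty() { buf.append("EMPTY"); }
    public void startGroup() { buf.append('('); }
    public void pcdata() { buf.append("#PCDATA"); }
    public void element(String name) { buf.append(name); }

    public void occurs(short occurs) {
        switch (occurs) {
            case OCCURS_ZERO_OR_MORE: buf.append('*'); break;
            case OCCURS_ZERO_OR_ONE: buf.append('?'); break;
            case OCCURS_ONE_OR_MORE: buf.append('+'); break;
            default: throw new IllegalArgumentException("occurs: " + occurs);
        }
    }

    public void separator(short separator) {
        buf.append(separator == SEPARATOR_CHOICE ? '|' : ',');
    }

    public void endGroup() { buf.append(')'); }
    public void endContentModel() { buf.append('>'); }

    public String result() { return buf.toString(); }

    // Assumed event order for: <!ELEMENT list (a|b)*>
    public static String demo() {
        ContentModelPrinter h = new ContentModelPrinter();
        h.startContentModel("list");
        h.startGroup();
        h.element("a");
        h.separator(SEPARATOR_CHOICE);
        h.element("b");
        h.endGroup();
        h.occurs(OCCURS_ZERO_OR_MORE);
        h.endContentModel();
        return h.result();
    }
}
```

A DTD editor would want more than this (whitespace, comments, parameter
entity boundaries), which is the information the xml-string idea is
meant to cover.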
> [2] Pass Base URI to startEntity Callback
>
> In the continuing effort to pass as much information via
> XNI as possible, I would suggest passing the base systemId
> when calling the startEntity method in all of the handlers
> that define this method.
>
> Therefore, the method would have the following prototype:
>
> public void startEntity(String name,
>                         String publicId, String systemId,
>                         String baseSystemId, // +
>                         String detectedEncoding)
>     throws XNIException;
>
> For convenience, would it be useful to also pass the
> expanded systemId?
We should only pass the expanded systemId if there's no way to construct
it from the rest of the data -- which there is in this case.
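
As a rough sketch of that construction: given systemId and baseSystemId,
a handler can derive the expanded systemId with java.net.URI. The class
and method names (SystemIdExpander, expand) are made up for illustration,
and the sketch assumes both identifiers are syntactically valid URIs.

```java
import java.net.URI;

public class SystemIdExpander {

    /**
     * Resolves systemId against baseSystemId. An absolute systemId, or a
     * missing base, leaves the systemId unchanged.
     */
    public static String expand(String systemId, String baseSystemId) {
        URI id = URI.create(systemId);
        if (id.isAbsolute() || baseSystemId == null) {
            return id.toString();
        }
        return URI.create(baseSystemId).resolve(id).toString();
    }
}
```

For example, expand("chapters/ch1.xml", "http://example.org/docs/book.xml")
gives "http://example.org/docs/chapters/ch1.xml".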
> [3] More Parser Interfaces
>
> One of the problems I've run into (and I've mentioned
> this before but I'll describe it again) is that while
> we have a way to construct parser pipelines, we don't
> have a way of actually initiating the pipeline in a
> generic fashion.
>
> In short, as Ted suggested, it would be cool if we
> could define a pipeline dynamically. We can do most of
> this today *except* for actually telling the scanner to
> start scanning the input. Without this we can't swap
> scanners arbitrarily. And I want/need to be able to
> do this! I don't want to have to rewrite an entire
> parser configuration just to change from the fully
> conformant scanner to a stripped-down "lite" scanner.
>
> Which means that I need to define a new interface for
> the document and DTD scanners. So here's a thought:
>
> public interface XMLDocumentScanner {
>     public void startDocument(InputSource source)
>         throws IOException;
>     public boolean scanDocument(boolean complete)
>         throws XNIException, IOException;
> }
>
> public interface XMLDTDScanner {
>
>     public void startInternalSubset(InputSource source)
>         throws IOException;
>     public boolean scanInternalSubset(boolean complete,
>                                       boolean standalone,
>                                       boolean hasExternalDTD)
>         throws XNIException, IOException;
>
>     public void startExternalSubset(InputSource source)
>         throws IOException;
>     public boolean scanExternalSubset(boolean complete)
>         throws XNIException, IOException;
>
> } // interface XMLDTDScanner
>
> Right now the reference implementation of the scanners
> depends on having an XMLEntityManager to handle the
> entities and accessing an entity scanner capable of
> tokenizing the lowest level input. So there is no
> need to pass the input source to the scanners because
> they get them from the entity manager. Of course,
> making this kind of change impacts the scanners
> directly. And I still don't have my head around how
> this changes things.
Can we break this out as a separate discussion? I'd really like to
see this work, and I think I have some time to put towards making
this work.
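
As a starting point for that discussion, here is a small sketch of what
"initiating the pipeline in a generic fashion" could look like. It rests
on assumptions the proposal does not state: that scanDocument() returns
true while more input may remain, and that complete == false means "scan
one chunk and return". CountingScanner is a hypothetical stand-in for a
"lite" scanner, and XNIException is left out so the example compiles
with only the JDK.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;

public class ScannerSwapSketch {

    /** Mirrors the proposed interface, minus XNIException. */
    public interface XMLDocumentScanner {
        void startDocument(InputSource source) throws IOException;
        boolean scanDocument(boolean complete) throws IOException;
    }

    /** Hypothetical stripped-down scanner that only counts '<' characters. */
    public static class CountingScanner implements XMLDocumentScanner {
        private Reader in;
        public int count;

        public void startDocument(InputSource source) {
            in = source.getCharacterStream();
            count = 0;
        }

        public boolean scanDocument(boolean complete) throws IOException {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '<') {
                    count++;
                    if (!complete) {
                        return true; // assumed meaning: more input may remain
                    }
                }
            }
            return false; // end of input reached
        }
    }

    /** Generic driver: knows only the interface, not the scanner class. */
    public static void run(XMLDocumentScanner scanner, InputSource source) {
        try {
            scanner.startDocument(source);
            while (scanner.scanDocument(false)) {
                // an application could do other work between chunks
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static int countMarkup(String xml) {
        CountingScanner scanner = new CountingScanner();
        run(scanner, new InputSource(new StringReader(xml)));
        return scanner.count;
    }
}
```

The point is only that run() would not change if a fully conformant
scanner were passed in instead; how the entity manager fits into this
is the open question.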
> [4] Remove Dependence on SAX
>
> I kept the most controversial 'til last so that I only
> get feedback from the real hardcore contributors. Or
> from people who have nothing better to do than to read
> really long posts on the mailing list... :)
>
> Seriously, a point was raised at the Xerces2 Workshop: why
> *do* we use the SAX stuff but only where appropriate? Why
> not either extend SAX or just remove all dependence on SAX
> and make a completely standalone set of interfaces?
> Extending SAX raises a lot of problems that I won't go
> into here, but suffice it to say that it's not the
> best option.
>
> Since the Workshop I've been starting to agree with the
> argument and want to remove dependence on all other APIs
> from XNI. Therefore, I would suggest that we remove all
> use of SAX throughout XNI. That means that we would
> invent our own entity resolver, input source, etc. We
> can make them look an awful lot like SAX so that it
> bridges the learning gap between the two, though.
>
> Diverging from SAX also lets us fix the problems that
> SAX has. For example, there was a recent suggestion on
> the xml-dev mailing list about extending SAX in the
> future so that the entity resolver is passed the base
> URI as well to allow the resolver to do more. I'm all
> for this parameter. And I'm sure that there are more
> instances like this.
>
> So here's a proposal (in an abbreviated form):
>
> public class XMLInputSource {
>
>     protected String fPublicId;
>     protected String fSystemId;
>     protected String fBaseSystemId; // +
>     protected String fExpandedSystemId; // +
>
>     protected String fEncoding;
>
>     protected InputStream fByteStream;
>     protected Reader fCharStream;
>
> } // class XMLInputSource
>
> public class XNIParseException extends XNIException {
>     protected XMLLocator fLocation;
> }
>
> public interface XMLLocator {
>
>     public String getPublicId();
>     public String getSystemId();
>     public String getBaseSystemId(); // +
>     public String getExpandedSystemId(); // +
>
>     public int getLineNumber();
>     public int getColumnNumber();
>
> } // interface XMLLocator
>
> public interface XMLErrorHandler {
>
>     public void warning(XNIParseException ex) throws XNIException;
>     public void error(XNIParseException ex) throws XNIException;
>     public void fatalError(XNIParseException ex) throws XNIException;
>
> } // interface XMLErrorHandler
>
> public interface XMLEntityResolver {
>     public XMLInputSource resolveEntity(String publicId,
>                                         String systemId,
>                                         String baseSystemId) // +
>         throws XNIException, IOException;
> }
>
> Am I missing anything?
+0. Cleanliness on this is okay with me.
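
For what it's worth, here is a rough sketch of what the extra
baseSystemId parameter buys a resolver. Everything in it is illustrative
rather than part of the proposal: XMLInputSource is cut down to four
public fields (no streams or encoding), the catalog map and the map()
helper are hypothetical, and the exception clauses are dropped.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class BaseAwareResolverSketch {

    /** Cut-down stand-in for the proposed XMLInputSource. */
    public static class XMLInputSource {
        public final String publicId;
        public final String systemId;
        public final String baseSystemId;
        public final String expandedSystemId;

        public XMLInputSource(String publicId, String systemId,
                              String baseSystemId, String expandedSystemId) {
            this.publicId = publicId;
            this.systemId = systemId;
            this.baseSystemId = baseSystemId;
            this.expandedSystemId = expandedSystemId;
        }
    }

    /** Hypothetical catalog: public identifier -> local replacement URI. */
    private final Map<String, String> catalog = new HashMap<>();

    public void map(String publicId, String localUri) {
        catalog.put(publicId, localUri);
    }

    public XMLInputSource resolveEntity(String publicId,
                                        String systemId,
                                        String baseSystemId) {
        String mapped = (publicId == null) ? null : catalog.get(publicId);
        String expanded;
        if (mapped != null) {
            expanded = mapped; // a catalog hit wins
        } else if (baseSystemId != null) {
            // Without the base, a relative systemId could not be located.
            expanded = URI.create(baseSystemId).resolve(systemId).toString();
        } else {
            expanded = systemId;
        }
        return new XMLInputSource(publicId, systemId, baseSystemId, expanded);
    }
}
```

With the SAX signature (publicId and systemId only), the resolver has to
rely on the parser having already absolutized the systemId.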
> [*] Start the Discussion!
>
> Here's where you add your 2 cents (or yen, etc). I'll keep
> this topic open throughout this week and then, barring any
> problems, I'll start enacting the changes next week.
>
> --
> Andy Clark * IBM, TRL - Japan * [EMAIL PROTECTED]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>