RE: Empty tag handling question

Paul Norris 25 Aug 2003 00:26:32 -0000

Hi Benson,

The answer wasn't so complex, but I'm afraid it holds no joy for you.


startElement is called once the > character is processed.  All attributes 
within the tag are processed before startElement is called so that a 
complete list can be passed to that method.

An ErrorHandler method (i.e. warning, error or fatalError) is called as soon 
as the exception is encountered.  Thus, if the exception occurs outside the 
element tag (i.e. after the >), then startElement will be called before the 
exception method.  If the exception occurs inside the tag, then the exception 
method will be called before startElement.

The real stopper is that neither SAXException nor SAXParseException provides 
enough information for you to tell whether the exception occurred inside an 
element tag or outside it.  You cannot distinguish between this:
<OkayElement>
  Content causes error
  <SecondOkayElement/>
</OkayElement>

and this:
<OkayElement>
  <ElementCausesError/>
</OkayElement>

In both cases, you will get the following sequence of events:
1) startElement for OkayElement
2) exception
3) startElement for the inner tag (SecondOkayElement or ElementCausesError)
4) endElement for the inner tag
5) endElement for OkayElement
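
If you want to watch the sequence above for yourself, a small recording handler is enough.  The following is only a sketch: EventOrderRecorder and its record method are hypothetical names made up for this example, and the validating flag only switches on DTD validation through SAXParserFactory.  Schema validation needs extra parser configuration that is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Xerces): records handler callbacks
// in the order the parser makes them.
class EventOrderRecorder extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void warning(SAXParseException e) {
        events.add(describe("warning", e));
    }

    @Override
    public void error(SAXParseException e) {
        events.add(describe("error", e));
    }

    // fatalError is not overridden: DefaultHandler rethrows it, which ends the parse.

    // Line, column and message are about all the context the exception carries.
    private static String describe(String kind, SAXParseException e) {
        return kind + " at " + e.getLineNumber() + ":" + e.getColumnNumber()
                + " " + e.getMessage();
    }

    // Parses the given XML text and returns the recorded callback sequence.
    static List<String> record(String xml, boolean validating) {
        EventOrderRecorder recorder = new EventOrderRecorder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating); // DTD validation only
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), recorder);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return recorder.events;
    }
}
```

Printing the list returned by record (or the events field after your own schema-validating parse) shows where each error callback falls relative to startElement and endElement.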

An ugly work-around would be to use the following approach:
1) Use a document locator to save the position in the document of every 
event.  Note that a document locator returns the END position of each piece 
of content that causes an event.  If you want the beginning of an event 
(and you will), you must use the end position of the previous event (plus 
one).
2) When an exception occurs, save the exception object (it will be a 
SAXParseException).
3) When you get a startElement event and the exception object is not null, 
compare the location of the tag with the location of the exception object.
4) If the exception is before the element tag then it belongs to the parent 
element, otherwise it belongs to the presently starting element.
5) You should set the exception object back to null once you have assigned 
it to an element.
6) You should check the exception object in the endElement event and if it 
hasn't yet been allocated then it belongs to the presently ending element.
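
The steps above could be sketched roughly as follows.  This is an illustration only, not tested against Xerces: ErrorAttributingHandler and its fields are hypothetical names, it keeps only the most recent unassigned exception, and it assumes the line and column in the SAXParseException can be compared directly with the positions reported by the locator.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler illustrating the work-around; names are not from Xerces.
class ErrorAttributingHandler extends DefaultHandler {
    private Locator locator;
    private int prevLine = 1;                        // end of the previous content event
    private int prevCol = 1;
    private SAXParseException pending;               // step 2: saved exception
    private final List<String> open = new ArrayList<>();
    final List<String> report = new ArrayList<>();   // "element path: message"

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    // Step 1: remember where the current event ended.
    private void markEnd() {
        if (locator != null) {
            prevLine = locator.getLineNumber();
            prevCol = locator.getColumnNumber();
        }
    }

    private String path() {
        return "/" + String.join("/", open);
    }

    // Step 5: once assigned, clear the saved exception.
    private void assign(String elementPath) {
        report.add(elementPath + ": " + pending.getMessage());
        pending = null;
    }

    @Override
    public void error(SAXParseException e) {
        pending = e; // assumption: a newer error replaces an unassigned older one
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String parentPath = path();
        open.add(qName);
        if (pending != null) {
            // Steps 3 and 4: the tag begins just after the previous event ended,
            // so an exception located past that point belongs to this element.
            boolean insideTag = pending.getLineNumber() > prevLine
                    || (pending.getLineNumber() == prevLine
                        && pending.getColumnNumber() > prevCol);
            assign(insideTag ? path() : parentPath);
        }
        markEnd();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        markEnd();
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        markEnd();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (pending != null) {
            assign(path()); // step 6: still unassigned, so it belongs here
        }
        open.remove(open.size() - 1);
        markEnd();
    }
}
```

After the parse, report holds one entry per attributed error.  The error callback deliberately does not update the saved position, since it is not a piece of document content.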

It might be worth your while putting in a feature request at Bugzilla. 
Turning around the timing of events would be very difficult and you're not 
likely to get it.  But having more information in the SAXParseException 
about the context of the error would make deciding where the error 
belongs much easier, and might get into a future release.

Cheers,
Paul.

-----Original Message-----
From:   Benson Cheng [SMTP:[EMAIL PROTECTED]
Sent:   Friday, August 22, 2003 4:29 AM
To:     [EMAIL PROTECTED]
Subject:        Empty tag handling question

Hi All,

I found that Xerces handles <EmptyTag></EmptyTag> and <EmptyTag/> 
differently when parsing with schema validation.  Assume this <EmptyTag> 
fails validation.

For <EmptyTag></EmptyTag> case, Xerces calls the handler in the following 
sequence:
startElement
error
endElement

But for <EmptyTag/> case, Xerces calls the handler in the following 
sequence:
error
startElement
endElement

I am not sure whether the latter case is correct or not.  Since my app builds 
the element path when there are errors, the latter case reports the wrong 
element path (the parent element).  Is this a bug?  BTW, I am using Xerces 
version 2.4.0.

thanks,
Benson.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
