I'm doing something a little odd (but fairly cool) -- I use the main ContentHandler to delegate calls to an array of ContentHandlers. Each of these sub handlers is allowed to modify the parameters of the call before they are sent to the next sub handler, and the end result is written to a new XML file. In order to send the proper characters to the second, third, etc. sub handlers, the full set of characters needs to be sent to the first sub handler first. Unfortunately I have mixed content, which does make things a bit more complicated. At the moment the dispatching handler collects text into a String (thanks for the CharArrayWriter suggestion, that will work much better) and then a characters(...) dispatching method is called when startElement or endElement is called.
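For illustration, here is a minimal sketch of that kind of buffering dispatcher, using the suggested CharArrayWriter instead of a String. The class name BufferingDispatcher and the flushText helper are hypothetical, not part of SAX or Crimson, and the sketch assumes that a sub handler "modifies" text by editing the shared char[] in place.

```java
import java.io.CharArrayWriter;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical dispatcher: collects every characters() chunk and hands
// the complete text run to each sub handler, in order, at the next tag.
class BufferingDispatcher extends DefaultHandler {
    private final List<ContentHandler> subHandlers;
    private final CharArrayWriter text = new CharArrayWriter();

    BufferingDispatcher(List<ContentHandler> subHandlers) {
        this.subHandlers = new ArrayList<>(subHandlers);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may split one text run across several calls,
        // so only accumulate here and dispatch later.
        text.write(ch, start, length);
    }

    // Sends the buffered text (if any) to every sub handler, then resets.
    private void flushText() throws SAXException {
        if (text.size() == 0) {
            return;
        }
        char[] buf = text.toCharArray();
        text.reset();
        for (ContentHandler h : subHandlers) {
            // Assumption: in-place edits to buf are seen by later handlers.
            h.characters(buf, 0, buf.length);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flushText(); // mixed content: text may precede a child element
        for (ContentHandler h : subHandlers) {
            h.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flushText();
        for (ContentHandler h : subHandlers) {
            h.endElement(uri, localName, qName);
        }
    }
}
```

Flushing in both startElement and endElement is what handles mixed content, since text can sit before a child element as well as before the closing tag. A sub handler that needs to change the length of the text would need a richer interface than plain ContentHandler; that part is left out of the sketch.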
The point here is that any efficiency the parser gains by being allowed to make multiple characters() calls for a single text string is completely wiped out by my having to re-buffer all that data. Your CharArrayWriter suggestion will make it a little more reasonable, but it's a lot of pain for dealing with a 1 time in a thousand occurrence, and a lot of extra calls for the other 999 times.

Don't let my annoyance on this one issue reflect too much on me, I actually like Crimson quite well. It's easily fast enough for my purposes, works well, and is easy to use and easy to extend/modify. In particular I like that all the components are laid out clearly, so if I want I can get a specific implementation class and use it. Being able to extend the XmlDocument to provide DOM building after some SAX processing is great. I looked at doing that with Xerces and I never did find a similarly easy method.

Kevin

Dane Foster wrote:

> The fact that for every start tag there must be an end tag makes it
> relatively easy to efficiently handle the text between tags. Unless of
> course, your XML document has mixed mode elements (text and markup between a
> start and an end tag). Mixed mode complicates the matter a little bit more.
> If you are not already doing so, I recommend that you use a CharArrayWriter
> to handle the capturing of the text in the characters method then convert
> the characters to a String in the endElement method. Do not forget to reset
> the CharArrayWriter after you have converted the characters to a String.
>
> Dane Foster
> Equity Technology Group, Inc
> http://www.equitytg.com.
> 954.360.9800
> ----- Original Message -----
> From: "Kevin Steppe" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, February 15, 2001 9:33 PM
> Subject: Re: Crimson bug?
>
> It's still a pain. It means that I can't do actual processing within the
> characters callback and have to wait until some other callback. For what
> I'm doing the ContentHandler actually -can't- wait and so I've had to
> rebuffer the character data. The end result is lower efficiency because of
> all the buffering and method calls I have to make to get around the
> 'efficiency' on the parser side.
>
> Kevin
>
> Edwin Goei wrote:
>
> > Kevin Steppe wrote:
> > >
> > > Well that's a pain, but at least it's a solid answer I can work with
> > > and rely on.
> >
> > It is for efficiency because it allows the parser to buffer data.
> >
> > -Edwin
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
