I thought this would be really handy when parsing from a continuous buffer like a MemBufInputSource or a LocalFileInputSource. I have a situation where I SAX parse _very_ large XML instances looking for small repeating fragments. These fragments are operated on individually by building a DOM for each one and operating on its nodes in all sorts of application-defined ways.
If I had the functionality described by Ted, I could SAX the file, save off the starting and ending offsets of each fragment within the large document, and post that info to a thread pool to process the fragments asynchronously. In fact, I can use my Win32 memory-mapped file input source both to SAX the original large file and to serve as the source for the DOM parser during the per-work-item processing. The way I'm doing it now involves _way_ too many buffer copies to be really fast - but it could be.

Jim

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, April 23, 2002 1:10 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Re: how to access the raw text that generated a sax event
>
> The problem with using the "locator" is that it only reports
> line+column info. Byte offsets into the file would be more
> helpful for my purposes.
>
> -ted
>
> > From: "Joseph Kesselman/CAM/Lotus" <[EMAIL PROTECTED]>
> > Date: 2002/04/23 Tue AM 08:39:09 EDT
> > To: [EMAIL PROTECTED]
> > Subject: Re: how to access the raw text that generated a sax event
> >
> > Best suggestion I've got is to use the SAX "locator" to find the relevant
> > area of the document, then perform your own primitive parsing to extract a
> > moderately meaningful chunk thereof....
> >
> > ... but I suspect that's more work than simply using a single parser and
> > routing its SAX events to the appropriate (possibly user-supplied)
> > processor.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
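
To make the offset idea concrete, here is a rough sketch of the bookkeeping, assuming byte offsets were available. It uses plain standard C++ (C++17) and no parser API at all: `FragmentRange`, `find_fragments`, and `slice_fragments` are hypothetical names, and the naive tag search only stands in for offsets that a SAX pass would have to report. The point is that each fragment becomes a view into the original buffer rather than a copy.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical record: byte offsets of one fragment inside the big document.
struct FragmentRange {
    std::size_t begin;  // offset of the first byte of the fragment
    std::size_t end;    // offset one past the last byte of the fragment
};

// Stand-in for the SAX pass. A real handler would record these offsets if
// the parser exposed them; here a naive search for a fixed open/close tag
// pair plays that role (no nesting, attributes, or CDATA handling).
std::vector<FragmentRange> find_fragments(std::string_view doc,
                                          std::string_view open_tag,
                                          std::string_view close_tag) {
    std::vector<FragmentRange> ranges;
    std::size_t pos = 0;
    while ((pos = doc.find(open_tag, pos)) != std::string_view::npos) {
        std::size_t close = doc.find(close_tag, pos + open_tag.size());
        if (close == std::string_view::npos) break;  // unterminated fragment
        std::size_t end = close + close_tag.size();
        ranges.push_back({pos, end});
        pos = end;  // continue scanning after this fragment
    }
    return ranges;
}

// Turn the recorded ranges into work items. Each string_view aliases the
// original buffer, so no fragment bytes are copied; the buffer must stay
// alive for as long as any work item is in use.
std::vector<std::string_view> slice_fragments(
        std::string_view doc, const std::vector<FragmentRange>& ranges) {
    std::vector<std::string_view> items;
    items.reserve(ranges.size());
    for (const FragmentRange& r : ranges) {
        items.push_back(doc.substr(r.begin, r.end - r.begin));
    }
    return items;
}
```

Each resulting view is independent of the others, so it could be posted to a worker that builds a DOM from just those bytes. With a memory-mapped file, `doc` would simply be a view over the mapping.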
