I thought this would be really handy when parsing from a continuous buffer
like a MemBufInputSource or a LocalFileInputSource.  I have a situation
where I SAX parse _very_ large XML instances looking for small repeating
fragments.  Each fragment is then processed individually by building a DOM from
it and operating on its nodes in all sorts of application-defined ways.

If I had the functionality Ted describes, I could SAX the file and save
off the starting and ending byte offsets of each fragment within the large
document, then post that info to a thread pool to process the fragments
asynchronously.  In fact, I could use my Win32 memory-mapped file input source
both to SAX the original large file and to serve as the source for the DOM
parser during the per-work-item processing.  The way I'm doing it now involves
_way_ too many buffer copies to be really fast, but it could be.
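
To make the idea concrete, here is a rough sketch in plain C++17 with no
Xerces calls at all.  Everything in it is an assumption for illustration:
find_fragments() is only a naive stand-in for the SAX pass (a real handler
would get the offsets from the parser, which is exactly the feature being
asked for), process_fragment() stands in for the per-fragment DOM work, and
std::async stands in for a real thread pool.  The point is just that each work
item gets a view into the one big buffer instead of its own copy.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <string_view>
#include <vector>

// Half-open byte range [begin, end) of one fragment inside the big buffer.
struct FragmentSpan {
    std::size_t begin;
    std::size_t end;
};

// Stand-in for the SAX pass: a naive scan that records where each
// <tag>...</tag> fragment starts and ends. Assumes the fragments are not
// nested and the opening tag carries no attributes.
std::vector<FragmentSpan> find_fragments(std::string_view doc,
                                         const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    std::vector<FragmentSpan> spans;
    std::size_t pos = 0;
    while ((pos = doc.find(open, pos)) != std::string_view::npos) {
        const std::size_t stop = doc.find(close, pos);
        if (stop == std::string_view::npos) break;  // unterminated fragment
        spans.push_back({pos, stop + close.size()});
        pos = stop + close.size();
    }
    return spans;
}

// Hypothetical work item. A real one would hand the slice to a DOM parser;
// this one just returns the text between the fragment's outer tags.
std::string process_fragment(std::string_view fragment) {
    const std::size_t first = fragment.find('>') + 1;
    const std::size_t last = fragment.rfind('<');
    return std::string(fragment.substr(first, last - first));
}

// Run one asynchronous task per span. Each task receives a string_view into
// the original buffer, so no fragment bytes are copied before processing.
// All tasks are joined before returning, so the buffer only has to outlive
// this call.
std::vector<std::string> process_all(std::string_view doc,
                                     const std::vector<FragmentSpan>& spans) {
    std::vector<std::future<std::string>> pending;
    for (const FragmentSpan& s : spans) {
        assert(s.begin <= s.end && s.end <= doc.size());
        pending.push_back(std::async(std::launch::async, process_fragment,
                                     doc.substr(s.begin, s.end - s.begin)));
    }
    std::vector<std::string> results;
    for (auto& f : pending) results.push_back(f.get());
    return results;
}
```

Calling find_fragments(doc, "item") and then process_all(doc, spans) on a
buffer holding the whole document is the entire flow.  With a memory-mapped
file the string_view would simply point at the mapping.  On some toolchains
this needs -pthread at link time.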

Jim



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, April 23, 2002 1:10 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Re: how to access the raw text that generated a sax event
> 
> 
> 
> The problem with using the "locator" is that it only reports 
> line+column info.  Byte offsets into the file would be more 
> helpful for my purposes.
> 
> -ted
> 
> > 
> > From: "Joseph Kesselman/CAM/Lotus" <[EMAIL PROTECTED]>
> > Date: 2002/04/23 Tue AM 08:39:09 EDT
> > To: [EMAIL PROTECTED]
> > Subject: Re: how to access the raw text that generated a sax event
> > 
> > 
> > Best suggestion I've got is to use the SAX "locator" to find the relevant
> > area of the document, then perform your own primitive parsing to extract a
> > moderately meaningful chunk thereof....
> > 
> > ... but I suspect that's more work than simply using a single parser and
> > routing its SAX events to the appropriate (possibly user-supplied)
> > processor.
> > 
> > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > 
> > 
> 
> 
> 
