On Fri, 2004-10-01 at 09:04, Dan White wrote:
> The idea is to take advantage of SAX and its ability to process a
> stream of non-specific length and combine that with the data organization
> of DOM.

Are you saying that you need to handle an input stream containing
multiple XML documents? I needed to do this a couple of years ago and
asked about doing this in xerces-j. The answer was basically "no, and we
aren't interested in handling this". There was a valid point made that
document processing instructions can occur even after the end of the
document root element, though in practice this would be extremely rare.
As far as I am aware, you have two options: (a) handle the
startElement/endElement SAX events, keep track of the element depth,
and when it returns to zero somehow fudge the parser's input stream to
return EOF, or (b) use some external protocol that lets you detect the
boundaries between XML documents in the input stream (e.g. insist that
each one is packaged as a MIME part). I think (a) will work, though if
the parser does "read-ahead" it may already have consumed bytes
belonging to the next document, which could cause problems.
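
For what it's worth, the depth-tracking half of (a) might look roughly
like the sketch below. It uses only the standard JAXP/SAX classes; the
class name DepthTrackingHandler and its helper methods are made up for
illustration, and it deliberately stops short of the hard part (making
the input stream report EOF), which depends on how your parser buffers
its input.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Xerces): tracks element nesting depth
// so the caller can tell when the document root element has closed.
final class DepthTrackingHandler extends DefaultHandler {
    private int depth = 0;
    private int maxDepth = 0;
    private boolean rootClosed = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        if (depth > maxDepth) {
            maxDepth = depth;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
        if (depth == 0) {
            // The root element has just closed. A wrapper around the input
            // stream would have to start returning EOF from this point on.
            rootClosed = true;
        }
    }

    boolean isRootClosed() {
        return rootClosed;
    }

    int getMaxDepth() {
        return maxDepth;
    }

    // Convenience method for the demo: parse a string and return the handler.
    static DepthTrackingHandler parse(String xml) {
        try {
            DepthTrackingHandler handler = new DepthTrackingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        DepthTrackingHandler h = parse("<a><b/><c>text</c></a>");
        System.out.println("root closed: " + h.isRootClosed()
                + ", max depth: " + h.getMaxDepth());
    }
}
```

The rootClosed flag is only the signal; whether you can act on it
safely still depends on the read-ahead issue mentioned above.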

Regarding the ability to "parse input of any length": as others have
said, if you are determined to build a DOM model of the input in memory,
then you *must* have enough RAM to hold that model. As has also been
mentioned, the parse methods that build a DOM don't read the whole
input into memory before they start parsing; the parsing phase works
exactly the same way as SAX parsing.
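
To illustrate, here is a small sketch of building a DOM straight from
an InputStream with the standard JAXP API. The class and method names
(DomFromStream, load, countElements) are just mine for the example. The
parser pulls from the stream as it goes; it is the resulting Document
that has to fit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: builds a DOM tree from a stream.
final class DomFromStream {

    // The input is consumed from the stream; the whole tree is then
    // held in memory as a Document.
    static Document load(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Counts every element node in the in-memory tree.
    static int countElements(String xml) {
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        Document doc = load(in);
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c/></a>"));
    }
}
```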

The alternative is *not* to build a DOM model at all, but to handle the
SAX events and perform whatever processing you want immediately. But
that isn't what you asked about...
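
In case it helps anyone else reading, a minimal sketch of that
alternative follows. It assumes, purely for the example, that the job
is to count elements named "item"; the class name ItemCounter is
invented. Nothing is retained apart from the running count, so memory
use should not grow with the number of elements in the input.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: processes each event as it arrives and keeps
// only a counter, instead of building a tree.
final class ItemCounter extends DefaultHandler {
    private long items = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default factory is not namespace-aware, so qName carries the
        // element name as written in the document.
        if ("item".equals(qName)) {
            items++;
        }
    }

    long getItems() {
        return items;
    }

    // Streams the input through the handler and returns the count.
    static long count(InputStream in) {
        try {
            ItemCounter handler = new ItemCounter();
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            return handler.getItems();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    static long count(String xml) {
        return count(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(count("<list><item/><item/><other/><item/></list>"));
    }
}
```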

Note also that the DOM model of an XML document is many times larger
than the original file. So if you don't have enough RAM to hold the
original file in memory (not that the parser ever tries to), you
certainly won't have enough memory to hold the DOM model of that
document.

Regards,

Simon

