I am investigating xmlReader because I need to parse a large XML file that
cannot be held in memory, and I have a few questions about it.
Another constraint in my environment is that I must use IO-based reading
(as opposed to file-based or memory-based reading). Because of this I am
currently using the push parser mechanism.
Below is the pseudo-code for my current parsing logic using DOM. Note that I
can read at most 1024 bytes from the stream in one read operation.
IOObject mStream;
xmlParserCtxtPtr ctxt;
char pChar[1024];
int errCode = XML_ERR_OK;

/* Read the first 4 bytes so the parser can detect the encoding. */
int res = mStream.ReadBytes(pChar, 4);
if (res > 0)
{
    ctxt = xmlCreatePushParserCtxt(NULL, NULL, pChar, res, NULL);
    if (ctxt != NULL)
    {
        do
        {
            /* Never ask for more than the buffer can hold (1024 bytes). */
            res = mStream.ReadBytes(pChar, sizeof(pChar));
            if (res > 0)
                errCode = xmlParseChunk(ctxt, pChar, res, 0);
        } while (res > 0 && errCode == XML_ERR_OK);

        if (errCode == XML_ERR_OK)
        {
            /* No more data: pass an empty chunk with terminate = 1. */
            errCode = xmlParseChunk(ctxt, NULL, 0, 1);
            // Do something , set output
        }
        xmlFreeParserCtxt(ctxt);
    }
}
Proposed xmlReader code:

IOObject mStream;
char pChar[1024];
xmlTextReaderPtr reader = xmlReaderForIO(<params (callback functions for IO)>);
if (reader != NULL)
{
    /* xmlTextReaderRead returns 1 for a node, 0 at the end, -1 on error. */
    while (xmlTextReaderRead(reader) == 1)
    {
        // Do something with this node
    }
    xmlFreeTextReader(reader);
}
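
For the callback parameters, a minimal sketch of what I have in mind is
below. It is only an illustration: DemoStream is a hypothetical in-memory
stand-in for IOObject, and the names stream_read_cb and stream_close_cb are
made up. Only the callback shapes are taken from libxml2's
xmlInputReadCallback and xmlInputCloseCallback (read returns the number of
bytes copied, 0 at end of input, -1 on error; close returns 0 or -1).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the real stream hands out at most 1024 bytes per read. */
#define STREAM_MAX_READ 1024

/* Hypothetical stand-in for IOObject, backed by a memory buffer. */
typedef struct {
    const char *data;   /* backing bytes */
    size_t      size;   /* total number of bytes */
    size_t      pos;    /* next byte to hand out */
    int         closed; /* set to 1 by the close callback */
} DemoStream;

/* Same shape as xmlInputReadCallback: copy up to len bytes into buffer. */
int stream_read_cb(void *context, char *buffer, int len)
{
    DemoStream *s = (DemoStream *)context;
    size_t n;

    if (s == NULL || buffer == NULL || len < 0)
        return -1;

    n = s->size - s->pos;           /* bytes still available */
    if (n > (size_t)len)
        n = (size_t)len;            /* respect the caller's buffer size */
    if (n > STREAM_MAX_READ)
        n = STREAM_MAX_READ;        /* respect the per-read limit */
    assert(n <= STREAM_MAX_READ);

    memcpy(buffer, s->data + s->pos, n);
    s->pos += n;
    return (int)n;                  /* 0 means end of input */
}

/* Same shape as xmlInputCloseCallback. */
int stream_close_cb(void *context)
{
    DemoStream *s = (DemoStream *)context;

    if (s == NULL)
        return -1;
    s->closed = 1;
    return 0;
}
```

With these, the placeholder call above would presumably become
xmlReaderForIO(stream_read_cb, stream_close_cb, &stream, NULL, NULL, 0),
where the last three arguments are the base URL, the encoding and the parser
options. The reader then pulls data through the read callback on demand, so
the 1024-byte limit is handled inside the callback rather than in my loop.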
My question is: given this scenario, does xmlReader still save me memory
compared to DOM (in terms of the storage allocated for parsing the XML),
given that bytes from the stream would still need to be cached in order to
decipher node information?
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
http://mail.gnome.org/mailman/listinfo/xml