[xml] Questions on usage: xmlTextReaderCurrentDoc, XPath and xmlTextReaderRead

Eric West Mon, 24 Dec 2007 10:38:56 -0800

[disclaimer: I am new to coding with libxml2]

I have a test program that writes and reads some sample XML. It "works",
but I have noticed that valgrind reports some problems related to
xmlTextReaderRead and xmlNewTextReaderFilename.
[Details below]


My test program uses the TextReader APIs to extract XML content. At parent
nodes, it uses XPath queries to extract child and grandchild content. In
pseudocode, this is:

     reader = xmlNewTextReaderFilename();
     ret = xmlTextReaderRead( reader);
     doc = xmlTextReaderCurrentDoc( reader);
     while ( ret == 1) {
           processNode( reader, doc);
           ret = xmlTextReaderRead( reader);
     }

      xmlFree( doc);
      xmlFreeTextReader( reader);
      xmlCleanupParser();

Within processNode(), I obtain an XPath context via the doc handle and then
make a series of XPath queries. The corresponding libxml2 free routines are
called on the memory allocated there. valgrind finds no issues in
processNode().

Now at the risk of solving my own problem... If I comment out the call to
processNode, valgrind still flags memory mismanagement. If I also comment
out the call to xmlTextReaderCurrentDoc, voilà! valgrind is happy.

Q: Thus I must conclude that there is an order of operations problem here.
I noticed that the sample code textReader3.c does the parsing with
xmlTextReader and then calls xmlTextReaderCurrentDoc. That observation and
the documentation suggest that the appropriate approach is (a) parse the
entire file via xmlTextReader and then (b) get a doc pointer to process the
in-memory data. Is this correct?

Q: Can xmlTextReader calls be interwoven with XPath queries? (I think
not.) The Python example does this, but the equivalent in C is not apparent
to me. As best I can tell, one needs a doc pointer to call the xmlXPath API:

        node = xmlTextReaderExpand( reader);
        ctx = xmlXPathNewContext( docPtr );
        ctx->node = node;
        pObj = xmlXPathEval( BAD_CAST xmlXPathQuery, ctx);

Q: Does mixing xmlTextReader API calls with XPath APIs defeat the memory
utilization benefits of the xmlTextReader implementation? At least as per
the example, the series of xmlTextReader calls will build a tree in memory
so that the subsequent call to xmlTextReaderCurrentDoc returns a pointer to
the complete tree.

Thanks in advance.


  --Eric


###################

$ valgrind --leak-check=full ./xmlTest
    ...
==4610== 23 bytes in 3 blocks are definitely lost in loss record 1 of 6
==4610==    at 0x4C21D06: malloc (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==4610==    by 0x4ED27BF: xmlStrndup (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4E8483E: xmlNewDoc (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4F20BDD: xmlSAX2StartDocument (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4E7BBB8: xmlParseChunk (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4F0C99D: (within /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4F0D5CD: xmlTextReaderRead (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x40221E: main (xmlTest.c:492)
==4610==
==4610==
==4610== 4,312 (48 direct, 4,264 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 6
==4610==    at 0x4C21D06: malloc (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==4610==    by 0x4F1DA89: xmlDictCreate (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4E65F94: xmlInitParserCtxt (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4E6600D: xmlNewParserCtxt (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4E7E195: xmlCreatePushParserCtxt (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4F0E11C: xmlNewTextReader (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x4F0E587: xmlNewTextReaderFilename (in /usr/lib64/libxml2.so.2.6.30)
==4610==    by 0x402206: main (xmlTest.c:488)
==4610==
==4610== LEAK SUMMARY:
==4610==    definitely lost: 71 bytes in 4 blocks.
==4610==    indirectly lost: 4,264 bytes in 5 blocks.
==4610==      possibly lost: 0 bytes in 0 blocks.
==4610==    still reachable: 0 bytes in 0 blocks.
==4610==         suppressed: 0 bytes in 0 blocks.

The problem points seem to be related to xml



-- 
E r i c   W e s t
Spark! Creative Group
Boston, MA 02134-1406
http://www.sparkcg.com

