Xerces-C hot spot in XMLAttr::set

Arnold, Curt 16 Feb 2000 19:06:24 -0000

I profiled a SAX parse using Xerces 1.1.0 d05 and NuMega's TrueTime and 
found what looks like a ripe opportunity for a substantial performance 
gain.


The most obvious hot-spot was XMLAttr::set() which accounted for 26% of the 
total time.  I've reproduced the code here:

void XMLAttr::set(  const   unsigned int        uriId
                    , const XMLCh* const        attrName
                    , const XMLCh* const        attrPrefix
                    , const XMLCh* const        attrValue
                    , const XMLAttDef::AttTypes type)
{
    // Clean up the old stuff
    delete [] fName;
    fName = 0;
    delete [] fPrefix;
    fPrefix = 0;
    delete [] fValue;
    fValue = 0;

    // And clean up the QName and leave it undone until asked for
    delete [] fQName;
    fQName = 0;

    // And now replicate the stuff into our members
    fType = type;
    fURIId = uriId;
    fName = XMLString::replicate(attrName);
    fPrefix = XMLString::replicate(attrPrefix);
    fValue = XMLString::replicate(attrValue);
}

XMLAttr::set() is called from XMLAttr::XMLAttr() (10 times) and from 
XMLScanner::scanStartTag() (4564 times) in my sample document.  57% of the 
time in set() is spent in the delete [] operators and 36% in the 
XMLString::replicate() calls.

Since attributes are generally about the same size, it would seem to be much 
more efficient to reuse the existing buffers instead of explicitly deleting 
them and then reallocating them.  I don't think that it requires anything as 
sophisticated as a string pool.  Maybe just always allocate fName (for 
example) with room for at least, say, 64 characters.  If the new value is 
shorter than 64 characters, copy it into the existing fName; otherwise 
allocate a longer fName.
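
To make the idea above concrete, here is a rough sketch of a buffer that is 
reused and only reallocated when the new value does not fit.  ReusableBuf, 
assign(), capacity(), allocCount() and kMinChars are names I made up for 
illustration; none of them exist in Xerces, and the XMLCh typedef is only a 
stand-in so the snippet compiles on its own.  One difference from the 
current code: a null source pointer becomes an empty string here, whereas 
set() as written above leaves the member null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned short XMLCh;   // stand-in for the Xerces character type

// Hypothetical helper, not part of Xerces: owns one buffer that is reused
// across calls and reallocated only when a new value does not fit.
class ReusableBuf
{
public:
    enum { kMinChars = 64 };    // initial capacity, terminator included

    ReusableBuf()
        : fBuf(new XMLCh[kMinChars]), fCapacity(kMinChars), fAllocCount(1)
    {
        fBuf[0] = 0;
    }

    ~ReusableBuf() { delete [] fBuf; }

    // Copy src into the buffer; a null src is stored as an empty string.
    void assign(const XMLCh* const src)
    {
        std::size_t len = 0;
        if (src)
            while (src[len]) ++len;

        // Grow only when the value plus its terminator does not fit.
        if (len + 1 > fCapacity)
        {
            XMLCh* bigger = new XMLCh[len + 1];
            delete [] fBuf;
            fBuf = bigger;
            fCapacity = len + 1;
            ++fAllocCount;
        }
        assert(len < fCapacity);

        if (len)
            std::memcpy(fBuf, src, len * sizeof(XMLCh));
        fBuf[len] = 0;
    }

    const XMLCh* get() const { return fBuf; }
    std::size_t capacity() const { return fCapacity; }
    unsigned int allocCount() const { return fAllocCount; }   // for measuring

private:
    // Copying is disabled to keep ownership of the buffer simple.
    ReusableBuf(const ReusableBuf&);
    ReusableBuf& operator=(const ReusableBuf&);

    XMLCh*       fBuf;
    std::size_t  fCapacity;
    unsigned int fAllocCount;
};
```

With something like this, set() would reduce to calls such as 
fName.assign(attrName), and for typical short attributes no heap traffic 
would occur after construction.  The buffer never shrinks, so one unusually 
long value keeps its larger allocation for the life of the XMLAttr.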

Another anomaly: I was not doing a validating parse, yet 10% of the time 
was spent in DTDValidator::scanDTD().

I wish I could give some more insights, but I haven't found the trick to 
setting breakpoints on code in the Xerces library in VC6.  I think it has 
been covered here before, but I wasn't able to locate it.  If someone wants 
to send me a pointer, that would be helpful.
