RE: DOM performance/coding issue

Erik Rydgren Mon, 15 Sep 2003 01:21:47 -0700

Oh crap...

I forgot one line in the code. Fixed code below.


DOMNode* pNode = m_pDocument->getDocumentElement()->getFirstChild();
while (pNode) {
  if (pNode->getNodeType() == DOMNode::ELEMENT_NODE &&
      equal(pNode->getNodeName(), "PatchData")) {
    DOMNode* pInnerNode = pNode->getFirstChild();
    while (pInnerNode) {
      if (pInnerNode->getNodeType() == DOMNode::ELEMENT_NODE &&
          equal(pInnerNode->getNodeName(), "CbName")) {
        // An element's own node value is null; its text lives in a child
        // text node.
        DOMNode* pText = pInnerNode->getFirstChild();
        if (pText && equal(pText->getNodeValue(), "your searchstring ")) {
          // Bingo!!! We found it, process node
          break; // No need to search for more inner nodes
        }
      }
      pInnerNode = pInnerNode->getNextSibling(); // <-- this one
    }
  }
  pNode = pNode->getNextSibling();
}
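
The equal used above is only a pseudo function. A minimal sketch of one way
to write it follows, assuming the DOM hands back null-terminated strings of
some wide character type (wchar_t, char16_t or similar) and that the tag
names you compare against are plain ASCII. The template parameter Ch is my
own choice for illustration and is not taken from any DOM library.

```cpp
#include <cstddef>

// Hypothetical helper standing in for the pseudo function `equal`.
// Compares a null-terminated wide string with a plain ASCII literal.
// A null pointer on either side never matches.
template <typename Ch>
bool equal(const Ch* domString, const char* ascii) {
    if (domString == nullptr || ascii == nullptr) {
        return false;
    }
    std::size_t i = 0;
    while (domString[i] != 0 && ascii[i] != '\0') {
        // Widen the ASCII character before comparing it.
        if (domString[i] != static_cast<Ch>(static_cast<unsigned char>(ascii[i]))) {
            return false;
        }
        ++i;
    }
    // Equal only if both strings ended at the same position.
    return domString[i] == 0 && ascii[i] == '\0';
}
```

The null check matters in the loop above, because getNodeValue() can return
a null pointer for node types that carry no value.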

/ Erik
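
P.S. If you want to see the difference in numbers without a parser at hand,
here is a small self-contained sketch. It does not use the real DOM API: Node,
collectByTagName, findPatches and makeDocument are names I made up for a toy
tree, and the counters only record how many nodes each approach touches.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy element tree; a hypothetical stand-in for a real DOM, for counting only.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Mimics a subtree-wide tag search: every descendant is visited and compared.
void collectByTagName(const Node& root, const std::string& tag,
                      std::vector<const Node*>& out, std::size_t& visited) {
    for (const auto& child : root.children) {
        ++visited;
        if (child->name == tag) {
            out.push_back(child.get());
        }
        collectByTagName(*child, tag, out, visited);
    }
}

// Walks only the direct children of the root, and for each PatchData only
// its direct children, stopping at the first CbName.
std::vector<const Node*> findPatches(const Node& root, const std::string& cbName,
                                     std::size_t& visited) {
    std::vector<const Node*> matches;
    for (const auto& patch : root.children) {
        ++visited;
        if (patch->name != "PatchData") {
            continue;
        }
        for (const auto& field : patch->children) {
            ++visited;
            if (field->name == "CbName") {
                if (field->text == cbName) {
                    matches.push_back(patch.get());
                }
                break; // No need to look at the remaining fields
            }
        }
    }
    return matches;
}

// Creates one element with the given name and text.
std::unique_ptr<Node> makeElement(const std::string& name, const std::string& text) {
    auto node = std::make_unique<Node>();
    node->name = name;
    node->text = text;
    return node;
}

// Builds a root holding `count` PatchData elements shaped like the XML in
// the original question (CbName, Location, Type, Width, Enabled).
Node makeDocument(std::size_t count, const std::string& cbName) {
    Node root;
    root.name = "Root";
    for (std::size_t i = 0; i < count; ++i) {
        auto patch = makeElement("PatchData", "");
        patch->children.push_back(makeElement("CbName", cbName));
        patch->children.push_back(makeElement("Location", std::to_string(i)));
        patch->children.push_back(makeElement("Type", "0"));
        patch->children.push_back(makeElement("Width", "5.15"));
        patch->children.push_back(makeElement("Enabled", "true"));
        root.children.push_back(std::move(patch));
    }
    return root;
}
```

With three PatchData elements of five fields each, the subtree search touches
18 nodes while the direct walk touches 6, and that gap widens as the document
grows and the elements get more children.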

> -----Original Message-----
> From: Erik Rydgren
> Sent: den 15 september 2003 10:13
> To: [EMAIL PROTECTED]
> Subject: DOM performance/coding issue
> 
> As previously stated, getElementsByTagNameNS is the bad boy in your
> code. What the previous writers didn't explain is why.
> 
> getElementsByTagNameNS operates on ALL nodes in a subtree. It therefore
> has to traverse the whole tree and compare the tag names of ALL elements
> against your string. If your document is large, you get the picture. It
> is as inefficient as doing a table scan for each database lookup. Of
> course, if you can't make any assumptions about what your data structure
> looks like, then getElementsByTagNameNS is very powerful. But in your
> case it sounds like you already know what your data structure looks
> like, so you can write more efficient code by doing the iteration and
> comparison yourself.
> 
> Something like this (I have used a pseudo function, equal, to clarify
> the code):
> 
> DOMNode* pNode = m_pDocument->getDocumentElement()->getFirstChild();
> while (pNode) {
>   if (pNode->getNodeType() == DOMNode::ELEMENT_NODE &&
>       equal(pNode->getNodeName(), "PatchData")) {
>     DOMNode* pInnerNode = pNode->getFirstChild();
>     while (pInnerNode) {
>       if (pInnerNode->getNodeType() == DOMNode::ELEMENT_NODE &&
>           equal(pInnerNode->getNodeName(), "CbName")) {
>         // An element's own node value is null; its text lives in a child
>         // text node.
>         DOMNode* pText = pInnerNode->getFirstChild();
>         if (pText && equal(pText->getNodeValue(), "your searchstring ")) {
>           // Bingo!!! We found it, process node
>           break; // No need to search for more inner nodes
>         }
>       }
>     }
>   }
>   pNode = pNode->getNextSibling();
> }
> 
> But if you know that you will never see nodes other than PatchData
> nodes under the root node, you don't have to compare the tag name for
> those either.
> 
> Regards
> Erik Rydgren
> Mandarin FS
> Sweden
> 
> > -----Original Message-----
> > From: David Hoffer [mailto:[EMAIL PROTECTED]
> > Sent: den 15 september 2003 00:06
> > To: [EMAIL PROTECTED]
> > Subject: DOM performance/coding issue
> >
> > I have a question about how to efficiently parse a large DOM document.
> > For example, I have a lot of 'PatchData' elements.  I am looking for the
> > set of these where the child element 'CbName' matches a certain string.
> > The following code takes 30 seconds just to loop through the DOMNodeList
> > before it finds the first matching child.  How can I make this faster?
> >
> > DOMNodeList* pDOMNodeList =
> > m_pDocument->getDocumentElement()->getElementsByTagNameNS(NULL,
> > L"PatchData");
> >
> > for (int i=0; i<pDOMNodeList->getLength(); i++)
> > {
> >     DOMNode* pDOMNode = pDOMNodeList->item(i);
> >     DOMNodeList* pDOMNodeList = ((const
> > DOMElement*)pDOMNode)->getElementsByTagNameNS(NULL, L"CbName");
> >     pDOMNode = pDOMNodeList->item(0);
> >     DOMNode* pChildNode = pDOMNode->getFirstChild();
> >     std::wstring wstrTagValue = pChildNode->getNodeValue();
> >
> >     if (wcscmp(wstrTagValue.c_str(), m_wstrColorbarName.c_str()) == 0)
> >     {
> >             // This is one of the ones I want...I spend very little time
> >             // here.
> >     }
> >
> >     // I spend 30 seconds here, in a loop...
> > }
> >
> > My XML structure is like...
> > <PatchData>
> >     <CbName>abc</CbName>
> >     <Location>0</Location>
> >     <Type>0</Type>
> >     <Width>5.15</Width>
> >     <Enabled>true</Enabled>
> > </PatchData>
> > <PatchData>
> >     <CbName>abc</CbName>
> >     <Location>1</Location>
> >     <Type>1</Type>
> >     <Width>5.25</Width>
> >     <Enabled>false</Enabled>
> > </PatchData>
> >
> > -dh
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]


