Curtiss Howard <[EMAIL PROTECTED]> wrote on 02/14/2005 06:30:05 AM:

> On Mon, 14 Feb 2005 11:40:46 +0100, Jos van den Oever
> <[EMAIL PROTECTED]> wrote:
> > Hello all,
> > 
> > I have a question concerning DOM build speed. In version 2.6.2 of
> > Xerces, parsing a simple 825-byte XML file takes between 300 and 400 ms.
> > I'm building the DOM from a java.io.Reader. The parse time is independent
> > of the type of reader (StringReader, FileReader).
> > 
> > This is the code I use for creating the builder:
> > 
> >    DocumentBuilderFactory factory = DocumentBuilderFactory
> >      .newInstance();
> >    System.out.println(factory);
> >    factory.setNamespaceAware(true);
> >    factory.setValidating(false);
> >    factory.setExpandEntityReferences(false);
> >    docbuilder = factory.newDocumentBuilder();
> >    docbuilder.setErrorHandler(errorHandler);
> > 
> > Is the DOM build time always so slow? I've read some benchmarks on the
> > web that were notably faster.
> > 
> > Cheers, Jos
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > 
> > 
> 
> I've run up against this problem as well.  If I'm not mistaken, the
> problem is the way Xerces handles setting properties on a DOM parser
> via the JAXP interfaces.  Xerces will create a new instance (!) of the
> DOMParser class for every single property you set (i.e., via
> setValidating(), setNamespaceAware(), etc.).  This wouldn't be so bad
> if it weren't so expensive to create a DOMParser instance, but it is.
> You should really be timing the parse alone, not the setup plus the parse.

Actually, this is only a problem with setAttribute(). The other methods
just set a boolean, which isn't read until the application explicitly
creates a parser from the factory (i.e., calls newDocumentBuilder()).
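
For anyone who wants to see where the time goes, here is a minimal sketch
that times builder setup, the first parse, and repeated parses separately,
reusing one DocumentBuilder throughout. The class name ParseTiming, the
helper names, and the sample document are made up for illustration; the
factory settings are the ones from the original post. Timings from a
single run like this are only a rough indication, not a benchmark.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of Xerces or JAXP.
public class ParseTiming {

    // Configure the factory with the boolean setters from the original post
    // and create the builder once, so it can be reused for every parse.
    static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false);
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse the XML text from a Reader and return the root element's local name.
    static String rootLocalName(DocumentBuilder builder, String xml) {
        try {
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getLocalName();
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the small file in the question.
        String xml = "<root xmlns='urn:example'><item id='1'>text</item></root>";
        int runs = 100;

        long t0 = System.nanoTime();
        DocumentBuilder builder = newBuilder();   // setup cost only
        long t1 = System.nanoTime();
        rootLocalName(builder, xml);              // first parse (includes warm-up)
        long t2 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            rootLocalName(builder, xml);          // later parses on the same builder
        }
        long t3 = System.nanoTime();

        System.out.println("setup ms:       " + (t1 - t0) / 1_000_000.0);
        System.out.println("first parse ms: " + (t2 - t1) / 1_000_000.0);
        System.out.println("avg parse ms:   " + (t3 - t2) / 1_000_000.0 / runs);
    }
}
```

If the setup and first-parse figures dominate, the cost being measured is
mostly one-time initialization rather than the DOM build itself.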
 
> I was sort of shocked when I first discovered how poorly Xerces
> handles setting JAXP DocumentBuilder properties, but from
> conversations on this mailing list it's obviously a known problem and
> there doesn't seem to be much that can be done about it.  Shrug.

If anyone's curious what the problem is, here's the explanation [1] I gave 
the last time this came up.

> 
> Curtiss Howard
> 
> 

[1] http://marc.theaimsgroup.com/?l=xerces-j-user&m=110347351515748&w=2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]


