Curtiss Howard <[EMAIL PROTECTED]> wrote on 02/14/2005 06:30:05 AM:

> On Mon, 14 Feb 2005 11:40:46 +0100, Jos van den Oever
> <[EMAIL PROTECTED]> wrote:
> > Hello all,
> >
> > I've a question concerning DOM build speed. In version 2.6.2 of
> > Xerces, parsing a simple 825 byte xml file takes between 300 and 400 ms.
> > I'm building the DOM from a java.io.Reader. The parse time is independent of
> > the type of reader (StringReader, FileReader).
> >
> > This is the code I use for creating the builder:
> >
> >     DocumentBuilderFactory factory = DocumentBuilderFactory
> >             .newInstance();
> >     System.out.println(factory);
> >     factory.setNamespaceAware(true);
> >     factory.setValidating(false);
> >     factory.setExpandEntityReferences(false);
> >     docbuilder = factory.newDocumentBuilder();
> >     docbuilder.setErrorHandler(errorHandler);
> >
> > Is the DOM build time always so slow? I've read some benchmarks on the web
> > that were notably faster.
> >
> > Cheers, Jos
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
>
> I've run up against this problem as well. If I'm not mistaken, the
> problem is the way Xerces handles setting properties on a DOM parser
> via the JAXP interfaces. Xerces will create a new instance (!) of the
> DOMParser class for every single property you set (i.e., via
> setValidating(), setNamespaceAware(), etc.). This wouldn't be so bad
> if it wasn't so expensive to create a DOMParser instance, but it is.
> You should really be timing the parse and not the setup and parse.

Actually, this is only a problem with setAttribute(). The other methods
just set a boolean which isn't read until the application explicitly
creates a parser from the factory.

> I was sort of shocked when I first discovered how poorly Xerces
> handles setting JAXP DocumentBuilder properties, but from
> conversations on this mailing list it's obviously a known problem and
> there doesn't seem to be much that can be done about it. Shrug.

If anyone's curious what the problem is, here's the explanation [1] I
gave the last time this came up.

> Curtiss Howard

[1] http://marc.theaimsgroup.com/?l=xerces-j-user&m=110347351515748&w=2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]
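
As a rough illustration of the advice quoted above (time the parse, not the setup plus the parse), here is a minimal sketch that measures the two separately and reuses one DocumentBuilder for several parses. The class name ParseTiming, the helper methods, and the inline sample document are made up for this example and are not from the original post; the error handler from the original snippet is left out for brevity. The first parse will usually still include class-loading and JIT warm-up, so the later iterations are the ones to look at.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical benchmark sketch: names and sample XML are illustrative only.
final class ParseTiming {

    // One-time setup: configure the factory and create a builder from it.
    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Build a DOM from a java.io.Reader, as in the original question.
    static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Placeholder document; substitute the real 825-byte file contents.
        String xml = "<root><item id=\"1\">hello</item></root>";

        long setupStart = System.nanoTime();
        DocumentBuilder builder = newBuilder();
        long setupEnd = System.nanoTime();
        System.out.println("setup: " + (setupEnd - setupStart) / 1_000_000.0 + " ms");

        // Reuse the same builder; each iteration times only the parse call.
        for (int i = 0; i < 5; i++) {
            long start = System.nanoTime();
            Document doc = parse(builder, xml);
            long end = System.nanoTime();
            System.out.println("parse " + i + " (<" + doc.getDocumentElement().getTagName()
                    + ">): " + (end - start) / 1_000_000.0 + " ms");
        }
    }
}
```

If the later parse times are far below the 300 to 400 ms reported, the cost is most likely in setup and warm-up rather than in DOM construction itself.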