Looking at these test results, I think there's a good chance that most
of the time being reported is spent by the Java JIT compiler compiling
the code, rather than in the XML parse itself. I've seen this happen in
timing tests of Java programs that I've done here.
The way to factor out the JIT time is to run the test code (parse the file)
once first, and then do the timing test for real. This assumes that what
you want to measure is performance after an application is fully up and
running, as opposed to start-up time.
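Here is a rough sketch of that warm-up approach. It is only an illustration under some assumptions: it uses the SAX parser bundled with a current JDK (through javax.xml.parsers) rather than any of the parsers in the comparison, it uses System.nanoTime(), which needs a newer JDK than 1.2.2, and it parses a synthetic in-memory document. The names WarmupTiming, sampleXml, timeParseNanos and WARMUP_PASSES are mine, and the number of warm-up passes is a guess (a single pass is the minimum).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class WarmupTiming {
    // Assumed value: how many untimed passes to run before the measured one.
    static final int WARMUP_PASSES = 5;

    // Hypothetical helper: builds a synthetic document so the sketch
    // runs without needing a test file on disk.
    static byte[] sampleXml(int elements) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the document once and returns the elapsed time in nanoseconds.
    // Parser construction is deliberately kept outside the timed region.
    static long timeParseNanos(byte[] xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            long before = System.nanoTime();
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return System.nanoTime() - before;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(50_000);

        // Untimed passes: give the JIT a chance to compile the hot paths.
        for (int i = 0; i < WARMUP_PASSES; i++) {
            timeParseNanos(xml);
        }

        // The pass that is actually reported.
        long nanos = timeParseNanos(xml);
        System.out.println("Parsed in: " + (nanos / 1_000_000) + " ms");
    }
}
```

Comparing the first warm-up pass against the final timed pass should give a rough idea of how much of the originally reported time was start-up cost.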
Trying to account for garbage collection times in Java benchmarks is another
real challenge. The problem is that, in a short test, the difference
between zero and one GC, or between one and two, can be huge, and can
completely obscure the time spent in actual code. One solution is to run
the test in a loop, with run times of several minutes, and with at least a
hundred or so GCs over the duration of the test. This has two big benefits:
1) one more or one fewer GC will have a negligible impact on the overall
results, and 2) the memory allocation/collection costs are reasonably
factored into the overall results.
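The looped test could look roughly like the following. Again, this is a sketch under assumptions, not a finished benchmark: it relies on the java.lang.management API from a current JDK to count collections, it uses the JDK's bundled SAX parser and a synthetic document, and the names LoopedParseBenchmark, totalGcCount, sampleXml and run are mine. The default of three minutes is simply my reading of "several minutes".

```java
import java.io.ByteArrayInputStream;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class LoopedParseBenchmark {
    // Sum of collections reported by all collectors.
    // A collector may report -1 (undefined), which is counted as 0 here.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Hypothetical helper: synthetic document so the sketch needs no input file.
    static byte[] sampleXml(int elements) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses repeatedly for at least durationMillis.
    // Returns {iterations, elapsedNanos, gcCountDuringRun}.
    static long[] run(byte[] xml, long durationMillis) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DefaultHandler handler = new DefaultHandler();
            long durationNanos = durationMillis * 1_000_000L;
            long gcBefore = totalGcCount();
            long start = System.nanoTime();
            long iterations = 0;
            do {
                parser.parse(new ByteArrayInputStream(xml), handler);
                iterations++;
            } while (System.nanoTime() - start < durationNanos);
            long elapsed = System.nanoTime() - start;
            return new long[] {iterations, elapsed, totalGcCount() - gcBefore};
        } catch (Exception e) {
            throw new IllegalStateException("benchmark run failed", e);
        }
    }

    public static void main(String[] args) {
        // Optional first argument: run length in seconds (default 180).
        long seconds = args.length > 0 ? Long.parseLong(args[0]) : 180;
        byte[] xml = sampleXml(50_000);
        run(xml, 2_000); // short untimed warm-up so JIT time is factored out
        long[] r = run(xml, seconds * 1_000);
        System.out.println("Parses: " + r[0]);
        System.out.println("GCs during run: " + r[2]);
        System.out.println("Average per parse: " + (r[1] / r[0] / 1_000) + " us");
    }
}
```

If the reported GC count comes out well under a hundred, the run is probably too short (or the heap too large) for the average to absorb the collection costs, and the duration should be increased.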
I'm glad that you posted your numbers, and, if you have time to take this
effort any further, I will look forward to those results too.
Regards,
-- Andy
----- Original Message -----
From: "Mikael Helbo Kjær" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, February 21, 2000 2:31 AM
Subject: Results of XMLParser comparison
> During research for a big Java application relying heavily on XML (and XSL)
> I have compared several XML parsers written in Java. I've tested 5 parsers
> for feature richness, XML and XSL standards support, speed, developer
> friendliness (like this list) and stability. The tests were very simple and
> I didn't attempt to optimize anything or do any advanced charts or stuff like
> that. Here are my results:
> Aelfred (a very small SAX parser), discarded for age and lack of features.
> Speed test wasn't run.
> XP (James Clark's parser), discarded for lack of DOM support. Speed test
> wasn't run.
>
> So for the really important stuff:
> General:
> Xerces-J 1.0.1:
> Fully compliant XML implementation (DOM 1, SAX 1, although we can't
> seem to find a selectNodes(XSLPATTERN,...) function <-hint), very feature
> rich collection of additional APIs for serialization and so on, XSLT support
> through Xalan 0.19 (when is the 1.0 out?), open source, and the possibility of
> our developers co-developing/learning about an XML parser. Test for memory
> usage hinted at a very well behaved DOMParser (especially compared to
> Oracle's parser).
>
> Oracle XMLParser v2:
> Fully compliant XML implementation (DOM 1 and SAX 1), fully
> integrated XSL processor, not open source. Test for memory usage hinted at a
> very, very memory-hungry parser overall. Developer support is at a very low
> level and not supportive. No co-development possible.
>
> Java API for XML Parsing early access 1:
> Fully compliant XML implementation (DOM 1 and SAX 1), no XSL
> processor (caused us to drop this parser). Not open source. Test for memory
> usage hinted that this was the best behaved parser memory-wise. Developer
> support exists through java.sun.com's tutorials and the JDC. No
> co-development possible (we ignore the community forum as we don't think that
> model works).
>
> Speed: This was the area upon which we fixated most. We need to be
> able to parse very small, big and HUGE XML files. All tests were run on
> a Windows 2000 Server (which :-) has already crashed mysteriously 5 times),
> using JDK 1.2.2 from JavaSoft and using code which largely looks like this:
> Pseudo-code:
>
> void main()
> {
>     Parser parser = new Parser();  // both SAX and DOM
>     long before = System.currentTimeMillis();
>     parser.parse(url or inputsource);
>     long after = System.currentTimeMillis();
>     System.out.println("Test.xml parsed in: " + (after - before));
> }
>
> This yielded the following results (all are averages of 10 separate runs of
> the application):
>
> HotSpot 1.0.1:
> DOM:
> 115 kb size xmlfile
> Xerces 1.0.1: 1282 ms.
> JAXP 1.0 ea1: 1553 ms.
> Oracle XmlParser v2: 1121 ms.
>
> SAX:
> 8.975 kb size xmlfile
> Xerces 1.0.1: 6158 ms.
> JAXP 1.0 ea1: 4366 ms.
> Oracle XmlParser v2: 4366 ms.
>
> Classic VM (JDK 1.2.2):
> DOM:
> 2.436 kb size xmlfile
> Xerces 1.0.1: 11016 ms.
> JAXP 1.0 ea1: 5358 ms.
> Oracle XmlParser v2: 4366 ms.
>
> SAX:
> 115 kb size xmlfile
> Xerces 1.0.1: 661 ms.
> JAXP 1.0 ea1: 771 ms.
> Oracle XmlParser v2: 551 ms.
>
> 8.975 kb size xmlfile
> Xerces 1.0.1: 4156 ms.
> JAXP 1.0 ea1: 6029 ms.
> Oracle XmlParser v2: 3936 ms.
>
> Of course my tests weren't very thorough but I think that this is still
> enough to see a trend amongst the parsers. Oracle and Xerces are clearly
> neck and neck. While Xerces is better behaved memory-wise and very, very
> feature rich, the Oracle parser is faster, but is also very memory hungry
> and isn't open source. Now we'd rather use the Xerces parser, but if it doesn't
> allow the selection of nodes through an XSL pattern, we just can't commit
> ourselves to it. My results are therefore still slightly inconclusive.
>
> Mikael Helbo Kjær
> Software Developer @ DIA a/s
>