During research for a large Java application that relies heavily on XML (and XSL), I compared several XML parsers written in Java. I tested five parsers for feature richness, XML and XSL standards support, speed, developer friendliness (like this list) and stability. The tests were very simple; I didn't attempt to optimize anything or produce any advanced charts or the like.

Here are my results:

Aelfred (a very small SAX parser): discarded for age and lack of features. Speed test wasn't run.

XP (James Clark's parser): discarded for lack of DOM support. Speed test wasn't run.

Now for the really important stuff.

General:

Xerces-J 1.0.1: Fully compliant XML implementation (DOM 1 and SAX 1, although we can't seem to find a selectNodes(XSLPATTERN,...) function <- hint). Very feature-rich collection of additional APIs for serialization and so on. XSLT support through Xalan 0.19 (when is 1.0 out?). Open source, with the possibility of our developers co-developing and learning about an XML parser. The memory usage test hinted at a very well-behaved DOM parser (especially compared to Oracle's parser).

Oracle XMLParser v2: Fully compliant XML implementation (DOM 1 and SAX 1), fully integrated XSL processor, not open source. The memory usage test hinted at a very memory-hungry parser overall. Developer support is very low-level and unsupportive. No co-development possible.

Java API for XML Parsing early access 1: Fully compliant XML implementation (DOM 1 and SAX 1), no XSL processor (which caused us to drop this parser). Not open source. The memory usage test hinted that this was the best-behaved parser memory-wise. Developer support exists through java.sun.com's tutorials and the JDC. No co-development possible (we ignore the community forum, as we don't think that model works).

Speed:

This was the area we focused on most. We need to be able to parse very small, big and HUGE XML files. All tests were run on a Windows 2000 Server (which :-) has already crashed mysteriously 5 times), using JDK 1.2.2 from JavaSoft and code that largely looks like this:

Pseudo-code:

    void main() {
        Parser parser = new Parser();  // both SAX and DOM
        long before = System.currentTimeMillis();
        parser.parse(url or inputsource);
        long after = System.currentTimeMillis();
        System.out.println("Test.xml parsed in: " + (after - before));
    }

This yielded the following results (all are averages of 10 separate runs of the application):

HotSpot 1.0.1:

  DOM, 115 kb XML file:
    Xerces 1.0.1:         1282 ms
    JAXP 1.0 ea1:         1553 ms
    Oracle XMLParser v2:  1121 ms

  SAX, 8.975 kb XML file:
    Xerces 1.0.1:         6158 ms
    JAXP 1.0 ea1:         4366 ms
    Oracle XMLParser v2:  4366 ms

Classic VM (JDK 1.2.2):

  DOM, 2.436 kb XML file:
    Xerces 1.0.1:         11016 ms
    JAXP 1.0 ea1:         5358 ms
    Oracle XMLParser v2:  4366 ms

  SAX, 115 kb XML file:
    Xerces 1.0.1:         661 ms
    JAXP 1.0 ea1:         771 ms
    Oracle XMLParser v2:  551 ms

  SAX, 8.975 kb XML file:
    Xerces 1.0.1:         4156 ms
    JAXP 1.0 ea1:         6029 ms
    Oracle XMLParser v2:  3936 ms

Of course my tests weren't very thorough, but I think this is still enough to see a trend among the parsers. Oracle and Xerces are clearly neck and neck. While Xerces is better behaved memory-wise and very feature rich, the Oracle parser is faster, but it is also very memory hungry and isn't open source. We'd rather use the Xerces parser, but if it doesn't allow the selection of nodes through an XSL pattern, we just can't commit ourselves to it. My results are therefore still slightly inconclusive.

Mikael Helbo Kjær
Software Developer @ DIA a/s
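
For anyone who wants to repeat the measurement, here is a minimal runnable sketch of the timing loop from the pseudo-code in the Speed section. It is not the code used for the numbers above. It assumes a JDK that bundles the javax.xml.parsers API and uses whichever parser that JDK's factories return, rather than the three parsers compared here. The names ParseTimer, timeDom, timeSax and average are hypothetical, made up for this illustration. It also repeats the parse inside one JVM, whereas the figures above are averages over 10 separate runs of the application, so the two are not directly comparable.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; not part of any parser distribution.
class ParseTimer {

    // Builds a DOM tree from the given bytes and returns the elapsed time in ms.
    public static long timeDom(byte[] xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        long before = System.currentTimeMillis();
        builder.parse(new ByteArrayInputStream(xml));
        return System.currentTimeMillis() - before;
    }

    // Runs a SAX parse with a do-nothing handler and returns the elapsed time in ms.
    public static long timeSax(byte[] xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        long before = System.currentTimeMillis();
        parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
        return System.currentTimeMillis() - before;
    }

    // Integer average of the samples (truncates toward zero).
    public static long average(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    public static void main(String[] args) throws Exception {
        // Use the file given on the command line, or a tiny built-in document.
        byte[] xml = (args.length > 0)
                ? Files.readAllBytes(Paths.get(args[0]))
                : "<test><item>1</item></test>".getBytes(StandardCharsets.UTF_8);

        int runs = 10; // assumption: in-process repeats, not separate JVM launches
        long[] dom = new long[runs];
        long[] sax = new long[runs];
        for (int i = 0; i < runs; i++) {
            dom[i] = timeDom(xml);
            sax[i] = timeSax(xml);
        }
        System.out.println("DOM average: " + average(dom) + " ms");
        System.out.println("SAX average: " + average(sax) + " ms");
    }
}
```

Run it as "java ParseTimer.java Test.xml" on a recent JDK, or compile it first with javac. The parser is created before the clock starts, so only the parse call itself is timed, as in the pseudo-code.".getBytes(java.nio.charset.StandardCharsets.UTF_8);
if (ParseTimer.timeDom(xml) < 0) throw new AssertionError("DOM time must be non-negative");
if (ParseTimer.timeSax(xml) < 0) throw new AssertionError("SAX time must be non-negative");
if (ParseTimer.average(new long[] {10, 20, 30}) != 20) throw new AssertionError("average of 10,20,30 must be 20");
</test>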