I was able to run some more tests comparing the performance of transformations using SAXSource vs. DOMSource and discovered the following:
When using SAXSource, the transformation time depends greatly on the size of the text in the transformed documents (i.e. the cumulative size of the char arrays passed in the SAX characters callback). When using DOMSource, there is little difference in the transformation timings regardless of the cumulative number of characters.

In the tests I tried the following scenarios:

1. Passed a constant char array of size 1 to the SAXSource, or a constant new String (again of size 1) to the DOMSource. Here the SAXSource-based transformation was about 15% faster.

2. Passed a constant char array (of size 170) to the SAXSource and a constant new String (of the same size) to the DOMSource. The DOM-based transformation took pretty much the same time as in scenario 1, while the SAX-based one took much longer (by a factor of 25!).

Any ideas?

Cheers,
Shmul

> On 4 Mar 2004 at 18:11, Santiago Pericas-Geertsen wrote:
> Shmul,
>
> If I understand your architecture correctly, I believe the difference
> may be in how fast can your SAX adapter push all the events vs. how
> fast can a DOM2SAX adapter do the same from a DOM. If the former is
> slower, or has scalability problems, that may be the difference that
> you're seeing.
>
> -- Santiago
>
> On Wed, 2004-03-03 at 11:51, [EMAIL PROTECTED] wrote:
> > The flow of data is the following:
> >
> > The parser used is our own parser that builds a compact
> > representation of the source XML and then fires up events (in a SAX
> > like manner but simplified) to event sinks. We have two such sinks
> > implementations:
> >
> > 1. SAX adapter (i.e. an implementation of XMLReader) that is used as
> > the SAXSource. In this scenario we transform the internal stream
> > directly, i.e. without an intermediate step.
> >
> > 2. DOM builder that uses the events to create the document. The
> > document is passed as a DOMSource to the transformation.
> >
> > The time difference I mentioned does NOT include the parsing
> > process, only the actual time spent on the transformation. In the
> > DOM scenario we DO include the time it takes to build the DOM
> > document (which is actually negligible).
> >
> > I think the difference arises from the rather large number of tags
> > and the text nodes they contain. I'm not familiar with the internal
> > implementation (so pardon me if I'm totally off) but I would assume
> > the DOMSource uses iteration to create the DTM while the SAXSource
> > naturally depends on callbacks.
> >
> > Cheers,
> > Shmul
> >
> > On 3 Mar 2004 at 11:04, Joseph Kesselman wrote:
> >
> > > On Wednesday, 03/03/2004 at 03:12 ZE2, [EMAIL PROTECTED] wrote:
> > > > We have recently changed our usage of Xalan from supplying a
> > > > DOMSource to SAXSource. Our performance tests show some improvement
> > > > (10% or so) for XMLs that are around 40K in size (a couple of
> > > > hundreds of tags and attributes), but for a large XML (1.3MB,
> > > > around 20K tags with no attributes) the performance of the DOMSource
> > > > is about twice as fast.
> > >
> > > That is surprising... unless you're doing something strange like using
> > > SAXSource but feeding it from a DOM, so you're paying all the cost of
> > > first building a DOM and then copying the whole tree into Xalan's data
> > > model.
> > >
> > > How are you actually invoking the API?
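
For anyone who wants to reproduce the two scenarios at the top of this message, here is a rough sketch of what such a test could look like. It is not the code used for the numbers above. The class name SourceComparison, the SyntheticReader stand-in for the custom SAX adapter, the element names, and the identity transform are all assumptions made for illustration; a real test would plug in the actual stylesheet and adapter, and would warm up the JVM before timing.

```java
import java.io.StringWriter;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class SourceComparison {

    /** Hypothetical stand-in for the custom SAX adapter: fires synthetic events. */
    static class SyntheticReader extends XMLFilterImpl {
        private final int elements;
        private final char[] text;

        SyntheticReader(int elements, int textLength) {
            this.elements = elements;
            this.text = new char[textLength];
            Arrays.fill(this.text, 'x');
        }

        @Override
        public void parse(InputSource ignored) throws SAXException {
            // XMLFilterImpl forwards these calls to the registered ContentHandler.
            AttributesImpl noAttrs = new AttributesImpl();
            startDocument();
            startElement("", "root", "root", noAttrs);
            for (int i = 0; i < elements; i++) {
                startElement("", "item", "item", noAttrs);
                characters(text, 0, text.length); // same constant array every time
                endElement("", "item", "item");
            }
            endElement("", "root", "root");
            endDocument();
        }
    }

    /** Builds the equivalent document as a DOM tree. */
    static Document buildDom(int elements, int textLength) throws Exception {
        char[] chars = new char[textLength];
        Arrays.fill(chars, 'x');
        String text = new String(chars);
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        for (int i = 0; i < elements; i++) {
            Element item = doc.createElement("item");
            item.appendChild(doc.createTextNode(text));
            root.appendChild(item);
        }
        return doc;
    }

    /** Runs an identity transform (assumption: no stylesheet) and returns the output. */
    static String transform(Source source) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(source, new StreamResult(out));
        return out.toString();
    }

    static String viaSax(int elements, int textLength) throws Exception {
        return transform(new SAXSource(new SyntheticReader(elements, textLength), new InputSource()));
    }

    static String viaDom(int elements, int textLength) throws Exception {
        // As in the thread, the DOM timing would include building the document.
        return transform(new DOMSource(buildDom(elements, textLength)));
    }

    public static void main(String[] args) throws Exception {
        int elements = 20000; // roughly the tag count mentioned for the large document
        for (int textLength : new int[] {1, 170}) {
            long t0 = System.nanoTime();
            viaSax(elements, textLength);
            long t1 = System.nanoTime();
            viaDom(elements, textLength);
            long t2 = System.nanoTime();
            System.out.printf("text=%d chars: SAX %d ms, DOM %d ms%n",
                    textLength, (t1 - t0) / 1000000, (t2 - t1) / 1000000);
        }
    }
}
```

Since the synthetic reader does almost no work per event, a harness like this mostly measures the transformer's side of the pipeline. If the 25x slowdown does not show up here, that would point back at the adapter (as Santiago suggested) rather than at Xalan.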
