Farmer, Rod (Contractor) wrote:

The only rocks I've run into with this are the ones that say it's okay to gzip XML documents for transmission, but any other form of transformation is verboten... :-)


- Dennis

Could you elaborate on this slightly? Compression is highly desirable in our 
situation, with very large documents being sent across the network.

Many thanks


Rod



Hi Rod,

You can see the performance results I've obtained with XBIS, a compact encoding for XML data, on the XBIS site (http://xbis.sourceforge.net/performance.html). XBIS eliminates much of the markup overhead from XML text, and also keeps the character data in a form that can be processed with much less overhead than parsing. Across the variety of XML documents I've used in my tests, the XBIS representation is about half the size of the equivalent text, so it's not only much faster to process but also smaller. The current XBIS implementation is strictly Java, but the encoding is language-independent and could easily be implemented in C/C++ or any other reasonable language.

gzip-style compression of text XML will give you much smaller representations of the data (about 8:1 over text, across my test documents), but at least for my experiments this comes at the cost of about doubling the processing overhead of plain text (making it about 12-18x slower than XBIS).
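
For reference, here is a minimal sketch of the gzip option using only the standard java.util.zip classes. The class name GzipXmlSketch and the generated sample document are made up for illustration; the ratio it prints depends entirely on the input and is not the 8:1 figure from my test documents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Xerces, XBIS, or JiBX.
class GzipXmlSketch {

    // Compress the UTF-8 bytes of an XML document with gzip.
    public static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverse of compress(): inflate the gzip data and decode it as UTF-8.
    public static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Build a repetitive, made-up XML document to have something to compress.
    public static String sampleDocument(int records) {
        StringBuilder sb = new StringBuilder(
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<orders>\n");
        for (int i = 0; i < records; i++) {
            sb.append("  <order id=\"").append(i).append("\"><customer>Customer ")
              .append(i % 17).append("</customer><total>").append(i * 1.25)
              .append("</total></order>\n");
        }
        return sb.append("</orders>\n").toString();
    }

    public static void main(String[] args) {
        String xml = sampleDocument(1000);
        int textSize = xml.getBytes(StandardCharsets.UTF_8).length;
        byte[] packed = compress(xml);
        System.out.println("text bytes: " + textSize);
        System.out.println("gzip bytes: " + packed.length);
        System.out.println("round trip ok: " + decompress(packed).equals(xml));
    }
}
```

Note that the receiver still has to inflate the data and then parse the XML text, which is where the extra processing overhead mentioned above comes from.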

However, XBIS is not necessarily going to help a lot if your overhead is coming mainly from converting objects to and from XML. If you're looking for a way to avoid the conversions of primitive values to and from text, something like Sun's "Fast Web Services" approach (http://java.sun.com/developer/technicalArticles/WebServices/fastWS/index.html) would probably work. This basically converts an XML schema into a binary transmission format that gets serialized from and deserialized to Java objects. The work they've done is strictly in a web services context, and AFAIK strictly Java, but there seems to be a lot of interest in this type of binary representation.
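
To show the kind of conversion cost a binary format avoids, here is a rough sketch that writes the same double values once as XML text and once in raw binary form. This is not the Fast Web Services encoding, only plain java.io data streams; the class BinaryValuesSketch and the element names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not Sun's format and not part of JiBX.
class BinaryValuesSketch {

    // Text route: every double is converted to a decimal string inside markup,
    // and the receiver has to parse each string back into a double.
    public static String toXmlText(double[] values) {
        StringBuilder sb = new StringBuilder("<values>");
        for (double v : values) {
            sb.append("<v>").append(Double.toString(v)).append("</v>");
        }
        return sb.append("</values>").toString();
    }

    // Binary route: a 4-byte count followed by the 8-byte form of each double,
    // with no text conversion on either side.
    public static byte[] toBinary(double[] values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read back what toBinary() wrote.
    public static double[] fromBinary(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] values = {1.5, -2.25, 3.141592653589793};
        System.out.println("xml text chars: " + toXmlText(values).length());
        System.out.println("binary bytes:   " + toBinary(values).length);
    }
}
```

A schema-driven format goes further than this by generating the reader and writer from the schema, but the saving comes from the same place: primitive values never pass through their text form.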

My own JiBX project (http://www.jibx.org) implements a fast data binding solution that converts between objects and normal XML. At this point I'm still using the standard Java libraries for handling floating point value conversions, though, so if that's your big problem JiBX won't help. That may change in the future if it looks like floating point conversions are a problem - I've included custom handling for ints and date/time values already, partially because the library code was too much of a bottleneck.

If you have any other questions feel free to contact me directly, since this is getting pretty far off topic from Xerces.

 - Dennis



