Much appreciated. I'll have a look at all those links.

Cheers

-----Original Message-----
From: Dennis Sosnoski [mailto:[EMAIL PROTECTED]
Sent: Friday, 9 January 2004 1:22 PM
To: [EMAIL PROTECTED]
Subject: Re: Internal Xerces structure available?


Farmer, Rod (Contractor) wrote:

>The only rocks I've run into with this are the ones that say it's okay 
>to gzip XML documents for transmission, but any other form of 
>transformation is verboten... :-)
>
>
>- Dennis
>
>Could you elaborate on this slightly? Compression is highly desirable in our 
>situation with very large documents being sent across the network.
>
>Many thanks
>
>
>Rod
>
>  
>
Hi Rod,

You can see the performance results I've obtained with XBIS, a compact 
encoding for XML data, at http://xbis.sourceforge.net/performance.html. 
XBIS eliminates much of the markup overhead from XML text, and also 
keeps the character data in a form that can be processed with much less 
overhead than parsing. Across the variety of XML documents I've used in 
my tests the XBIS representation is about half the size of the 
equivalent text, so it's not only much faster to process but also 
smaller. The current XBIS implementation is strictly Java, but the 
encoding is language-independent and could easily be implemented in 
C/C++ or any other reasonable language.

gzip-style compression of text XML will give you much smaller 
representations of the data (about 8:1 over text, across my test 
documents), but at least for my experiments this comes at the cost of 
about doubling the processing overhead of plain text (making it about 
12-18x slower than XBIS).
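
For reference, gzip-compressing XML text for transmission can be done with the
standard java.util.zip classes. The following is only a minimal sketch: the
class name GzipXmlSketch, the helper method names, and the sample document are
made up for illustration, and it does not reproduce my test setup or timings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class GzipXmlSketch {

    // Compress XML text into gzip bytes suitable for sending over the network.
    static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Reverse step on the receiving side: gzip bytes back to XML text,
    // which still has to go through a normal parser afterwards.
    static String decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a repetitive sample document (made-up data).
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<item id=\"").append(i).append("\">value</item>");
        }
        sb.append("</items>");
        String xml = sb.toString();

        byte[] packed = compress(xml);
        int rawSize = xml.getBytes(StandardCharsets.UTF_8).length;
        System.out.println(rawSize + " bytes as text, "
                + packed.length + " bytes gzipped");
    }
}
```

The compression and decompression steps are extra work on top of the usual
text parsing, which is where the added processing overhead comes from.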

However, XBIS is not necessarily going to help a lot if your overhead is 
coming mainly from converting objects to and from XML. If you're looking 
for a way to avoid the conversions of primitive values to and from text, 
something like Sun's "Fast Web Services" approach 
(http://java.sun.com/developer/technicalArticles/WebServices/fastWS/index.html) 
would probably work. This basically converts an XML schema into a binary 
transmission format that gets serialized from and deserialized to Java 
objects. The work they've done is strictly in a web services context, 
and AFAIK strictly Java, but there seems to be a lot of interest in this 
type of binary representation.
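
To illustrate the general idea of skipping the text conversions, here is a
rough sketch that writes double values directly in binary form using
java.io.DataOutputStream, next to the equivalent text form. This is not the
Fast Web Services encoding itself; the class name BinaryValuesSketch, the
method names, and the <v> element name are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper class, for illustration only.
class BinaryValuesSketch {

    // Binary form: a 4-byte count followed by 8 bytes per double,
    // with no conversion to or from decimal text.
    static byte[] encodeBinary(double[] values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        }
        return buffer.toByteArray();
    }

    // Read the values back from the binary form.
    static double[] decodeBinary(byte[] data) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        }
    }

    // Text form for comparison: each value goes through Double.toString,
    // and the receiver would need Double.parseDouble to get it back.
    static String encodeText(double[] values) {
        StringBuilder sb = new StringBuilder("<values>");
        for (double v : values) {
            sb.append("<v>").append(Double.toString(v)).append("</v>");
        }
        return sb.append("</values>").toString();
    }

    public static void main(String[] args) throws IOException {
        double[] values = {1.5, 2.25, 1.0e-3};
        System.out.println("binary bytes: " + encodeBinary(values).length);
        System.out.println("text form:    " + encodeText(values));
    }
}
```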

My own JiBX project (http://www.jibx.org) implements a fast data binding 
solution that converts between objects and normal XML. At this point I'm 
still using the standard Java libraries for handling floating point value 
conversions, though, so if that's your big problem JiBX won't help. That 
may change in the future if it looks like floating point conversions are 
a problem - I've included custom handling for ints and date/time values 
already, partially because the library code was too much of a bottleneck.

If you have any other questions feel free to contact me directly, since 
this is getting pretty far off topic from Xerces.

  - Dennis




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
