[ 
https://issues.apache.org/jira/browse/XMLRPC-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642497#action_12642497
 ] 

Balázs Póka commented on XMLRPC-153:
------------------------------------

Reading http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html over and over 
again, I think I'm beginning to get an impression of what all of this means.

First of all, a quick recap of the relevant definitions.

In section 14.11, it says: "The Content-Encoding entity-header field is used as 
a modifier to the media-type. When present, its value indicates what additional 
content codings have been applied to the entity-body, and thus what decoding 
mechanisms must be applied in order to obtain the media-type referenced by the 
Content-Type header field. Content-Encoding is primarily used to allow a 
document to be compressed without losing the identity of its underlying media 
type."
-> "Content codings are defined in section 3.5."

Section 3.5 states: "Content coding values indicate an encoding transformation 
that has been or can be applied to an entity. Content codings are primarily 
used to allow a document to be compressed or otherwise usefully transformed 
without losing the identity of its underlying media type and without loss of 
information. Frequently, the entity is stored in coded form, transmitted 
directly, and only decoded by the recipient."

Section 14.41: "The Transfer-Encoding general-header field indicates what (if 
any) type of transformation has been applied to the message body in order to 
safely transfer it between the sender and the recipient. This differs from the 
content-coding in that the transfer-coding is a property of the message, not of 
the entity."
-> "Transfer-codings are defined in section 3.6."

Section 3.6 states: " Transfer-coding values are used to indicate an encoding 
transformation that has been, can be, or may need to be applied to an 
entity-body in order to ensure "safe transport" through the network. This 
differs from a content coding in that the transfer-coding is a property of the 
message, not of the original entity."

It is declared that the set of supported Content-codings ("identity", "gzip", 
"compress", "deflate") is actually a subset of available Transfer-codings 
("chunked", "identity", "gzip", "compress", "deflate") and that 
"Transfer-codings are analogous to the Content-Transfer-Encoding values of 
MIME..."

Let's consider an example where both fields are relevant. Say we have some URI 
whose content a browser is able to display directly. The browser decides 
whether it can do that based on the MIME type. That could be an image, a Flash 
application, a simple HTML or XML document, whatever. Suppose this document is 
compressible, so it makes sense to compress it. Using the vocabulary of the RFC, 
we modify its media type, which is specified to be "text/xml", using the gzip 
content-coding. Now the entity the browser downloads from the URI has the 
property of having been filtered through gzip. But underneath that it's still 
"text/xml", so the browser can still use it after applying the reverse 
transformation. Suppose now there is a proxy between the server and the 
browser. Some proxies don't handle missing Content-Length headers too well. :) 
In that case, if it's HTTP/1.1 compatible, the "chunked" transfer-coding may be 
used.
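
To make the example concrete, here is a minimal Python sketch of a 
content-coding round trip. It is my own illustration, not ws-xmlrpc code: the 
helper names (encode_entity, decode_entity) and the sample document are made 
up. The point is only that Content-Type keeps naming the underlying media type 
while Content-Encoding tells the recipient which decoding to apply first.

```python
import gzip

def encode_entity(xml_text):
    """Apply the gzip content-coding to a text/xml entity (hypothetical helper)."""
    body = gzip.compress(xml_text.encode("utf-8"))
    headers = {
        "Content-Type": "text/xml",   # the underlying media type is unchanged
        "Content-Encoding": "gzip",   # modifier: how to get back to text/xml
    }
    return headers, body

def decode_entity(headers, body):
    """Undo the content-coding named in the headers to recover the entity."""
    if headers.get("Content-Encoding", "identity") == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")

# Made-up, highly compressible sample document.
original = "<?xml version='1.0'?><doc>" + "<item/>" * 200 + "</doc>"
headers, body = encode_entity(original)
recovered = decode_entity(headers, body)
```

After decoding, the recipient holds exactly the same "text/xml" document the 
sender started with; only the bytes in between were different.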

There is a section on message length (4.4), which helps to understand that 
content-encoding is done in a different layer than transfer-encoding, and is 
unrelated to it.

" The transfer-length of a message is the length of the message-body as it 
appears in the message; that is, after any transfer-codings have been applied. 
When a message-body is included with a message, the transfer-length of that 
body is determined by one of the following (in order of precedence):
...
2. If a Transfer-Encoding header field (section 14.41) is present and has any 
value other than "identity", then the transfer-length is defined by use of the 
"chunked" transfer-coding (section 3.6), unless the message is terminated by 
closing the connection. [Meaning that a chunked transfer-encoding implicitly 
specifies the total message length, which is irrelevant here since we don't use 
chunked transfers.]

3. If a Content-Length header field (section 14.13) is present, its decimal 
value in OCTETs represents both the entity-length and the transfer-length. The 
Content-Length header field MUST NOT be sent if these two lengths are different 
(i.e., if a Transfer-Encoding header field is present). If a message is 
received with both a Transfer-Encoding header field and a Content-Length header 
field, the latter MUST be ignored. [So the Content-Length field should EXACTLY 
match the number of bytes transferred. Since there is no Transfer-Encoding 
header (no chunked or anything), this is the key piece of information.]
...
5. By the server closing the connection. (Closing the connection cannot be used 
to indicate the end of a request body, since that would leave no possibility 
for the server to send back a response.) [Very important since this is what 
causes the problem in the first place.]
"
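
As a rough illustration of the precedence quoted above, here is a small Python 
sketch. It covers only rules 2, 3 and 5 (the ones I quoted), it assumes 
headers arrive as a plain dict with exactly these key spellings, and the 
function name body_framing and the sample payload are hypothetical. It also 
glosses over the "unless the message is terminated by closing the connection" 
exception in rule 2.

```python
import gzip

def body_framing(headers):
    """Return how the message-body length is determined, following only the
    rules quoted above (rules 1 and 4 of section 4.4 are not covered)."""
    te = headers.get("Transfer-Encoding", "identity").strip().lower()
    if te != "identity":
        # Rule 2: chunked framing wins; any Content-Length must be ignored.
        return ("chunked", None)
    if "Content-Length" in headers:
        # Rule 3: the value is the exact number of octets on the wire.
        return ("content-length", int(headers["Content-Length"]))
    # Rule 5: read until the server closes the connection (responses only).
    return ("until-close", None)

# Made-up request body, gzip-compressed before sending.
payload = b"<methodCall>" + b"<param/>" * 100 + b"</methodCall>"
compressed = gzip.compress(payload)

# With Content-Encoding: gzip and no Transfer-Encoding, Content-Length has to
# count the compressed octets actually sent, not the uncompressed payload.
request_headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),
}
```

Note that Content-Encoding plays no part in the decision at all, which is the 
layering point: framing only looks at Transfer-Encoding and Content-Length.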

So, I think that the following two examples of headers are functionally 
equivalent, but only the first is supported by HTTP/1.0:
1)
Content-Encoding: gzip
Content-Length: 1234

[Content-Length is the exact number of bytes sent over the wire.]

2)
Transfer-Encoding: gzip, chunked

[chunked is mandatory if there is a Transfer-Encoding header (unless the 
message is terminated by closing the connection), and no Content-Length is 
needed since the size of the message can be calculated from the chunked 
transfer-coding. This is why a Content-Length sent alongside a 
Transfer-Encoding header must be ignored.]
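
Here is a hedged Python sketch of why I consider the two forms equivalent. 
The helpers to_chunked and from_chunked are hypothetical and deliberately 
simple (no trailers, no chunk extensions), and the entity is made up. In the 
header value the transfer-codings are listed separated by commas, in the order 
they were applied.

```python
import gzip

def to_chunked(data, chunk_size=64):
    """Frame data with the "chunked" transfer-coding (no trailers, no extensions)."""
    out = bytearray()
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    out += b"0\r\n\r\n"   # last-chunk plus the empty trailer
    return bytes(out)

def from_chunked(wire):
    """Decode a chunked body produced by to_chunked."""
    data, pos = bytearray(), 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol], 16)   # chunk size is hexadecimal
        if size == 0:
            return bytes(data)
        start = eol + 2
        data += wire[start:start + size]
        pos = start + size + 2          # skip the CRLF after the chunk data

entity = b"<doc>" + b"<item/>" * 200 + b"</doc>"
gz = gzip.compress(entity)

# Form 1: Content-Encoding: gzip with an exact Content-Length.
form1_headers = {"Content-Encoding": "gzip", "Content-Length": str(len(gz))}
form1_body = gz

# Form 2: Transfer-Encoding: gzip, chunked with no Content-Length.
form2_headers = {"Transfer-Encoding": "gzip, chunked"}
form2_body = to_chunked(gz)
```

In both cases the recipient ends up with the same entity: form 1 is gunzipped 
directly, form 2 is de-chunked first and then gunzipped. The bytes on the wire 
differ, and only form 1 needs the length to be known up front.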

Thanks for reading through this. Hope this helps. :)

> content-length header incorrect when using gzip
> -----------------------------------------------
>
>                 Key: XMLRPC-153
>                 URL: https://issues.apache.org/jira/browse/XMLRPC-153
>             Project: XML-RPC
>          Issue Type: Bug
>    Affects Versions: 3.0, 3.1
>         Environment: UNIX (FC3), Sun JDK1.5.0_10
>            Reporter: Andy Meyer
>         Attachments: patch.txt
>
>
> When doing some testing using the ws-xmlrpc client libraries I ran across a 
> bug in its calculation of the content-length HTTP header when using gzip 
> compression but not HTTP chunked transfer. The client incorrectly sets the 
> content-length to the length of the uncompressed data, rather than the 
> compressed data it sends. This happens using both 3.0 and 3.1 client 
> libraries.
> I see some activity on ws-xmlrpc-dev from September 2007 but no mention of 
> any resolution. I did a quick bug search and found nothing - my apologies if 
> this is already being tracked somewhere else and I missed it.
> From the mail thread, a link to the relevant part of the HTTP spec:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
