Re: i28 proposed replacement text from Adrien de Croy on 2008-05-12 (ietf-http-wg@w3.org from April to June 2008)

From: Adrien de Croy <adrien@qbik.com>
Date: Tue, 13 May 2008 09:59:32 +1200
To: Henrik Nordstrom <henrik@henriknordstrom.net>, ietf-http-wg@w3.org
Message-ID: <4828BDC4.1000103@qbik.com>
Transfer-Encoding and Content-Encoding are fundamentally different.  It
helps to look at them from the point of view of who does the encoding.

Transfer-Encoding is performed on the fly by something in the stream
(e.g. a proxy or an output conversion process).  In such cases it's
often impossible to know the length of the whole transformed entity in
advance, because the output length of the encoding isn't known until
the encoding has finished.  You may have the Content-Length (an entity
attribute) as input to the transformation process, but may not be able
to calculate the length after encoding on the fly.  Therefore the only
way to signal length is to use chunking or close the connection.
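
To illustrate the point above, here is a minimal sketch (not taken from
any real proxy) of a gzip transfer-coding applied on the fly and framed
with chunking.  The function names gzip_chunked and dechunk are
hypothetical, and the sketch assumes no chunk extensions or trailers.
The total encoded length is never known up front; only each chunk's
size is.

```python
import zlib

def gzip_chunked(blocks):
    """Yield chunked-coding frames of a gzip stream built on the fly."""
    # wbits=31 asks zlib for the gzip container (header and trailer).
    compressor = zlib.compressobj(wbits=31)
    for block in blocks:
        data = compressor.compress(block)
        if data:  # the compressor may buffer input and emit nothing yet
            yield b"%x\r\n%s\r\n" % (len(data), data)
    tail = compressor.flush()
    if tail:
        yield b"%x\r\n%s\r\n" % (len(tail), tail)
    # The zero-size last-chunk is the only end-of-body marker available.
    yield b"0\r\n\r\n"

def dechunk(wire):
    """Reassemble the body from chunked frames (no trailers/extensions)."""
    body, pos = b"", 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol], 16)
        if size == 0:
            return body
        start = eol + 2
        body += wire[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk
```

A receiver that sent "TE: gzip" would undo the two layers in reverse
order: dechunk first, then gunzip.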

Content-Encoding is different because the sender should know the
encoded length and can therefore set a Content-Length header.  The
encoded form is deemed a separate entity, one attribute of which is its
encoding, but as far as HTTP is concerned it may as well not be
encoded.  The encoding is meant for the end consumer of the message.
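
By contrast, a sender applying a content-coding can encode the whole
entity first and then describe it.  The sketch below assumes the entity
fits in memory; the function name encoded_entity is hypothetical.
Content-Length here is the length of the encoded bytes, i.e. of the
entity actually sent.

```python
import gzip

def encoded_entity(body, content_type="text/plain"):
    """Return (headers, payload) for a gzip content-coded entity."""
    payload = gzip.compress(body)
    headers = {
        "Content-Type": content_type,
        "Content-Encoding": "gzip",
        # Known before the first byte is sent, so no chunking is needed.
        "Content-Length": str(len(payload)),
    }
    return headers, payload
```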

Sure the underlying technology is similar, but the message semantics are 
completely different.

That's also why Content-Length is banned with Transfer-Encoding:
Content-Length is an entity attribute, just like Content-Type and
Content-Encoding.
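
On the receiving side this shows up in how the end of a response body
is found.  The following is a simplified sketch of that decision,
assuming a response that is allowed to carry a body (it ignores HEAD,
1xx, 204 and 304, and multipart/byteranges); body_framing is a
hypothetical name.

```python
def body_framing(headers):
    """Decide how the end of a response body is delimited."""
    h = {name.lower(): value for name, value in headers.items()}
    codings = [c.strip().lower()
               for c in h.get("transfer-encoding", "").split(",")
               if c.strip()]
    if codings:
        # Any Content-Length is ignored once a transfer-coding is present.
        if codings[-1] == "chunked":
            return ("chunked", None)
        return ("close", None)
    if "content-length" in h:
        return ("length", int(h["content-length"]))
    return ("close", None)
```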

Adrien

Jamie Lokier wrote:
> Henrik Nordstrom wrote:
>   
>> On Mon, 2008-05-12 at 18:52 +0100, Jamie Lokier wrote:
>>
>>     
>>> Doing that is not a HTTP proxy per spec, but it is done nonetheless
>>> in some configurations, and it is useful.
>>>       
>> And breaking the evolution of HTTP quite noticeably. Try deploying,
>> for example, a WebDAV client behind such a transforming proxy, or a
>> client fetching ranges.
>>     
>
> If a WebDAV client says "Accept-Encoding: gzip" it will probably get
> similar issues even with no proxy.
>
> Many generic HTTP servers act as transformative "pseudo-proxies" to
> their backend content - consider Apache with mod_gzip for example.
> Therefore, WebDAV clients for general purpose use should not say
> "Accept-Encoding: gzip" unless they handle the consequences, which
> typically means transparently decompressing what's received.  They
> don't have to, but user expectations won't be met when connecting to
> some servers, and editing may fail.
>
> Range requests: if the proxy is written properly it can work.
>
> HTTP evolution: Proxies for general HTTP use, such as at ISPs and
> gateways, should not be configured that way.
>
> Only proxies for specific applications would enable transformations
> like that (we hope).  An example is Apache with mod_gzip+mod_proxy
> acting as a reverse proxy in a server farm (I don't know if that
> really works).  That is why I call it a configuration issue.
>
>   
>>>>> So that makes compression independent of transfer encoding.
>>>>>           
>>>> ?
>>>>         
>>> In practice.
>>>       
>> I disagree. There is a lot more to HTTP than plain browsing, and these
>> proxies bending HTTP often do so without knowing HTTP or the bad
>> effects they cause, and the ones deploying it often consider HTTP
>> "browsing only, nothing critical if it gets a bit messed up as long as
>> browsing to the major sites works".
>>     
>
> This is more like "as long as using major browsers (site irrelevant)
> works, or as long as using a client intended to generally work with
> sites found on the net (because mod_gzip is popular enough that even
> non-browser clients must work with it, or not use Accept-Encoding)."
> It is indeed dirty, but not as specifically dirty as you make out.
>
> It's also not common to do this in proxies, so don't worry about it.
> What is common is automatic compression a la mod_gzip, in what is
> technically not a HTTP proxy, but is still a generic relay between
> HTTP client and HTTP services, and similar non-transparency issues do
> apply there.
>
> Besides, I bet a HTTP proxy which opportunistically applies
> "Transfer-Encoding: gzip" encoding when permitted, and adds "TE: gzip"
> to requests, removing the encoding from forwarded responses, will cause
> problems too - maybe even bigger ones - even though it's fully
> compliant and transparent according to spec.
>
>   
>>>> Which is partly why specs clearly say that if Transfer-Encoding is used
>>>> then Content-Length MUST be ignored, with the small exception for the
>>>> now removed case of "Transfer-Encoding: identity".
>>>>         
>>> I was meaning Content-Length in conjunction with Content-Encoding, not
>>> Transfer-Encoding.
>>>       
>> And where is the confusion there?
>>
>> Content-Length with Content-Encoding is the message length, nothing
>> else. Anyone getting this wrong is seriously flawed.
>>
>> Content-Encoding is a property of the resource returned, not of how it's
>> transferred.  Content-Encoding does NOT change the message format, only
>> the resource transferred. To the protocol it is very similar to
>> Content-Language or Content-Type, but on a different axis.
>>     
>
> I know.  But spec isn't everything.
>
> The serious flaw is deployed.  I'm not surprised - it's a predictable
> mistake given how HTTP systems are architected.  When writing code you
> can't ignore the installed base of buggy agents if you want to
> interoperate.  But as I've implied, that particular bug is found (as
> far as I know) only in old agents which are dwindling in presence, so
> you might choose to ignore it now, depending on how much you care
> about reaching those remaining.
>
> -- Jamie
>
>   

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Monday, 12 May 2008 21:58:40 UTC