Re: Content encoding problem... from Henrik Frystyk Nielsen on 1997-02-19 (ietf-http-wg@w3.org from January to March 1997)

From: Henrik Frystyk Nielsen <frystyk@w3.org>
Date: Wed, 19 Feb 1997 11:05:12 -0500
To: "Roy T. Fielding" <fielding@kiwi.ICS.UCI.EDU>, jg@zorch.w3.org
Cc: http-wg@cuckoo.hpl.hp.com
Message-Id: <3.0.1.32.19970219110512.009e0c40@pop.w3.org>
At 08:52 PM 2/14/97 -0800, Roy T. Fielding wrote:

Sorry for the delay in this thread.

>When I first started testing HTTP/1.0 clients, almost all of them understood
>Content-Encoding.  Are you saying that they have digressed?  Are you sure
>that the tests were not faulty (i.e., was the server output checked to
>be sure that it was actually sending the correct content-type and
>content-encoding headers)?  Or do the failures only apply when "deflate"
>is used as the Content-Encoding?  Note that most current clients will
>only accept "x-gzip" and "x-compress", if anything.

This is not entirely correct. The only reason the clients that I have
tested look like they understand content encoding is that it has always
been accompanied by an "application/octet-stream" content type, which is
handled by dumping it to disk. If the content type is "text/html", which is
the situation in our tests, then most clients pass the encoded body right
through.
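
To make the expected client behavior concrete, here is a minimal sketch of
removing the content encoding before the body reaches the media-type
handler. The function name decode_body and the lower-cased header
dictionary are assumptions made for illustration only; this is not code
from any client that was tested.

```python
import gzip
import zlib


def decode_body(headers, body):
    """Remove the Content-Encoding so the media-type handler sees the entity.

    `headers` is assumed to be a dict with lower-cased field names.
    """
    encoding = headers.get("content-encoding", "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumes a zlib-wrapped deflate stream.
        return zlib.decompress(body)
    if encoding == "":
        return body
    raise ValueError("unsupported content-encoding: " + encoding)
```

The point is that decoding depends only on Content-Encoding; the content
type ("text/html" or anything else) is looked at afterwards.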

>Henrik suggested:
>>What if we said that:
>>
>>"HTTP/1.1 servers or proxies MUST not send any content-encodings other than
>>"gzip" and "compress" to a HTTP/1.0 client unless the client explicitly
>>accepts it using an "Accept-Encoding" header."
>
>No.  Content-Encoding is a property of the resource (i.e., only the origin
>server is capable of adding or removing it on the server-side, and only
>the user agent is capable of removing it on the client-side).  The protocol
>should not dictate the nature of a resource and under what conditions the
>server can send an otherwise valid HTTP entity.  The protocol must remain
>independent of the payload.

But it does already - it is part (and has been for a long time) of the
content negotiation algorithm, just like content-type and content-language.
There is nothing wrong with this.

The reason this problem has occurred is that the wording in the spec
reflected HTTP/1.0 server behavior, which is to send a "content-encoding:
x-gzip" or whatever without the client asking for it. Instead, I ask for it
to be updated so that it takes into account "deflate", which is already
mentioned in the spec.
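
As a rough sketch of the rule I suggested (quoted above), a server or proxy
answering an HTTP/1.0 client could check the content coding like this. The
function name is hypothetical, and the Accept-Encoding parsing is
deliberately simple (comma-separated tokens, any parameters ignored).

```python
# Codings the suggested wording allows without the client asking for them.
ALWAYS_ALLOWED = {"gzip", "compress"}


def allowed_for_http10_client(coding, accept_encoding=None):
    """Return True if `coding` may be sent to an HTTP/1.0 client.

    `accept_encoding` is the raw Accept-Encoding field value, or None if
    the client did not send one.
    """
    coding = coding.strip().lower()
    if coding in ALWAYS_ALLOWED:
        return True
    if not accept_encoding:
        return False
    accepted = {
        item.split(";")[0].strip().lower()
        for item in accept_encoding.split(",")
    }
    return coding in accepted
```

With this check, "deflate" only goes to an HTTP/1.0 client that listed it
in Accept-Encoding.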

>Transfer-Encoding, on the other hand, represents HTTP-level encodings.
>If we want to support HTTP-level compression, it must be done at that
>level.

No, as an origin server I want to compress the data once and for all, and
then compute my hash and other things that depend on the hash, like
signatures, PICS labels, etc. I can see special cases where compression at
the transfer level is convenient - for example, allowing proxies to
compress on the fly - but the first usage is no less important.
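
A minimal sketch of what I mean by "once and for all": the origin server
compresses the entity a single time and computes the hash over the
compressed bytes, so that anything derived from the hash stays valid for
every response. The name prepare_entity and the choice of SHA-256 are
assumptions for illustration; any digest could be substituted.

```python
import hashlib
import zlib


def prepare_entity(raw):
    """Compress once and hash the compressed bytes.

    Returns (body, digest): `body` is what is sent with
    "Content-Encoding: deflate", and `digest` is what signatures or
    labels would be computed from.
    """
    body = zlib.compress(raw, 9)
    digest = hashlib.sha256(body).hexdigest()
    return body, digest
```

If the compression instead happened at the transfer level on each hop, the
hash would have to be computed over the uncompressed data or recomputed
whenever the encoding changed.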

>However, I would rather see work being done on HTTP/2.x, wherein
>we could define a tokenized message format which is more efficient than
>just body compression and would result in no worse incompatibilities with
>existing software than adding body compression.

By using pipelining we have reached the upper limit for how fast we can get
the data through on a PPP link. I have a nice example of some xplots on a
PPP link showing this - it's available at

	http://www.w3.org/pub/WWW/Protocols/HTTP/Performance/PipeExamples.html

The only way to optimize it is to send less data. A quick example shows the
significance of compression: of the 180K that is on our microscape
homepage (42K HTML and 125K GIF), we save approx 30K, or 1/6, by compressing
_ONLY_ the HTML. This is when using zlib out of the box. If we optimize
the compression using an HTML-adjusted dictionary, I am sure that this can
get even better. Also, when using stylesheets instead of many of the GIFs,
this becomes even more obvious.
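
To illustrate the dictionary idea, here is a sketch using zlib's preset
dictionary support. The dictionary contents below are made up for the
example and are not the dictionary we would actually use; both ends must
agree on the same dictionary for this to work.

```python
import zlib

# Hypothetical preset dictionary of strings that are common in HTML.
HTML_DICTIONARY = (
    b'<html><head><title></title></head><body></body></html>'
    b'<table><tr><td></td></tr></table><p><br><h1></h1>'
    b'<img src="<a href="http://www.'
)


def deflate(data, zdict=None):
    """Compress `data` with zlib, optionally primed with a dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(data, zdict=None):
    """Decompress `data`; `zdict` must match the one used by deflate()."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(data) + decomp.flush()
```

The dictionary should matter most for small pages, where plain zlib has
little earlier text to refer back to.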

Henrik
--
Henrik Frystyk Nielsen, <frystyk@w3.org>
World Wide Web Consortium, MIT/LCS NE43-346
545 Technology Square, Cambridge MA 02139, USA
Received on Wednesday, 19 February 1997 08:08:28 UTC