ISSUE CONTENT-ENCODING: Proposed wording

These are the suggested changes regarding the problems in content-encoding
and accept-encoding as described in

	http://www.w3.org/Protocols/HTTP/Issues/#CONTENT-ENCODING

>Here it is, along with other related changes
>
>(1) in section 3.5 (Content Codings), add this after the item
>    for "deflate"
>
>        identity        The default (identity) encoding; the use
>                        of no transformation whatsoever.  This
>                        content-coding is used only in the
>                        Accept-encoding header, and SHOULD NOT
>                        be used in Content-coding header.

The word "identity" would never have to go over the wire, right? It is
always implicit, so an "Accept-Encoding:" header with no tokens means the
same thing.

>        An HTTP/1.1 client or server MAY support any of these
>        content-codings, but SHOULD NOT assume (without explicit
>        evidence) that any other client or server supports any
>        content-coding besides "identity".
>
>(2) Replace section 14.3 (Accept-Encoding) entirely with
>   The Accept-Encoding request-header field restricts the
>   set of content-codings (defined in section 3.5) that are
>   acceptable for the response.  
>
>          Accept-Encoding  = "Accept-Encoding" ":"
>                                    #( content-coding )
>
>   An example of its use is
>
>          Accept-Encoding: compress, gzip
>
>   The content-coding(s) of a response MUST be indicated by a
>   Content-Encoding response-header (section 14.12), unless the
>   "identity" content-coding is the only one used.

The Content-Encoding header is currently listed as an entity header, not
a response header. This makes sense, as a client can upload as well as
download a document using "deflate" or "delta".

>   The meaning of Accept-Encoding is:
>
>	(1) If there is only one content-coding available at the
>	server, or if the server's "normal" content-coding is not
>	"identity", then the server SHOULD send that content-coding,
>	regardless of the presence or absence of an Accept-Encoding
>	header in the request.

I don't like the notion of the "server" having a content-coding.
Content-codings are properties of the resource and not the server. Transfer
codings are properties of the server.

>	(2) If there are multiple content-codings available at the
>	server, and the client does not specify which one it prefers
>	(i.e., the request does not include "Accept-Encoding"), then
>	the server SHOULD send the least-encoded content-coding
>	available.  In particular, if the "identity" content-coding is
>	available, it SHOULD be used.

I don't think this makes sense - what is the "least" encoding of compress
and gzip? (See end of mail for a proposed different wording.)

>	(3) If there are multiple content-codings available at the
>	server, and the request includes an Accept-Encoding header that
>	specifies at least one of these content-codings, then the
>	server SHOULD send the "best" of the matching
>	content-coding(s).
>
>	(4) If there are multiple content-codings available at the
>	server, and the request includes an Accept-Encoding header but
>	there is no intersection between the available and requested
>	content-coding sets, then the server SHOULD return an error
>	response with the 406 (Not Acceptable) status code.
>
>    In cases 2-4 the response SHOULD include a Vary header (section
>    14.43) that includes "Accept-Encoding" in its field-value, and MUST
>    include such a Vary header if the response includes a
>    Content-Coding header.

According to (1), if a document exists in one encoding, identity for
example, then a request like this would return the document:

	GET /no-encoding HTTP/1.1
	Host: some.host
	Accept-Encoding: deflate

but if there are two encodings, gzip and compress, say, then according to
(4) the same request would cause the server to return 406.

Why should it matter whether the resource is available in multiple
encodings or not? It affects the behavior of the server in ways that the
client cannot predict.
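
To make the asymmetry concrete, here is a minimal Python sketch of the
quoted rules (1)-(4) as I read them. The function name select_quoted and
its arguments are hypothetical, the order of the "available" list stands
in for the undefined notions of "best" and "least-encoded", and the
"normal content-coding" clause of rule (1) is left out.

```python
def select_quoted(available, accept_encoding=None):
    """Pick a content-coding under the quoted rules (1)-(4).

    available: content-codings the resource exists in, in the server's
        order of preference (an assumption standing in for "best").
    accept_encoding: list of codings from the request header, or None
        if the header is absent.
    Returns the chosen coding, or 406 for Not Acceptable.
    """
    if len(available) == 1:
        # Rule (1): only one coding exists, so the header is ignored.
        return available[0]
    if accept_encoding is None:
        # Rule (2): prefer identity; otherwise "least-encoded" is
        # undefined, so this sketch falls back to the first entry.
        return "identity" if "identity" in available else available[0]
    matches = [c for c in available if c in accept_encoding]
    if matches:
        # Rule (3): send the "best" matching coding.
        return matches[0]
    # Rule (4): no intersection between available and requested sets.
    return 406


# The same request, two different outcomes:
print(select_quoted(["identity"], ["deflate"]))          # identity
print(select_quoted(["gzip", "compress"], ["deflate"]))  # 406
```

The client sends the same header in both calls; only the number of
encodings stored at the server changes the result.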

It also handicaps my smart robot, which can handle deflate as well as
"save-as" for other unknown encodings. That is, I want the same
functionality as Roy's lwpget but with the addition that I prefer deflate.

This can be done if we introduce a wildcard in the Accept-Encoding, so that
I instead would say

	Accept-Encoding: deflate, *

There is really no difference between content-types and content-encodings
except that old clients fail to understand the latter. However, this can be
fixed by adding your wording for section 14.12 which should probably be
moved to a compatibility appendix.
	
>(3) Modify section 14.12 (Content-Encoding), by adding (to the end
>of the section)
>
>    See section 14.3 for more requirements concerning the use
>    of Content-Encoding.
>
>    When an HTTP/1.1 server or proxy sends a response with
>    a Content-Encoding header, [[and that header specifies
>    a content-coding other than "gzip" or "compress",]]
>    and the corresponding request was either
>	(1) received from a client whose version is lower than
>	HTTP/1.1,
>    or
>	(2) received with a Via header indicating that it was forwarded
>	by a proxy whose version is lower than HTTP/1.1,
>    and the response does not already include an Expires header,
>    then the sender SHOULD include an Expires header whose
>    field-value is identical to the field-value of its Date
>    header.  (This prevents improper caching of encoded responses
>    by HTTP/1.0 proxies.)
>
>(4) Modify section 13.5.2 (Non-modifiable Headers) to
>replace this:
>
>   A cache or non-caching proxy MUST NOT modify any of the following
>   fields in a request or response, nor may it add any of these fields
>   if not already present:
>
>     o  Content-Location
>     o  ETag
>     o  Expires
>     o  Last-Modified
>
>with this:
>
>   A cache or non-caching proxy MUST NOT modify any of the following
>   fields in a request or response, nor may it add any of these fields
>   if not already present:
>
>     o  Content-Location
>     o  ETag
>     o  Last-Modified
>
>   A cache or non-caching proxy MUST NOT modify the following
>   field in a response
>     o  Expires
>   but it MAY add it if not present; if so, it MUST be given
>   a field-value identical to the field-value of the Date header
>   in that response.
>
>I.e., we allow a proxy to add Expires if it has the effect of
>making a response non-cachable.
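
For reference, the Expires rule in (3) and (4) can be sketched roughly as
follows. This is only an illustration under assumptions: the function
names are hypothetical, HTTP versions are modelled as (major, minor)
tuples already parsed from the request line and the Via header, header
names are assumed to be stored in canonical case, a Date header is
assumed to be present, and the optional [[...]] clause about "gzip" and
"compress" is ignored.

```python
def needs_expires(request_version, via_versions, response_headers):
    """Return True if the sender SHOULD add Expires per proposal (3).

    request_version: (major, minor) of the client's request.
    via_versions: list of (major, minor), one per proxy named in Via.
    response_headers: dict of header name to field-value.
    """
    if "Content-Encoding" not in response_headers:
        return False  # the rule only covers encoded responses
    if "Expires" in response_headers:
        return False  # never modify an existing Expires header
    old_client = request_version < (1, 1)
    old_proxy = any(version < (1, 1) for version in via_versions)
    return old_client or old_proxy


def add_expires(request_version, via_versions, response_headers):
    """Return a copy of the headers with Expires set equal to Date
    when the rule applies; the input dict is left unchanged."""
    headers = dict(response_headers)
    if needs_expires(request_version, via_versions, headers):
        headers["Expires"] = headers["Date"]
    return headers
```

A response to an HTTP/1.0 client that carries "Content-Encoding: deflate"
and no Expires header would come back with Expires equal to its Date.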

14.3 Accept-Encoding 

The Accept-Encoding request-header field restricts the set of
content-codings (defined in section 3.5) that are acceptable for the
response.  

       Accept-Encoding  = "Accept-Encoding" ":" 
                                 #( codings )
       codings          = ( content-coding | "*" )

An example of its use is

       Accept-Encoding: compress, gzip

The content-coding(s) of a response MUST be indicated by a Content-Encoding
entity-header (section 14.12), unless the "identity" content-coding is
the only one used.

If no Accept-Encoding field is present in a request, the server MAY assume
that the client will accept any content-coding. [However, if the available
representations use content-codings including the "identity"
content-coding, and the client does not specify an "Accept-Encoding" field,
then the server SHOULD use the "identity" content-coding; if all available
representations use a non-identity content-coding, then preference should
be given to those content-coding(s) commonly understood by older user
agents, or known to be understood by the particular user agent that
initiated the request.]

If an Accept-Encoding header is present, and if the server cannot send a
response which is acceptable according to the Accept-Encoding header, then
the server SHOULD send an error response with the 406 (Not Acceptable)
status code.

The special content-coding "*", if present in the Accept-Encoding field,
matches every content-coding not matched by any other content-coding
present in the Accept-Encoding field.

An empty Accept-Encoding value indicates that only the "identity"
content-coding is acceptable.
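
As a rough illustration of the wording above (not normative), the
selection could be sketched in Python as follows. The function names are
hypothetical. Two things are assumptions that the text does not state:
explicitly listed codings are tried in the order the client wrote them,
and the order of the "available" list expresses the server's own
preference among codings matched only by "*".

```python
def parse_accept_encoding(value):
    """Split a field-value into lower-case coding tokens.
    An empty value yields an empty list."""
    return [t.strip().lower() for t in value.split(",") if t.strip()]


def select_proposed(available, header_value=None):
    """Pick a content-coding under the proposed section 14.3.

    available: codings of the stored representations, in the server's
        order of preference (assumption).
    header_value: raw Accept-Encoding field-value, or None if the
        field is absent from the request.
    Returns the chosen coding, or 406 for Not Acceptable.
    """
    if header_value is None:
        # No field: any coding MAY be sent, identity SHOULD be preferred.
        return "identity" if "identity" in available else available[0]
    tokens = parse_accept_encoding(header_value)
    if not tokens:
        tokens = ["identity"]  # empty value: only identity is acceptable
    explicit = [t for t in tokens if t != "*"]
    for coding in explicit:
        if coding in available:
            return coding
    if "*" in tokens:
        # "*" matches every coding not matched by another token.
        rest = [c for c in available if c not in explicit]
        if rest:
            return rest[0]
    return 406


print(select_proposed(["gzip", "compress"], "deflate, *"))  # gzip
print(select_proposed(["gzip", "compress"], "deflate"))     # 406
print(select_proposed(["identity", "gzip"], ""))            # identity
```

With the wildcard, the robot that prefers deflate still gets a response
it can save, regardless of how many encodings the resource is stored in.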

***

The text in [] can be moved to a compatibility appendix, which probably
makes more sense.

Comments?

Henrik
--
Henrik Frystyk Nielsen, <frystyk@w3.org>
World Wide Web Consortium
http://www.w3.org/People/Frystyk

Received on Thursday, 3 July 1997 12:20:53 UTC