Accept-* header structure (was: Re: Color content negotiation)

Roy Fielding:
>>Let me suggest again "accept-parameter: <parameter><rel><value>"
>>where <parameter> is a parameter to a number of media types (e.g.,
>>charset), <rel> is a relation operator (e.g., =), and <value> is a
>>value specifier consistent with <rel>.
[...]
>And let me point out again the two flaws with this proposal.
>
>  1) There are no common media type parameters other than
>     charset and boundary (and the latter is not negotiable). 
>
>  2) When media types do use the same name for a parameter, they
>     may have entirely different meaning or have incompatible domains.
>     One obvious example is the "level" parameter.

Oh, I think I get it.  What you are saying is that

 Accept: text/html, image/gif, */*;q=.5
 Accept-Charset: iso-8859-5

is a shorthand for

 Accept: text/html;charset=iso-8859-5, image/gif;charset=iso-8859-5,
         */*;charset=iso-8859-5;q=.5

but that

 Accept: text/html, image/gif, */*;q=.5
 Accept-Language: en

is _NOT_ short for

 Accept: text/html;language=en, image/gif;language=en, 
         */*;q=.5;language=en

because you are not allowed to interpret `language' as a MIME type
parameter, you must see it as a variant property completely separate
from the MIME type of the variant.
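
To make the shorthand reading concrete, here is a minimal Python sketch of the expansion as I understand it. The function name `expand_accept_charset` is made up for illustration, and the sketch assumes a single charset value and header values without quoted commas or semicolons; it is not taken from any spec text.

```python
def expand_accept_charset(accept: str, charset: str) -> str:
    """Fold one Accept-Charset value into every media range of an Accept value."""
    expanded = []
    for media_range in accept.split(","):
        # The first item is the type/subtype; the rest are parameters such as q.
        media_type, *params = [part.strip() for part in media_range.split(";")]
        # Place charset directly after the media type, ahead of any q value.
        expanded.append(";".join([media_type, f"charset={charset}", *params]))
    return ", ".join(expanded)


print(expand_accept_charset("text/html, image/gif, */*;q=.5", "iso-8859-5"))
```

Under this reading no comparable expansion exists for Accept-Language, since `language' is not a MIME type parameter.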

>Therefore, you cannot negotiate on a parameter other than charset
>without first tying it to the media type which defines it.

This is true for parameters to the _MIME type_ of a variant (other than
charset), but not for parameters (properties) that apply directly to
the variant rather than to the MIME type of the variant.

A variant has a
  - MIME type (which can have parameters)
  - language
  - encoding
  - possibly more properties (like color) that cannot currently be
    negotiated on

We can negotiate on language, encoding, and new things like color just
fine, as long as we do not confuse these variant properties with the
parameters of the variant's MIME type.
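
The distinction can be sketched as a small Python data structure. This is only an illustration under my own assumptions: the class name `Variant`, its field names, the helper `filter_by_language`, and the sample `color` value are all hypothetical and appear in no spec.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Variant:
    mime_type: str                                              # e.g. "text/html"
    mime_params: Dict[str, str] = field(default_factory=dict)   # parameters OF the MIME type
    language: Optional[str] = None                              # property of the variant itself
    encoding: Optional[str] = None                              # property of the variant itself
    properties: Dict[str, str] = field(default_factory=dict)    # further properties, e.g. color


def filter_by_language(variants: List[Variant], accepted: List[str]) -> List[Variant]:
    """Keep variants whose language is accepted; MIME parameters are never consulted."""
    return [v for v in variants if v.language in accepted]


variants = [
    Variant("text/html", {"level": "2"}, language="en"),
    Variant("text/html", {"level": "2"}, language="nl"),
    Variant("image/gif", language="en", properties={"color": "yes"}),
]
english = filter_by_language(variants, ["en"])
print([v.mime_type for v in english])
```

Language negotiation here looks only at `language` and never at `mime_params`, so it works the same way for every MIME type.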

So when generalizing existing and proposed Accept-* headers to one
single header, we should not touch Accept-Charset, and only generalize

 Accept-Encoding,
 Accept-Language,
 Accept-Color (proposed),
 Accept-Cuteness (semi-proposed to prove a point),

without ever mentioning MIME type parameters.

I'm not going to formally propose such a generalization right now,
though I think making it would be a good idea; see some of my
previous messages in this thread.

Right now, I want to talk about the structure of the existing Accept
header universe.

It seems that there are three primary accept headers:

 Accept            for negotiation on MIME type of a variant
 Accept-Encoding   for negotiation on encoding of a variant
 Accept-Language   for negotiation on language of a variant

and one extra header

 Accept-Charset

that works as a kind of modifier on the contents of the Accept header.

Am I correct so far?

Now, these four header names do not reflect the hierarchy of the
Accept universe very well, and this is confusing.  It is far too easy
to incorrectly infer that either

 a) Accept-Charset, Accept-Encoding, and Accept-Language are all
    modifiers to the contents of the Accept header

or

 b) Accept-Charset, Accept-Encoding, and Accept-Language all specify
    variant properties that have nothing to do with the MIME type
    of the variant.

To solve this confusion, we have two options:

1) Rename headers

  1a) Old name        | New name
      ----------------+------------
      Accept          | Accept-Type
      Accept-Charset  | Accept-Type-Charset
      Accept-Encoding | Accept-Encoding
      Accept-Language | Accept-Language

      and maybe define `Accept' as an obsolete form of `Accept-Type',

  1b) Old name        | New name
      ----------------+------------
      Accept          | Accept
      Accept-Charset  | Accept-Charset
      Accept-Encoding | Want-Encoding
      Accept-Language | Want-Language

      (`Want-' above could be replaced by a better word, as long as it
      is not `Accept-'.)

2) Stop borrowing the `charset' media parameter from the MIME specs,
   and instead define `charset negotiation' to be completely separate
   from MIME type negotiation, just like encoding and language
   negotiation are separate already.

   HTTP already mutates some of the charset stuff taken from the MIME
   specs, so defining charset negotiation completely from scratch in
   the HTTP spec itself does not seem that big a step.  Gateways
   between MIME compliant systems and HTTP already need to do special
   `charset' conversions anyway.

Opinions, anyone?  Is the confusion serious enough to require fixing?
Which fix is the best one?

> ....Roy T. Fielding 

Koen.

Received on Friday, 15 September 1995 04:44:50 UTC