(ACCEPT*) Draft text for Accept headers

As I said in my earlier message about strategy:

- There will be Accept* headers in the 1.1 draft.  With respect to the
Accept* headers previous draft, numerous small improvements are made;
these improvements reflect the consensus of the content negotiation
subgroup.

The sections below are proposed as replacements of the sections with
the same numbers and names in the current 1.1 draft.

The text below was mostly lifted from draft-holtman, but does have
some changes with respect to that draft.  All references to content
negotiation were deleted.  I deleted the mxb= parameter in the Accept
header, though it could be re-introduced later.

The main improvement with respect to the old draft is in clarifying
the matching rules of the Accept-Language header.

If you have comments on this text, now is the time to comment.  I
intend to close this issue at the end of the week.  This means that I
will send a last call for disagreement with perceived consensus,
together with a possibly improved version of the text below, in a few
days.

The change bars below were added by hand, and indicate change with
respect to the last 1.1 draft.  Merely moved text was not marked with
change bars.

--snip--

3  Protocol parameter descriptions

3.10  Language Tags

   [##Note: I moved the language tag matching discussion that used to
   be in this Section to Section 10.4 (Accept-Language).  Some other
   minor edits were made.##]

   A language tag identifies a natural language spoken, written, or
   otherwise conveyed by human beings for communication of information
   to other human beings. Computer languages are explicitly excluded.
   HTTP/1.1 uses language tags within the Accept-Language and
|  Content-Language fields.

   The syntax and registry of HTTP language tags is the same as that
   defined by RFC 1766 [1]. In summary, a language tag is composed of 1
   or more parts: A primary language tag and a possibly empty series of
   subtags:

        language-tag  = primary-tag *( "-" subtag )

        primary-tag   = 1*8ALPHA
        subtag        = 1*8ALPHA

   Whitespace is not allowed within the tag and all tags are
   case-insensitive. The namespace of language tags is administered by
   the IANA. Example tags include:

       en, en-US, en-cockney, i-cherokee, x-pig-latin

   where any two-letter primary-tag is an ISO 639 language abbreviation
   and any two-letter initial subtag is an ISO 3166 country code.

|
|

9. Status Code Definitions

413 Not Acceptable

   [#Note: the new 413 is similar to the 406 response code in the old
   draft.  406 cannot be used for content negotiation compatibility
   reasons##]

|  The resource identified by the Request-URI and Host request header
|  (present if the request-URI is not an absoluteURI) is only capable
|  of generating response entities which have content characteristics
|  not acceptable according to the accept headers sent in the request.

|  HTTP/1.1 servers are allowed to return responses which are not
|  acceptable according to the accept headers sent in the request.  In
|  some cases, this may even be preferable over sending a 408
|  response.  User agents are encouraged to inspect the headers of an
|  incoming response to determine if it is acceptable.  If this is not
|  the case, user agents should interrupt the receipt of the response
|  if doing so would save network resources.  If it it unknown whether
|  an incoming response would be acceptable, a user agent should
|  temporarily stop receipt of more data and query the user for a
|  decision on further actions.

   [## Note: the paragraph above could be moved to a more convenient
   location in the 1.1 document if the editor finds one.  Note that
   the above rule was discussed extensively on the content negotiation
   mailing list.  A short summary of the main reason behind this rule:
   20 line HTTP servers.##]


10  Header field definitions

10.1  Accept

|  The Accept request-header field can be used to specify certain
   media types which are acceptable for the response.  Accept headers
   can be used to indicate that the request is specifically limited to
   a small set of desired types, as in the case of a request for an
   in-line image.

   The field may be folded onto several lines and more than one
   occurrence of the field is allowed, with the semantics being the same
   as if all the entries had been in one field value.

       Accept         = "Accept" ":" #(
                        ( media-range
|                         [ ( ":" | ";" ) 
|                           range-parameter 
|                           *( ";" range-parameter ) ] )
|                       | extension-token )

       media-range     = ( "*/*"
                       |   ( type "/" "*" )
                       |   ( type "/" subtype )
                         ) *( ";" parameter )

|      range-parameter = ( "q" "=" qvalue ) 
|                      | extension-range-parameter
|
|      extension-range-parameter = ( token "=" token )
|
|      extension-token = token 

   The asterisk "*" character is used to group media types into ranges,
   with "*/*" indicating all media types and "type/*" indicating all
   subtypes of that type.

|  The range-parameter q is used to indicate the media type quality
|  factor for the range, which represents the user's preference for
|  that range of media types.  The default value is q=1.  In Accept
|  headers generated by HTTP/1.1 clients, the character separating
|  media-ranges from range-parameters should be a ":".  HTTP/1.1
|  servers should be tolerant of use of the ";" separator by HTTP/1.0
|  clients.

   The example

       Accept: audio/*:q=0.2, audio/basic

   should be interpreted as "I prefer audio/basic, but send me any audio
   type if it is the best available after an 80% mark-down in quality."

   If no Accept header is present, then it is assumed that the client
|  accepts all media types.  If Accept headers are present, and if the
|  resource cannot send a response which is acceptable according to
|  the Accept headers, then the server should send an error response
|  with the 413 (not acceptable) status code, though the sending of an
|  un-acceptable response is also allowed.

   A more elaborate example is

|      Accept: text/plain:q=0.5, text/html,
|              text/x-dvi:q=0.8, text/x-c

   Verbally, this would be interpreted as "text/html and text/x-c are
   the preferred media types, but if they do not exist, then send the
|  text/x-dvi entity, and if that does not exist, send the text/plain
   entity."

   Media ranges can be overridden by more specific media ranges or
   specific media types. If more than one media range applies to a given
   type, the most specific reference has precedence. For example,

|      Accept: text/*, text/html, text/html;level=1, */*

   have the following precedence:

|      1) text/html;level=1
       2) text/html
       3) text/*
       4) */*

|  The media type quality factor associated with a given type is
   determined by finding the media range with the highest precedence
   which matches that type.

   For example,

|      Accept: text/*:q=0.3, text/html:q=0.7, text/html;level=1,
               */*:q=0.5

   would cause the following type quality factors to be associated:

|      text/html;level=1                          = 1
       text/html                                  = 0.7
       text/plain                                 = 0.3
       image/jpeg                                 = 0.5
|      text/html;level=3                          = 0.7

|  

       Note: A user agent may be provided with a default set of 
       quality values for certain media ranges. However, unless the 
       user agent is a closed system which cannot interact with 
       other rendering agents, this default set should be 
       configurable by the user.


10.2  Accept-Charset

   The Accept-Charset request-header field can be used to indicate what
   character sets are acceptable for the response. This field allows
   clients capable of understanding more comprehensive or
   special-purpose character sets to signal that capability to a server
   which is capable of representing documents in those character
   sets. The US-ASCII character set can be assumed to be acceptable to
   all user agents.

|  [##QUESTION TO BE RESOLVED: Apparently, the latest HTML spec says
|  that iso-8859-1 can be assumed to be acceptable to all user agents.
|  Should the above US-ASCII be changed to iso-8859-1??  There has
|  been lots of discussion on the list, but I have not been able to
|  detect a consensus opinion.##]

       Accept-Charset = "Accept-Charset" ":" 1#charset

   Character set values are described in Section 3.4. An example is

       Accept-Charset: iso-8859-1, unicode-1-1

|  If no Accept-Charset header is present, the default is that any
|  character set is acceptable.  If an Accept-Charset header is
|  present, and if the resource cannot send a response which is
|  acceptable according to the Accept-Charset header, then the server
|  should send an error response with the 413 (not acceptable) status
|  code, though the sending of an un-acceptable response is also
|  allowed.


10.3  Accept-Encoding

   The Accept-Encoding request-header field is similar to Accept, but
   restricts the content-coding values (Section 3.5) which are
   acceptable in the response.

       Accept-Encoding         = "Accept-Encoding" ":" 
                                 #( content-coding )

   An example of its use is

       Accept-Encoding: compress, gzip

   If no Accept-Encoding header is present in a request, the server
   may assume that the client will accept any content coding.  If an
|  Accept-Encoding header is present, and if the resource cannot send
|  a response which is acceptable according to the Accept-Encoding
|  header, then the server should send an error response with the
|  413 (not acceptable) status code.


10.4  Accept-Language

   The Accept-Language request-header field is similar to Accept, but
   restricts the set of natural languages that are preferred as a
   response to the request.

       Accept-Language = "Accept-Language" ":"
|                        1#( language-range [ ";" "q" "=" qvalue ] )

|      language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) )
|                       | "*" )

|  Each language-range may be given an associated quality value which
|  represents an estimate of the user's comprehension of the languages
|  specified by that range.  The quality value defaults to "q=1" (100%
   comprehension).  For example,

|      Accept-Language: da, en-gb;q=0.8, en;q=0.7
|  
|  would mean: "I prefer Danish, but will accept British English (with
|  80% comprehension) and other types of English (with 70%
|  comprehension)."

|  A language-range matches a language-tag if it exactly equals the tag,
|  or if it is a prefix of the tag such that the first tag character
|  following the prefix is "-".  The special range "*", if present in
|  the Accept-Language field, matches every tag not matched by any other
|  ranges present in the Accept-Language field.

|      Note: This use of a prefix matching rule does not imply that
|      language tags are assigned to languages in such a way that it is
|      always true that if a user understands a language with a certain
|      tag, then this user will also understand all languages with tags
|      for which this tag is a prefix.  The prefix rule simply allows
|      the use of prefix tags if this is the case.

|  The language quality factor assigned to a language-tag by the
|  Accept-Language field is the quality value of the longest
|  language-range in the field that matches the language-tag.  If no
|  language-range in the field matches the tag, the language quality
|  factor assigned is 0.

   If no Accept-Language header is present in a request, the server
|  should assume that all languages are equally acceptable.  If an
|  Accept-Language header is present, and the resource cannot send a
|  response acceptable to an audience capable of understanding at
|  least one language that is assigned a quality factor greater than 0
|  by the Accept-Language header, it is acceptable to send a response
|  that uses one or more un-accepted languages.

|  It may be contrary to be privacy expectations of the user to send
|  an Accept-Language header with the complete linguistic preferences
|  of the user in every request.  For a discussion of this issue, see
|  Section aa.bb [##see below##].

       Note: As intelligibility is highly dependent on the 
       individual user, it is recommended that client applications 
       make the choice of linguistic preference available to the 
       user. If the choice is not made available, then the 
       Accept-Language header field must not be given in the 
       request.  

14 Security Considerations

14.x  Privacy issues connected to Accept headers

   [## Note: I believe someone else (Brian Behlendorf?) was also
   writing text about this, so I only include some concerns about
   Accept-Language important from a European viewpoint.  The concern
   of user tracking through Accept headers is not covered below, see
   Section 6.2 of draft-holtman for a discussion of this concern##]

|  Accept request headers can reveal information about the user to all
|  servers which are accessed.  The Accept-Language header in
|  particular can reveal information the user would consider to be of
|  a private nature, because the understanding of particular languages
|  is often strongly correlated to the membership of a particular
|  ethnic group.  User agents which offer the option to configure the
|  contents of an Accept-Language header to be sent in every request
|  are strongly encouraged to let the configuration process include a
|  message which makes the user aware of the loss of privacy involved.
|  An approach that limits the loss of privacy would be for a user
|  agent to omit the sending of Accept-Language headers by default,
|  and to ask the user whether it should to start sending
|  Accept-Language headers to a server if it detects, by looking for
|  at any Vary or Alternates response headers generated by the server,
|  that such sending could improve the quality of service.



[End of document]

Received on Monday, 1 April 1996 13:55:27 UTC