- From: Koen Holtman <koen@win.tue.nl>
- Date: Fri, 5 Apr 1996 00:49:52 +0200 (MET DST)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: Koen Holtman <koen@win.tue.nl>
I posted proposed draft text for the Accept headers on Monday. Since then, I received comments about US-ASCII vs. iso-8859-1 as the character set can be assumed to be acceptable to all user agents, about q= parameters on the Accept-Charset header, and on the rule that servers may always choose to ignore Accept headers. Using some of these comments, I adjusted the text. Updated text is included below. If I get no new comments on the new text within 2 days, I will declare consensus on this text and close this issue. Unless further comment is heard after that, the issue will remain closed. Details on the comments received: * On US-ASCII vs. iso-8859-1, the general opinion seems to be that iso-8859-1 is the character that set can be assumed to be acceptable to all user agents. I have performed the necessary update to reflect this opinion below. * On q= parameters on the Accept-Charset header: I added these parameters to the text below, after receiving 2 votes in favor and 1 vote against. * On the rule that servers may always choose to ignore Accept headers: A message from someone who did not agree with this rule was just posted, I will send a reply with a more detailed explanation of the background of this rule tomorrow. Updated text is included below. | change bars indicate changes with respect to the previous 1.1 draft, + change bars indicated changes with respect to the text posted on Monday. Koen. --snip-- 3 Protocol parameter descriptions 3.10 Language Tags [##Note: I moved the language tag matching discussion that used to be in this Section to Section 10.4 (Accept-Language). Some other minor edits were made.##] A language tag identifies a natural language spoken, written, or otherwise conveyed by human beings for communication of information to other human beings. Computer languages are explicitly excluded. HTTP/1.1 uses language tags within the Accept-Language and | Content-Language fields. The syntax and registry of HTTP language tags is the same as that defined by RFC 1766 [1]. In summary, a language tag is composed of 1 or more parts: A primary language tag and a possibly empty series of subtags: language-tag = primary-tag *( "-" subtag ) primary-tag = 1*8ALPHA subtag = 1*8ALPHA Whitespace is not allowed within the tag and all tags are case-insensitive. The namespace of language tags is administered by the IANA. Example tags include: en, en-US, en-cockney, i-cherokee, x-pig-latin where any two-letter primary-tag is an ISO 639 language abbreviation and any two-letter initial subtag is an ISO 3166 country code. | | 9. Status Code Definitions 413 Not Acceptable [##Note: the new 413 is similar to the 406 response code in the old draft. 406 cannot be used for content negotiation compatibility reasons##] [##Note added later: I believe 413 is already taken. The draft editor should change every 413 in this text to an unassigned 4xx number.##] | The resource identified by the Request-URI and Host request header | (present if the request-URI is not an absoluteURI) is only capable | of generating response entities which have content characteristics | not acceptable according to the accept headers sent in the request. | HTTP/1.1 servers are allowed to return responses which are not | acceptable according to the accept headers sent in the request. In | some cases, this may even be preferable over sending a 408 | response. User agents are encouraged to inspect the headers of an | incoming response to determine if it is acceptable. If this is not | the case, user agents should interrupt the receipt of the response | if doing so would save network resources. If it it unknown whether | an incoming response would be acceptable, a user agent should | temporarily stop receipt of more data and query the user for a | decision on further actions. [## Note: the paragraph above could be moved to a more convenient location in the 1.1 document if the editor finds one. Note that the above rule was discussed extensively on the content negotiation mailing list. A short summary of the main reason behind this rule: 20 line HTTP servers.##] 10 Header field definitions 10.1 Accept | The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image. The field may be folded onto several lines and more than one occurrence of the field is allowed, with the semantics being the same as if all the entries had been in one field value. Accept = "Accept" ":" #( ( media-range | [ ( ":" | ";" ) | range-parameter | *( ";" range-parameter ) ] ) | | extension-token ) media-range = ( "*/*" | ( type "/" "*" ) | ( type "/" subtype ) ) *( ";" parameter ) | range-parameter = ( "q" "=" qvalue ) | | extension-range-parameter | | extension-range-parameter = ( token "=" token ) | | extension-token = token The asterisk "*" character is used to group media types into ranges, with "*/*" indicating all media types and "type/*" indicating all subtypes of that type. | The range-parameter q is used to indicate the media type quality | factor for the range, which represents the user's preference for | that range of media types. The default value is q=1. In Accept | headers generated by HTTP/1.1 clients, the character separating | media-ranges from range-parameters should be a ":". HTTP/1.1 | servers should be tolerant of use of the ";" separator by HTTP/1.0 | clients. The example Accept: audio/*:q=0.2, audio/basic should be interpreted as "I prefer audio/basic, but send me any audio type if it is the best available after an 80% mark-down in quality." If no Accept header is present, then it is assumed that the client | accepts all media types. If Accept headers are present, and if the | resource cannot send a response which is acceptable according to | the Accept headers, then the server should send an error response | with the 413 (not acceptable) status code, though the sending of an | un-acceptable response is also allowed. A more elaborate example is | Accept: text/plain:q=0.5, text/html, | text/x-dvi:q=0.8, text/x-c Verbally, this would be interpreted as "text/html and text/x-c are the preferred media types, but if they do not exist, then send the | text/x-dvi entity, and if that does not exist, send the text/plain entity." Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence. For example, | Accept: text/*, text/html, text/html;level=1, */* have the following precedence: | 1) text/html;level=1 2) text/html 3) text/* 4) */* | The media type quality factor associated with a given type is determined by finding the media range with the highest precedence which matches that type. For example, | Accept: text/*:q=0.3, text/html:q=0.7, text/html;level=1, */*:q=0.5 would cause the following type quality factors to be associated: | text/html;level=1 = 1 text/html = 0.7 text/plain = 0.3 image/jpeg = 0.5 | text/html;level=3 = 0.7 | Note: A user agent may be provided with a default set of quality values for certain media ranges. However, unless the user agent is a closed system which cannot interact with other rendering agents, this default set should be configurable by the user. 10.2 Accept-Charset The Accept-Charset request-header field can be used to indicate what character sets are acceptable for the response. This field allows clients capable of understanding more comprehensive or special-purpose character sets to signal that capability to a server which is capable of representing documents in those character + sets. The ISO-8859-1 character set can be assumed to be acceptable to all user agents. Accept-Charset = "Accept-Charset" ":" + 1#( charset [ ";" "q" "=" qvalue ] ) + Character set values are described in Section 3.4. Each charset + may be given an associated quality value which represents the + user's preference for that charset. The default value is q=1. An + example is + Accept-Charset: iso-8859-5, unicode-1-1;q=0.8 | If no Accept-Charset header is present, the default is that any | character set is acceptable. If an Accept-Charset header is | present, and if the resource cannot send a response which is | acceptable according to the Accept-Charset header, then the server | should send an error response with the 413 (not acceptable) status | code, though the sending of an un-acceptable response is also | allowed. 10.3 Accept-Encoding The Accept-Encoding request-header field is similar to Accept, but restricts the content-coding values (Section 3.5) which are acceptable in the response. Accept-Encoding = "Accept-Encoding" ":" #( content-coding ) An example of its use is Accept-Encoding: compress, gzip If no Accept-Encoding header is present in a request, the server may assume that the client will accept any content coding. If an | Accept-Encoding header is present, and if the resource cannot send | a response which is acceptable according to the Accept-Encoding | header, then the server should send an error response with the | 413 (not acceptable) status code. 10.4 Accept-Language The Accept-Language request-header field is similar to Accept, but restricts the set of natural languages that are preferred as a response to the request. Accept-Language = "Accept-Language" ":" | 1#( language-range [ ";" "q" "=" qvalue ] ) | language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | | "*" ) | Each language-range may be given an associated quality value which | represents an estimate of the user's comprehension of the languages | specified by that range. The quality value defaults to "q=1" (100% comprehension). For example, | Accept-Language: da, en-gb;q=0.8, en;q=0.7 | | would mean: "I prefer Danish, but will accept British English (with | 80% comprehension) and other types of English (with 70% | comprehension)." | A language-range matches a language-tag if it exactly equals the tag, | or if it is a prefix of the tag such that the first tag character | following the prefix is "-". The special range "*", if present in | the Accept-Language field, matches every tag not matched by any other | ranges present in the Accept-Language field. | Note: This use of a prefix matching rule does not imply that | language tags are assigned to languages in such a way that it is | always true that if a user understands a language with a certain | tag, then this user will also understand all languages with tags | for which this tag is a prefix. The prefix rule simply allows | the use of prefix tags if this is the case. | The language quality factor assigned to a language-tag by the | Accept-Language field is the quality value of the longest | language-range in the field that matches the language-tag. If no | language-range in the field matches the tag, the language quality | factor assigned is 0. If no Accept-Language header is present in a request, the server | should assume that all languages are equally acceptable. If an | Accept-Language header is present, and the resource cannot send a | response acceptable to an audience capable of understanding at | least one language that is assigned a quality factor greater than 0 | by the Accept-Language header, it is acceptable to send a response | that uses one or more un-accepted languages. | It may be contrary to be privacy expectations of the user to send | an Accept-Language header with the complete linguistic preferences | of the user in every request. For a discussion of this issue, see | Section aa.bb [##see below##]. Note: As intelligibility is highly dependent on the individual user, it is recommended that client applications make the choice of linguistic preference available to the user. If the choice is not made available, then the Accept-Language header field must not be given in the request. 14 Security Considerations 14.x Privacy issues connected to Accept headers [## Note: I believe someone else (Brian Behlendorf?) was also writing text about this, so I only include some concerns about Accept-Language important from a European viewpoint. The concern of user tracking through Accept headers is not covered below, see Section 6.2 of draft-holtman for a discussion of this concern##] | Accept request headers can reveal information about the user to all | servers which are accessed. The Accept-Language header in | particular can reveal information the user would consider to be of | a private nature, because the understanding of particular languages | is often strongly correlated to the membership of a particular | ethnic group. User agents which offer the option to configure the | contents of an Accept-Language header to be sent in every request | are strongly encouraged to let the configuration process include a | message which makes the user aware of the loss of privacy involved. | An approach that limits the loss of privacy would be for a user | agent to omit the sending of Accept-Language headers by default, | and to ask the user whether it should to start sending | Accept-Language headers to a server if it detects, by looking for | at any Vary or Alternates response headers generated by the server, | that such sending could improve the quality of service. [End of document.]
Received on Thursday, 4 April 1996 14:58:00 UTC