- From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
- Date: Thu, 16 Mar 1995 12:51:30 -0800
- To: Francois Yergeau <yergeau@alis.ca>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> I just read the new RFC 1766 on language tags and found something in
> it that seems to clash with the intended usage in HTTP. Section 2.1
> of RFC 1766 contains the following:
>
> Applications should always treat language tags as a single token; the
> division into main tag and subtags is an administrative mechanism,
> not a navigation aid.
>
> I take that to mean that the tags are not hierarchical, i.e. that "en"
> is not to be understood as a superset of "en-US". This may be fine for
> Content-Language, but will not work, IMHO, for Accept-Language.
>
> My interpretation is that asking for "en-US" will NOT get you an "en"
> document if "en-US" is not available. That may be OK.
>
> Likewise, asking for "en" will NOT get you "en-US", "en-UK" or any
> other "en-SOMETHING". This seems to me to be unacceptable. It means
> that the naive user would have to be aware of all variants of "en-*"
> in existence in order to construct the simple request "Send anything
> in English". Same for any other language that has variants.
Yes, this also struck me as a bit odd. Fortunately, I had a note from
the author that essentially said "do the right thing -- treat it as
hierarchical". I have no idea what was behind that section of RFC 1766.
I just ignored it for the HTTP spec, but I guess an explicit statement
would be preferable.
> Here is a suggestion to fix this. Simply add the following to section
> 8.2 of the HTTP draft, just before the Note:
>
> In the context of the Accept-Language header (section 5.4.4) a
> language tag is not to be interpreted as a single token, as per RFC
> 1766, but as a hierarchy. A server should consider that it has a
> match when a language tag received in an Accept-Language header
> matches the initial portion of the language tag of a document. An
> exact match should be preferred. This interpretation allows a
> browser to send, for example:
>
> Accept-Language: en-US, en
>
> when the intent is to access, in order of preference, documents in
> American-English ("en-US"), 'plain' or 'international' English
> ("en"), and any other variant of English (initial "en-").
>
> I think the above is preferable to changing RFC 1766 and hacking up a
> scheme such as "en-*", and is still flexible enough. Any comments?
Yes, that will do quite nicely -- I'll add it to the next revision
if there are no strong objections.
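
For illustration only, the matching rule proposed above could be sketched
roughly as follows. This is not text from the HTTP draft or from RFC 1766.
The function names (tag_matches, pick_variant) are hypothetical, the sketch
ignores any quality parameters, and it assumes that "initial portion" means
a prefix ending at a "-" boundary and that tags compare case-insensitively.

```python
from typing import List, Optional


def tag_matches(requested: str, document: str) -> bool:
    """Return True if `requested` equals `document` or is a prefix of it
    that ends at a '-' boundary (assumption: 'en' matches 'en-US' but
    not a hypothetical tag such as 'eng')."""
    req = requested.strip().lower()
    doc = document.strip().lower()
    return doc == req or doc.startswith(req + "-")


def pick_variant(accept_language: str, available: List[str]) -> Optional[str]:
    """Pick a document language tag for an Accept-Language value such as
    'en-US, en'. Requested tags are tried in the order given; for each
    one, an exact match is preferred over a prefix match."""
    requested_tags = [t.strip() for t in accept_language.split(",") if t.strip()]
    for requested in requested_tags:
        # First pass: exact match is preferred.
        for doc_tag in available:
            if doc_tag.strip().lower() == requested.lower():
                return doc_tag
        # Second pass: the requested tag is the initial portion of the
        # document's tag, e.g. "en" against "en-GB".
        for doc_tag in available:
            if tag_matches(requested, doc_tag):
                return doc_tag
    return None  # nothing acceptable is available
```

With "Accept-Language: en-US, en", a server holding only "fr" and "en-GB"
would select "en-GB" under this sketch, while a request for "en-US" alone
would not fall back to a plain "en" document, in line with the
interpretation discussed earlier in the thread.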
......Roy Fielding ICS Grad Student, University of California, Irvine USA
<fielding@ics.uci.edu>
<URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
Received on Thursday, 16 March 1995 13:08:12 UTC