- From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
- Date: Thu, 16 Mar 1995 12:51:30 -0800
- To: Francois Yergeau <yergeau@alis.ca>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> I just read the new RFC 1766 on language tags and found something in
> it that seems to clash with the intended usage in HTTP. Section 2.1
> of RFC 1766 contains the following:
>
>    Applications should always treat language tags as a single token; the
>    division into main tag and subtags is an administrative mechanism,
>    not a navigation aid.
>
> I take that to mean that the tags are not hierarchical, i.e. that "en"
> is not to be understood as a superset of "en-US". This may be fine for
> Content-Language, but will not work, IMHO, for Accept-Language.
>
> My interpretation is that asking for "en-US" will NOT get you an "en"
> document if "en-US" is not available. That may be OK.
>
> Likewise, asking for "en" will NOT get you "en-US", "en-UK" or any
> other "en-SOMETHING". This seems to me to be unacceptable. It means
> that the naive user would have to be aware of all variants of "en-*"
> in existence in order to construct the simple request "Send anything
> in English". Same for any other language that has variants.

Yes, this also struck me as a bit odd. Fortunately, I had a note from
the author that essentially said "do the right thing -- treat it as
hierarchical". I have no idea what was behind that section of RFC 1766.
I just ignored it for the HTTP spec, but I guess an explicit statement
would be preferable.

> Here is a suggestion to fix this. Simply add the following to section
> 8.2 of the HTTP draft, just before the Note:
>
>    In the context of the Accept-Language header (section 5.4.4) a
>    language tag is not to be interpreted as a single token, as per RFC
>    1766, but as a hierarchy. A server should consider that it has a
>    match when a language tag received in an Accept-Language header
>    matches the initial portion of the language tag of a document. An
>    exact match should be preferred. This interpretation allows a
>    browser to send, for example:
>
>       Accept-Language: en-US, en
>
>    when the intent is to access, in order of preference, documents in
>    American-English ("en-US"), 'plain' or 'international' English
>    ("en"), and any other variant of English (initial "en-").
>
> I think the above is preferable to changing RFC 1766 and hacking up a
> scheme such as "en-*", and is still flexible enough. Any comments?

Yes, that will do quite nicely -- I'll add it to the next revision
if there are no strong objections.

......Roy Fielding
      ICS Grad Student, University of California, Irvine  USA
      <fielding@ics.uci.edu>
      <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
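
The matching rule proposed above can be sketched roughly as follows. This is only an illustration, not text from the HTTP draft: the function names `tag_matches` and `pick_document_language` are hypothetical, "initial portion" is assumed to mean a prefix ending on a "-" subtag boundary, and tags are assumed to compare case-insensitively. Preferences are tried in the order sent, with an exact match tried before a prefix match for each one.

```python
from typing import List, Optional


def tag_matches(requested: str, document: str) -> bool:
    """True if `requested` equals `document` or is an initial portion of it.

    Assumption: the initial portion must end on a subtag boundary, so
    "en" matches "en-US" but would not match a tag such as "eng".
    """
    req = requested.lower()
    doc = document.lower()
    return doc == req or doc.startswith(req + "-")


def pick_document_language(accepted: List[str],
                           available: List[str]) -> Optional[str]:
    """Pick a document language for an ordered Accept-Language list.

    Each requested tag is tried in order of preference. For each one,
    an exact match is preferred; otherwise the first available tag
    that the requested tag is a prefix of is used.
    """
    for requested in accepted:
        # Exact match first, as the proposal asks.
        for tag in available:
            if tag.lower() == requested.lower():
                return tag
        # Then any variant below the requested tag.
        for tag in available:
            if tag_matches(requested, tag):
                return tag
    return None  # nothing acceptable; the server decides what to do next
```

With "Accept-Language: en-US, en", a server holding only "en-GB" and "fr" documents would pick "en-GB" under this reading, while a request for "en-US" alone against a server holding only "en" would find no match, which is the behaviour described as "may be OK" in the quoted message.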
Received on Thursday, 16 March 1995 13:08:12 UTC