Accept-Language prefix matching (was: A broken browser)

Martin wrote:

% Now for the question of prefix matching. The RFC indeed defines
% prefix matching, very clearly and consistently. But this prefix
% matching works only one way:

[definition omitted]

% To give an example, we have the following situation:
% Accept-Language      Document        Match?
% language-range       language-tag
% en                   en              YES
% en-us                en-us           YES
% en                   en-us           YES
% en-us                en              NO?!
% en-us                en-uk           NO?!
% The idea is that Accept-Language defines language-ranges,
% whereas the documents will be tagged exactly. I don't know
% exactly how the group arrived at this asymmetry, but I
% guess the basic thought was that for documents, it would
% be clear whether it was US or British English (and
% likewise in other cases), whereas the user would in
% general not care much about the difference. Prefixes
% (ranges) would therefore be used in Accept-Language, but
% not in document tags.
% Several points lead to the fact that the situation is not
% (or should not be) as asymmetric as described in the RFC.
% - Rarely both en-us and en-uk documents are prepared, and
% 	thus the authors don't care about distinguishing
% 	and just tag them with "en".
% - In some cases, there may be no actual difference, and it
% 	would be strange to label a document as en-us if it
% 	is just as well en-uk.
% - Tagging is in many cases done via file names. Something
% 	such as text.en.html is preferred
% 	to text.en-us.html.
% - In many cases, language selections on the browser side
% 	are connected to locales. These include a lot of
% 	details where small differences matter, and are
% 	therefore finely granulated. I don't think Windows
% 	or the Mac have something like a "generic English"
% 	configuration.
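
For reference, the one-way rule behind the table above can be sketched in a few lines of Python. This is only an illustration, assuming the section 14.4 definition: a language-range matches a language-tag if it equals the tag, or if it equals a prefix of the tag and the next character of the tag is "-". The function name range_matches_tag is made up for this example, and the special "*" range is left out.

```python
def range_matches_tag(language_range: str, language_tag: str) -> bool:
    """Return True if an Accept-Language range matches a document's tag.

    Hypothetical helper: the range matches when it equals the tag, or when
    it is a prefix of the tag that is followed directly by "-".
    Comparison is case-insensitive. The special "*" range is not handled.
    """
    r = language_range.lower()
    t = language_tag.lower()
    return t == r or t.startswith(r + "-")


# The rows of the table quoted above, as (range, tag, expected result).
TABLE = [
    ("en", "en", True),
    ("en-us", "en-us", True),
    ("en", "en-us", True),
    ("en-us", "en", False),
    ("en-us", "en-uk", False),
]

for lang_range, tag, expected in TABLE:
    assert range_matches_tag(lang_range, tag) == expected
```

The asymmetry shows up in the last two rows: the range is never shortened to find a match, only the tag is allowed to be longer than the range.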

I personally do not see why a person/browser should ever define 
en-uk unless he/she/it wants to give different q values to 
en-uk and en-us. 

My first reaction would be to add to the definitions that the server,
when receiving an Accept-Language header that contains a sublanguage
without its parent language, MUST (should?) automagically add the parent
language with a q value equal to half the minimum of the q values given
for its sublanguages. But I fear that this is sometimes wrong, since
section 14.4 also says

     Note: This use of a prefix matching rule does not imply that
     language tags are assigned to languages in such a way that it is
     always true that if a user understands a language with a certain
     tag, then this user will also understand all languages with tags
     for which this tag is a prefix. The prefix rule simply allows the
     use of prefix tags if this is the case.
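
To make the proposal concrete, here is a rough Python sketch of what the server would do. It is an assumption-laden illustration, not proposed spec text: the function name add_parent_ranges is made up, the header is assumed to be already parsed into a dict of lowercase ranges and q values, and the "parent" is taken to be the first subtag (the part before the first "-").

```python
def add_parent_ranges(prefs: dict[str, float]) -> dict[str, float]:
    """Add missing parent languages to parsed Accept-Language preferences.

    Hypothetical helper. `prefs` maps lowercase language-ranges to q values.
    For every sublanguage whose parent (assumed: the first subtag) is not
    listed, the parent is added with q = (minimum q of its sublanguages) / 2.
    Ranges the client listed explicitly are never changed.
    """
    result = dict(prefs)
    sub_q: dict[str, list[float]] = {}
    for lang_range, q in prefs.items():
        parent, sep, _ = lang_range.partition("-")
        if sep and parent not in prefs:
            sub_q.setdefault(parent, []).append(q)
    for parent, qs in sub_q.items():
        result[parent] = min(qs) / 2
    return result


# Example: "Accept-Language: en-us;q=0.8, en-uk;q=0.6, fr" after parsing.
prefs = {"en-us": 0.8, "en-uk": 0.6, "fr": 1.0}
augmented = add_parent_ranges(prefs)
assert augmented["en"] == 0.3
```

With this, a document tagged plain "en" would be served at q=0.3 instead of being rejected. The note quoted above is exactly why this may be wrong: the sketch assumes that anyone who lists en-us also accepts every "en" document, which the prefix rule does not guarantee.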


Received on Thursday, 9 January 1997 07:02:13 UTC