Re: ISSUE: LANGUAGE-TAG

Martin J. Duerst:
>
[...]
>Assume we have a very sophisticated server, which knows a lot
>about the documents it keeps or produces, and the languages
>they are kept in. Assume some of this knowledge cannot be
>expressed in one or more language tags attached to the
>documents, including q values.
>
>The question then is whether such a server is allowed to
>cheat, i.e. whether it is e.g. allowed to peek at the
>Accept-Language header field sent in before it calculates
>the language(s) and q values it assigns to a certain
>document.

Yes, it is allowed to cheat, but I would not call it cheating.

The 1.1 spec prescribes how to assign a q value to a language tag
using an Accept-Language header, i.e. there os only one legal way to
compute q-of-en and q-of-ja for English and Japanese.  However, the
spec does not prescribe what you should do with these q-of-* values
once you have computed them.

There is no rule which prescribes that, whenever you have an
English-Japanese bilingual document, jou must compute the overall
quality of this document with the formulae

  max(q-of-en, q-of-ja)

or 

  (q-of-en + q-of-ja)/2

or whatever.  HTTP leaves the selection of appropriate formulae up to
the service author.  In terms of your example below:

>As an example, assume that a server has an English document,
>as Japanese document, and an English-Japanese bilingual
>document. The bilingual document is the source and contains
>all text in the original, and therefore is the preferred
>document. However, if a reader has only English in her
>Accept-language header field, the English document should
>be served, and so on.

the correct formulae for the overall qualities of the documents would,
I think, be something like

       English document:   q-of-en * 0.8
       Japanese document:  q-of-ja * 0.8
       Bilingual document: min(q-of-en,q-of-ja)

(These are correct assuming that you always want to return the
document with the highest overall quality.)

>Regards,	Martin.

Koen.

Received on Monday, 21 July 1997 11:18:56 UTC