Re: [Ltru] Language tag education and negotiation

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Wed, 07 May 2008 16:31:55 +0900
Message-Id: <6.0.0.20.2.20080507160835.0a367ec0@localhost>
To: "Mark Davis" <mark.davis@icu-project.org>, "Nicolas Krebs" <nicolas1.krebs3@netcourrier.com>
Cc: www-international@w3.org, ltru@ietf.org

At 05:29 08/05/05, Mark Davis wrote:
>Unfortunately, the specs are ill-defined regarding the q values.

Yes indeed. I have reported this to the HTTP WG.


>Take the following example:
>
>a, b;q=0.7, c, d;q=0.5, e, f;q=0.9, g
>
>The specs do not distinguish between at least two different possible reasonable interpretations of what the q values of c, e, and g are: 
>    * c;q=1.0, e;q=1.0, g;q=1 // always 1 
>    * c;q=0.7, e;q=0.5, g;q=0.9 // always same as previous 
>Our guess is that the user meant #2, but it is only a guess.

My guess would be to use 1.0 as the default, because this is
consistent with other headers, and allows concatenation and
splitting of headers without jumping through hoops. Please
remember that the above is protocol syntax, not what the
user should see or would edit.
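
To make the difference concrete, here is a minimal sketch of both
readings in Python. The function name parse_accept_language and its
inherit_previous flag are made up for this illustration and are not
taken from any spec or library; the parsing is deliberately naive
(no validation, no wildcard handling).

```python
def parse_accept_language(header, inherit_previous=False):
    """Return a list of (tag, q) pairs in header order.

    inherit_previous=False: a missing q defaults to 1.0.
    inherit_previous=True:  a missing q repeats the previous item's q
                            (the "always same as previous" reading).
    """
    result = []
    previous_q = 1.0
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = None
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value.strip())
        if q is None:
            q = previous_q if inherit_previous else 1.0
        result.append((tag.strip(), q))
        previous_q = q
    return result


header = "a, b;q=0.7, c, d;q=0.5, e, f;q=0.9, g"
first, second = "a, b;q=0.7", "c, d;q=0.5, e, f;q=0.9, g"

# With a fixed default of 1.0, parsing the two halves separately
# gives the same result as parsing the whole header.
split_ok_default = (
    parse_accept_language(first) + parse_accept_language(second)
    == parse_accept_language(header)
)

# With "same as previous", c changes from 0.7 to 1.0 once the header
# is split, so the two results differ.
split_ok_inherit = (
    parse_accept_language(first, True) + parse_accept_language(second, True)
    == parse_accept_language(header, True)
)
```

Under these assumptions split_ok_default is True and split_ok_inherit
is False, which is the concatenation/splitting point made above.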

>And even once that ambiguity is cleared up, nobody knows what the meaning of the numbers really is. My native tongue is English, I can understand Swiss German and German, my French is rusty, and Italian basic. Which of these should I use? 
>    * en, gsw, de, fr, it 
>    * en, gsw;q=0.9, de;q=0.9, fr;q=0.8, it;q=0.7 
>    * en, gsw;q=0.5, de;q=0.4, fr;q=0.3, it;q=0.2 
>    * en, gsw;q=0.99, de;q=0.95, fr;q=0.03, it;q=0.02 
>All are descending order, but depending on what algorithm the consumer of the tags uses, they could have very different results. Without being able to have any consistent expectations for what the q numbers mean, producers of tags don't know what settings to provide or what difference it will make, and consumers of tags don't know what the producers meant.

There is a certain aspect of tuning, which indeed currently doesn't work
because there isn't enough actual use on the server side. But basically,
the idea is simple: the values get multiplied with the corresponding
quality values for the documents on the server side.

So the way you should think is that you set gsw to 0.5 if you would
be satisfied to get the English document when the server thinks that
the English document it has is more than half as good as the
corresponding gsw document, but you would prefer to get the gsw
document when the server thinks that the English document is less
than half as good in quality as the corresponding gsw document.
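
The multiplication rule can be sketched as follows. This is only an
illustration of the idea described above, not the algorithm of any
particular server; the function name choose_variant and the server-side
quality numbers are invented for the example.

```python
def choose_variant(client_q, server_quality):
    """Pick the language whose (client q) * (server quality) is highest.

    client_q:       language tag -> q value from the request
    server_quality: language tag -> quality the server assigns to its
                    document in that language (hypothetical numbers)
    Returns None if no available variant has a positive score.
    """
    best_tag, best_score = None, 0.0
    for tag, quality in server_quality.items():
        score = client_q.get(tag, 0.0) * quality
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


client = {"en": 1.0, "gsw": 0.5}

# English document more than half as good as the gsw one:
# 1.0 * 0.6 = 0.6 beats 0.5 * 1.0 = 0.5, so English is served.
print(choose_variant(client, {"en": 0.6, "gsw": 1.0}))  # en

# English document less than half as good as the gsw one:
# 1.0 * 0.4 = 0.4 loses to 0.5 * 1.0 = 0.5, so gsw is served.
print(choose_variant(client, {"en": 0.4, "gsw": 1.0}))  # gsw
```

With the same made-up server qualities of 0.6 for en and 1.0 for gsw,
a request with gsw;q=0.9 or gsw;q=0.99 would get the gsw document
instead, which is why the four variants quoted above are not
equivalent under this rule.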

Of course, "half as good in quality" is then the difficult piece of
information, but ideas for measuring this might be the time it takes
a native reader to read it, the amount of "updateness" (i.e. 70%
of the document is up to date -> document quality 0.7), or similar
measures. You can go into very fine details (and the way I know you,
you probably will), but while my considerations above won't help
you figure out the *exact* values for your q-values, they should
be enough to help you decide among the above listed four variants.

My personal guess is that
   en, gsw;q=0.5, de;q=0.4, fr;q=0.3, it;q=0.2
may be the most reasonable, but it's up to you to decide.

Regards,   Martin.



#-#-#  Martin J. Dürst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Wednesday, 7 May 2008 07:33:13 GMT