Re: #428 Accept-Language ordering for identical qvalues

From: Mark Nottingham <mnot@mnot.net>
Date: Mon, 21 Jan 2013 13:06:47 +1100
Cc: ietf-http-wg@w3.org
Message-Id: <316F5F01-1C9F-4077-B400-68CDE6B391CA@mnot.net>
To: Amos Jeffries <squid3@treenet.co.nz>
That's interesting, thanks. 

One thing to add: even if the client includes q=0, the server can still ignore it.
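
To make that concrete, here is a minimal Python sketch of how a server might rank Accept-Language entries. It assumes a missing q-value means q=1, relies on a stable sort so that entries with identical q-values keep the order in which they were sent, and drops q=0 entries unless the server chooses to ignore the exclusion. The names `parse_accept_language`, `rank_languages` and `honor_q0` are hypothetical, and the parsing is deliberately simplified (no validation of language tags).

```python
def parse_accept_language(value):
    """Parse an Accept-Language field-value into (tag, q) pairs in sent order."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # assumption: an omitted q-value is treated as q=1
        for param in params.split(";"):
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw.strip())
                except ValueError:
                    q = 1.0  # assumption: fall back to the default on garbage
        entries.append((tag.strip(), q))
    return entries


def rank_languages(value, honor_q0=True):
    """Return language tags from most to least preferred.

    honor_q0=False models a server that ignores the client's q=0 exclusions.
    """
    entries = parse_accept_language(value)
    if honor_q0:
        entries = [(tag, q) for tag, q in entries if q > 0]
    # sorted() is stable, so equal q-values keep the order they were sent in.
    return [tag for tag, q in sorted(entries, key=lambda e: -e[1])]
```

Calling `rank_languages("en-us,en;q=0.5,fr;q=0")` leaves "fr" out, while passing `honor_q0=False` keeps it at the end of the list.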

Cheers,

P.S. If you are able (considering privacy issues, etc.) and want to dump such data in a usable format, feel free to ask for a repository on the GitHub account.



On 21/01/2013, at 12:56 PM, Amos Jeffries <squid3@treenet.co.nz> wrote:

> On 21/01/2013 12:30 p.m., Adrien W. de Croy wrote:
>>  ------ Original Message ------
>> From: "James M Snell" <jasnell@gmail.com>
>>> 
>>> +1.. in fact, for 2.0, I'd very much like to get rid of q-values entirely and depend entirely on order.
>>> 
>> same here.
>> The idea may have been laudable in 1998, but really, how can a web server tell if some resource is 80% better than another? A human needs to tell it, and humans have enough trouble with other things.
>> The q=0 option would need to be turned into a Naccept-* header or something. But does anyone even use it outside of testing for 406 responses which never come?
> 
> My collection of 2 years worth of language headers says no.
> 
> Of 2018 unique Accept-Language header field-values:
>  1532 are using q-values in a strictly sorted list
>  491 are not using q-values
>  14 are using "q=0.0"
>  5 mix entries with and without q-values, without ordering the sent list (1 looks otherwise normal, the others are using puny-codes)
> 
> The 14 are also unique in being very long and having multiple entries with equal q-values. Without exception they are still strictly ordered, with the entries that have no q-value placed first (as if q=1.0 had been used for sorting but omitted when sending). They also contain a number of oddities, such as multiple entries for a language code with differing q-values.
> 
> NP: Of those 14 odd A-L headers noted above I have UA details on 8 of them. All claim to be Firefox but the Gecko dates do not line up with other info on those versions (the 11.0 was released some years before 3.5.9 on the same OS) so the whole input is a bit suspect.
> 
> 
> The 5 un-ordered lists have puny-code values with no q-value listed after an otherwise normal series of languages, like so:
> "en-us,en;q=0.5,x-ns1qHkbtrt8Nhv,x-ns2E1e0Nnym7b6"
> 
> I have a few cases of a q-value ordered list followed by a wildcard "*" with no q-value. The sender is obviously assuming the list is ordered.
> 
> 
> 
> Broken down by UA, which I started ~6 months ago at Julien's suggestion, I have 54289 distinct UAs visiting, of which:
>  21756 are not sending A-L header at all
>  19621 unique UA are using a single language code with no q-value
>  12495 unique UA are using q-values as above.
>  8 are sending only wildcard "*" or "*/*"
> 
> The remaining ~400 roughly match up with the 491 A-L field-values not using q-values. They are older agents (Windows 98, NT, 2k stand out), agents sending the same language multiple times (VoilaBot variants and Safari there), or agents sending sub-language variants with the generic form last, e.g. "en-GB,en", "en-US,en", "en-US,en,*" (tablets and Mobile Safari mostly). They are obviously assuming sorted lists, even as far back as the Windows 98 ones.
> 
> There are also a few bots sending exactly 2 puny-code entries.
> 
> 
> Amos
> 
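
For anyone wanting to repeat the breakdown quoted above on their own logs, a rough Python sketch of such a classifier follows. The category labels and the function names `explicit_qvalues` and `classify` are hypothetical and only approximate the groupings in the quoted message; in particular, "sorted" here means the effective q-values (with q=1 assumed where omitted) never increase along the list, and any header containing q=0 is reported separately.

```python
def explicit_qvalues(value):
    """Return one item per list entry: its explicit q-value, or None if omitted.

    Simplified on purpose: a malformed q-value raises ValueError.
    """
    result = []
    for item in value.split(","):
        if not item.strip():
            continue
        q = None
        # Everything after the first ";" is a parameter such as "q=0.5".
        for param in item.split(";")[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                q = float(raw.strip())
        result.append(q)
    return result


def classify(value):
    """Label an Accept-Language field-value with one hypothetical category."""
    qs = explicit_qvalues(value)
    if all(q is None for q in qs):
        return "no-qvalues"
    if any(q == 0 for q in qs):
        return "q-zero"
    # Assumption: an entry without a q-value counts as q=1 for ordering.
    effective = [1.0 if q is None else q for q in qs]
    if all(a >= b for a, b in zip(effective, effective[1:])):
        return "sorted"
    return "unordered"
```

With these rules the quoted example "en-us,en;q=0.5,x-ns1qHkbtrt8Nhv,x-ns2E1e0Nnym7b6" is labelled "unordered", as is a q-value ordered list followed by a bare "*".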

--
Mark Nottingham   http://www.mnot.net/
Received on Monday, 21 January 2013 02:07:18 GMT
