Re: Migrating some high-entropy HTTP headers to Client Hints. from Ronan Cremin on 2019-04-11 (ietf-http-wg@w3.org from April to June 2019)

From: Ronan Cremin <rcremin@afilias.info>
Date: Thu, 11 Apr 2019 12:58:10 +0100
To: Thomas Peterson <hidinginthebbc@gmail.com>, Mike West <mkwst@google.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4d321ba1-f6f1-05c3-5b76-24f6a9b89525@afilias.info>

Hi,

My name is Ronan Cremin, and I help build a device recognition product
widely used in the web analytics, publishing and advertising industries.
Full disclosure: my employer profits from analysis of UA strings, though
moving the same information to Client Hints is not expected to affect
this materially.

One concern over moving UA string information to Client Hints is that 
the information required to publish device-specific responses arrives 
only in the second request from the client. This imposes a performance 
penalty on publishers that serve a device-tailored HTML document. As 
Mike mentioned, RWD notwithstanding, many publishers employ
device-specific responses as envisaged in RFC 1945, usually to tailor the
experience to a class of device, e.g. smartphone, tablet, desktop and so
on. Publishers endeavour to fit everything required for the first screen
of content into this first response, so any delay to it hurts
performance. The last time I checked, more than 80% of the top 100
websites used this technique.
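
To make the round trip concrete, here is a minimal sketch in Python. It
is an illustration under stated assumptions, not how any real server or
browser is implemented: the `Sec-CH-Model` header name and the `Model`
token follow the naming used in this thread, while `serve`, `Client` and
the `DEVICE_CLASS` table are hypothetical.

```python
# Hypothetical header name and device table, for illustration only.
HINT_HEADER = "Sec-CH-Model"
DEVICE_CLASS = {"Pixel 3": "smartphone", "iPad": "tablet"}


def serve(request_headers):
    """Return (response_headers, variant) for a single request."""
    model = request_headers.get(HINT_HEADER)
    if model is None:
        # No hint yet: opt in for later requests and fall back to a
        # generic page, because the device class is not known.
        return {"Accept-CH": "Model"}, "generic"
    return {}, DEVICE_CLASS.get(model, "desktop")


class Client:
    """Toy client that sends the hint only after the origin opts in."""

    def __init__(self, model):
        self.model = model
        self.opted_in = False

    def get(self):
        headers = {HINT_HEADER: self.model} if self.opted_in else {}
        response_headers, variant = serve(headers)
        if "Accept-CH" in response_headers:
            self.opted_in = True
        return variant


client = Client("Pixel 3")
first = client.get()   # no hint was sent with this request
second = client.get()  # hint is sent now that the origin has opted in
print(first, second)
```

In this toy model the first response cannot be tailored, which is the
penalty described above; with the UA string the device class would be
known on the first request.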

Web analytics might also be impacted. Most web analytics solutions 
support a JavaScript-free integration approach based on linking a single 
pixel image hosted by the analytics platform. This approach is affected
for the same reason: the information required for analytics becomes
available only on the second request from the client.
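
A rough sketch of the pixel case, again in Python and again only under
assumptions: `record_hit` is a hypothetical handler, `Sec-CH-Model` is
an assumed header name, and the analytics host is assumed not to receive
any hint until it has had a chance to ask for one.

```python
def record_hit(request_headers):
    """Build an analytics record from the headers of one pixel request."""
    model = request_headers.get("Sec-CH-Model")  # assumed hint header name
    return {"path": "/pixel.gif", "device_model": model or "unknown"}


# First (and often only) pixel request: no hints have been negotiated.
first_hit = record_hit({"Accept": "image/*"})
# A later request, after the analytics host has asked for the hint.
later_hit = record_hit({"Accept": "image/*", "Sec-CH-Model": "Pixel 3"})
print(first_hit["device_model"], later_hit["device_model"])
```

A page that loads the pixel once would only ever produce the first kind
of record.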

Has thought been given to the performance impact of the proposal? Yoav 
mentions this issue in his Client Hints infrastructure document 
(https://github.com/yoavweiss/client-hints-infrastructure) but I haven't 
seen any attempt to quantify the impact.

Regards,
Ronan

On 29/11/2018 12:08, Thomas Peterson wrote:
> I would propose that all Accept* headers are included in Client Hints 
> as all can be used for some level of fingerprinting, e.g. Accept can 
> be used to distinguish between desktop browsers (which typically have 
> html/xml MIME types) and cURL/wget which by default have '*/*'. Many 
> user agents also do their own guesswork on response bodies anyway 
> (such as looking at the magic number) to determine content type or 
> encoding, so the impact of a "failed negotiation" of content can be 
> limited.
>
> Also, is there a particular reason why Sec-CH-Lang omits Quality Values?
>
>
> Regards
>
>
> On 29/11/2018 10:22, Mike West wrote:
>> Hey folks,
>>
>> Section 9.7 of RFC7231 
>> <https://tools.ietf.org/html/rfc7231#section-9.7> rightly notes that 
>> some of the content negotiation headers user agents deliver in HTTP 
>> requests create substantial fingerprinting surface. I think it would 
>> be beneficial if we took steps to reduce their prevalence on the 
>> wire, and Client Hints looks like a reasonable infrastructure on top 
>> of which to build.
>>
>> `User-Agent` and `Accept-Language` seem like particularly tasty and 
>> low-hanging fruit, and I've sketched out two proposals as proofs of 
>> concept:
>>
>> *   `User-Agent` could be represented as ~four distinct hints: `UA`, 
>> `Model`, `Platform`, and `Arch`: 
>> https://github.com/mikewest/ua-client-hints is a high-level 
>> explainer, and https://tools.ietf.org/html/draft-west-ua-client-hints 
>> a sketchy ID for the new headers.
>>
>> *   `Accept-Language` could be represented as a `Lang` hint: 
>> https://github.com/mikewest/lang-client-hint is a high-level 
>> explainer, https://tools.ietf.org/html/draft-west-lang-client-hint an 
>> equally sketchy ID for the new header.
>>
>> I'd appreciate y'all's feedback. Thanks!
>>
>> -mike
>

Received on Thursday, 11 April 2019 12:09:27 UTC