Re: Migrating some high-entropy HTTP headers to Client Hints.

Thanks for the feedback!

On Thu, Nov 29, 2018 at 1:08 PM Thomas Peterson <hidinginthebbc@gmail.com>
wrote:

> I would propose that all Accept* headers are included in Client Hints as
> all can be used for some level of fingerprinting, e.g. Accept can be used
> to distinguish between desktop browsers (which typically have html/xml
> MIME types) and cURL/wget which by default have '*/*'.


The philosophy in https://tools.ietf.org/html/draft-west-ua-client-hints is
that it's reasonable to expose basic information about the user agent (e.g.
it's Firefox, not cURL). That level of information seems quite difficult to
hide (given differences in behavior, network stacks, etc.) and quite
valuable to developers, which tips the balance for me towards exposing
brand and major version by default.

With that in mind, `Accept` and `Accept-Encoding` seem to be fairly static
in their relationship to the UA brand and version. Chrome more or less
hard-codes `Accept` and `Accept-Encoding` based on the kind of resource
being requested. See, for instance,
https://cs.chromium.org/chromium/src/net/url_request/url_request_http_job.cc?g=0&l=666
and places like
https://cs.chromium.org/chromium/src/media/blink/resource_multibuffer_data_provider.cc?rcl=e53d19f7befd7927b6b9727dc88b9ee295c6fa05&l=110
and
https://cs.chromium.org/chromium/src/content/renderer/loader/web_url_loader_impl.cc?rcl=e53d19f7befd7927b6b9727dc88b9ee295c6fa05&l=672

With the caveat that I'm sometimes prone to a myopic view of the world from
the standpoint of a web browser: `User-Agent` and `Accept-Language` seem to
contain significantly more entropy, and therefore feel like the right place
to start. I certainly wouldn't suggest that that's where we ought to stop.
:)


> Many user agents
> also do their own guess work on response bodies anyway (such as looking
> at the magic number) to determine content type or encoding, so the
> impact of a "failed negotiation" of content can be limited.
>
> Also, is there a particular reason why Sec-CH-Lang omits Quality Values?
>

https://tools.ietf.org/html/draft-west-lang-client-hint-00#section-4.3
addresses this. In a nutshell, quality values seem like cruft: some
widely-used user agents (I spot-checked Chrome and Firefox) derive the
weights purely from the order of the list. Those semantics make sense to
me, and nothing more seems to be necessary.
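
To make the list-order point concrete, here is a rough Python sketch. It
assumes a weighting scheme where the first language gets an implicit q=1.0
and each later entry drops by 0.1 down to a floor of 0.1; that step, the
floor, and both function names are assumptions for illustration, not
something taken from the draft or from any browser's source. Under that
assumption the weights carry no information beyond the order, since the
ordered list can be recovered from the header and vice versa.

```python
def accept_language_from_order(languages):
    """Build an Accept-Language value whose weights depend only on position."""
    parts = []
    for index, tag in enumerate(languages):
        # Assumed scheme: weights in tenths, 1.0 for the first entry,
        # dropping by 0.1 per position and never going below 0.1.
        q_tenths = max(10 - index, 1)
        if q_tenths == 10:
            parts.append(tag)  # q=1.0 is implicit
        else:
            parts.append(f"{tag};q=0.{q_tenths}")
    return ",".join(parts)


def order_from_accept_language(header):
    """Recover the preference order from an Accept-Language value."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        q = 1.0
        if params.strip().startswith("q="):
            q = float(params.strip()[2:])
        # Sort by descending weight; ties keep their original position.
        weighted.append((-q, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


preferred = ["en-US", "en", "de"]
header = accept_language_from_order(preferred)
print(header)  # en-US,en;q=0.9,de;q=0.8
print(order_from_accept_language(header) == preferred)  # True
```

A header that only lists the tags in order would give a server the same
preference information as the weighted form above.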

I might well be missing use cases here, which I'd be thrilled to hear about!

-mike


>
> Regards
>
>
> On 29/11/2018 10:22, Mike West wrote:
> > Hey folks,
> >
> > Section 9.7 of RFC7231
> > <https://tools.ietf.org/html/rfc7231#section-9.7> rightly notes that
> > some of the content negotiation headers user agents deliver in HTTP
> > requests create substantial fingerprinting surface. I think it would
> > be beneficial if we took steps to reduce their prevalence on the wire,
> > and Client Hints looks like a reasonable infrastructure on top of
> > which to build.
> >
> > `User-Agent` and `Accept-Language` seem like particularly tasty and
> > low-hanging fruit, and I've sketched out two proposals as proofs of
> > concept:
> >
> > *   `User-Agent` could be represented as ~four distinct hints: `UA`,
> > `Model`, `Platform`, and `Arch`:
> > https://github.com/mikewest/ua-client-hints is a high-level explainer,
> > and https://tools.ietf.org/html/draft-west-ua-client-hints a sketchy
> > ID for the new headers.
> >
> > *   `Accept-Language` could be represented as a `Lang` hint:
> > https://github.com/mikewest/lang-client-hint is a high-level
> > explainer, https://tools.ietf.org/html/draft-west-lang-client-hint an
> > equally sketchy ID for the new header.
> >
> > I'd appreciate y'all's feedback. Thanks!
> >
> > -mike
>

Received on Thursday, 29 November 2018 12:55:27 UTC