Re: [w3ctag/design-reviews] Partial freezing of the User-Agent string (#467)

I really fail to see how the advantages of this proposal outweigh its downsides:

 * Are you concerned about discrimination via the UA string as a small browser vendor? How would turning the current string into a *set* of strings (though with clearer semantics per individual record) ever stop the eternal cat-and-mouse game between websites and browsers, given that websites (indeed like [Google](https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/UserAgentQuirks.cpp)) will always be desperate to differentiate between browsers? If they want a specific Chrome name and version *now*, they will require the *absence* of anything non-Chrome *in the future*. You have Chrome’s entries in your set but add your own entry as well (e.g. for usage statistics)? Website operators will know that it’s only Real Chrome if there’s nothing else in that set apart from Chrome (see the sketch after this list). Okay, so you will only send Chrome’s entries? Then it’s exactly the same game as always: having to match Chrome’s *single string*.
 * Are you concerned about the historical baggage of the UA string and its ever-growing length and complexity? As @mcatanzaro’s reply *very clearly* shows, website operators will absolutely make sure that the same baggage accumulates again quickly, with help from browser vendors who have to defend their browsers against degraded experiences. How can turning the *string* into a *set* (of strings) stop the fundamental dynamics of this game between website operators, who want to identify and differentiate between specific browsers and versions, and browser vendors, who have no choice but to try to hide their true identity with regard to the current forms of identification?
 * Are you concerned about the passive fingerprinting surface of the current UA string? Surely passing on *most* of that information to *all* websites and *selectively* granting or denying websites access to more information will level the playing field and ensure equal opportunities for established players and small players (yes, like smaller ad networks) alike.
 * Are you convinced that GREASE will help? Perhaps in the very short term, but that’s it. Do you expect website operators *not* to recognize that only `NotBrowser` and `Foo` are being mixed in randomly, while `Epiphany` is indeed a safe sign that the browser at hand is Not Real Chrome (again, see the sketch after this list)? Will Chrome actually start mixing in *real* browser names and versions to give teeth to GREASE? No, Chrome won’t: that would defeat the purpose of the whole feature, because then nobody could ever be sure which browser a client is really using.
 * Are you concerned about backward compatibility, which used to be the holy grail of the web? That can only be guaranteed by never removing the old frozen UA string and the `navigator.*` properties. Then again, backward compatibility has already been broken recently in other areas, e.g. with the new defaults for cookie attributes – so yes, why care about it here.
 * Are you concerned about reducing complexity? Making an established feature obsolete and replacing it with several new components of a similar feature, introducing new permissions or automatically evaluating requests for higher-entropy information based on “trustworthiness” (judged by … perhaps some unaccountable ML?), and requiring hint delegation for CDNs to continue to work will surely not make things easier.
 * Are you trying to force (even more) websites to move to HTTPS? The changes to browsers’ UIs and adjusted specifications for cookies will surely do what’s left to do – which is great!

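To make the discrimination and GREASE points above concrete, here is a minimal sketch of the kind of server-side check a website operator could write against a Sec-CH-UA-style brand list. The brand names are taken from the examples above, the parser is deliberately naive (not a real Structured Headers parser), and the whole check is hypothetical – but nothing in the proposal makes it hard to write, which is exactly the problem.

```typescript
// Hypothetical server-side "Real Chrome" check against a Sec-CH-UA-style brand list,
// e.g.  Sec-CH-UA: "Chromium";v="81", "Google Chrome";v="81", "NotBrowser";v="99"
// Brand names are illustrative; the parser is a naive sketch, not a Structured Headers parser.

const KNOWN_GREASE = new Set(["NotBrowser", "Foo"]); // whatever gets reverse-engineered over time
const EXPECTED = new Set(["Chromium", "Google Chrome"]);

function parseBrands(secChUa: string): string[] {
  // '"Chromium";v="81", "Google Chrome";v="81"'  ->  ["Chromium", "Google Chrome"]
  return secChUa
    .split(",")
    .map(entry => entry.trim().match(/^"([^"]*)"/)?.[1] ?? "")
    .filter(brand => brand.length > 0);
}

function looksLikeRealChrome(secChUa: string): boolean {
  const brands = parseBrands(secChUa).filter(b => !KNOWN_GREASE.has(b));
  // "Only Real Chrome if there is nothing else in the set apart from Chrome":
  return brands.length > 0 && brands.every(b => EXPECTED.has(b));
}

// A browser that adds its own entry ("Epiphany") is flagged immediately; one that sends
// only Chrome's entries is back to matching Chrome's strings exactly.
looksLikeRealChrome('"Chromium";v="81", "Google Chrome";v="81", "NotBrowser";v="99"'); // true
looksLikeRealChrome('"Chromium";v="81", "Google Chrome";v="81", "Epiphany";v="3.36"'); // false
```
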
It seems the most reasonable (and by far the *simplest*) solution may be to freeze [more and more parts](https://mobile.twitter.com/rmondello/status/943545865204989953) of the UA string and to rely on explicit feature detection otherwise.
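
For what it’s worth, “explicit feature detection” here just means testing the capability itself rather than the browser name – a minimal sketch, with arbitrarily chosen features:

```typescript
// Feature detection: test the capability itself, not the browser name.
// The features below are arbitrary examples.
const hasIntersectionObserver = "IntersectionObserver" in window;
const hasCssGrid = typeof CSS !== "undefined" && CSS.supports("display", "grid");
const hasWebAnimations = "animate" in Element.prototype;

if (!hasIntersectionObserver) {
  // Load a polyfill or fall back to eager loading – regardless of which browser this is.
}
```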

And the only two upsides *here* may be (a) the added dimensions in a structured format that website operators could use for content negotiation – but that is an upside for usability only, and yet again a *clear negative* for privacy, which was one of the original goals and can now only be protected through added complexity and an *unlevel playing field* – and (b) perhaps turning passive fingerprinting into active fingerprinting, at least for those who depend on fingerprinting based on UA strings because they don’t yet have vast amounts of other information and activity records.
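
To illustrate that passive-to-active shift: under the proposal the low-entropy brand set is sent to everyone, while the rest has to be requested explicitly via script (or via `Accept-CH`). A sketch against the proposed `navigator.userAgentData` surface follows – the exact member and hint names are taken from the explainer as I understand it and may differ between drafts:

```typescript
// Sketch of the "active" request for high-entropy values, assuming the proposed
// navigator.userAgentData surface (exact member names may differ between drafts).
interface NavigatorUAData {
  brands: { brand: string; version: string }[];
  mobile: boolean;
  getHighEntropyValues(hints: string[]): Promise<Record<string, unknown>>;
}

async function collectUaSignal() {
  // Cast because TypeScript's DOM typings may not declare userAgentData (yet).
  const uaData = (navigator as Navigator & { userAgentData?: NavigatorUAData }).userAgentData;
  if (!uaData) return null; // non-implementing browsers only have the (frozen) UA string

  const passive = { brands: uaData.brands, mobile: uaData.mobile }; // available to everyone
  const active = await uaData.getHighEntropyValues([
    "platform", "platformVersion", "architecture", "model", "uaFullVersion",
  ]);
  // Roughly the entropy of today's UA string – but only for sites that explicitly ask
  // (and, under the proposal, are deemed trustworthy enough to receive an answer).
  return { ...passive, ...active };
}
```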

The fundamental incentives and dynamics are that website operators will always want to know exactly which browser and version a client is using, and will therefore reverse-engineer browser vendors’ implementations, while browser vendors will always try to prevent this in order to defend their users’ experience on the web. This is by definition a game of cat and mouse that is bound to repeat itself with any different implementation, including the one proposed here. The new concept would probably reset expectations (and implementations) for a little while – and then let the same things happen again. But not without weakening competition at the same time – between those who [already have everything they need to fingerprint devices](https://unsearcher.org/more-on-chrome-updates-and-headers) or will receive higher-entropy information as trusted sites or already-visited hosts, and those who do not.

https://github.com/w3ctag/design-reviews/issues/467#issuecomment-583181819

Received on Friday, 7 February 2020 00:46:58 UTC