fingerprinting guidance feedback and alternatives from Nick Doty on 2015-08-18 (public-privacy@w3.org from July to September 2015)

From: Nick Doty <npdoty@w3.org>
Date: Mon, 17 Aug 2015 23:55:09 -0700
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: "public-privacy (W3C mailing list)" <public-privacy@w3.org>, Mark Nottingham <mnot@mnot.net>
Message-Id: <582805A9-3804-4760-8B03-0E304F1A868C@w3.org>
As this discussion has broadened past the TAG feedback session, I've changed the subject line and moved to the public-privacy mailing list. (Following up on an old thread that had been lingering in my inbox.)

On May 28, 2015, at 7:04 AM, Melvin Carvalho <melvincarvalho@gmail.com> wrote:
> 
> On 27 May 2015 at 21:38, Nick Doty <npdoty@w3.org> wrote:
> On May 23, 2015, at 8:16 AM, Melvin Carvalho <melvincarvalho@gmail.com> wrote:
>> As an author of client side apps, one thing I constantly find challenging is customizing a UI for a user in a personalized way.  This is useful both for the app and for the users.  For example, from a URI for a user, I can pull in their name, their avatar, their friends list, where their personal storage is, recent conversations, and a bunch of other nice things that can show up in the user interface.
> 
>> 
>> Generally when using an app for the first time, the user will have to type a URI into a form, which identifies themselves, in order to get this personalized user experience.  This is a UX that will lose you the vast majority of your potential user base.
>> 
>> In an ideal world, browsers would be under the complete control of the user, and the user could allow certain websites or apps, to know who they were.  A slightly easier way to do this is to use localStorage, but this suffers from cross origin constraints.  Another way is to use the identity system built in to X.509 client side certificates, which is not cross origin, but this has traditionally had usability issues.
>> 
>> What I've been thinking about lately is allowing a user to persist data about who they are, globally, via fingerprinting.  Then they get a uniform user experience across the web in exchange for a slight loss of privacy, which hopefully will be responsibly managed.
>> 
>> I'd love to know if there are any other solutions for persisting cross origin data about a user (perhaps the upcoming credentials API?).  But if not, I was wondering if maybe fingerprinting could have some uses for good, e.g. as indirect identifiers, and as a workaround for restrictive same origin policies?
> 
> The fingerprinting guidance document under review [1] tries to briefly explain the privacy concerns with regard to browser fingerprinting. Of particular relevance to your suggestion would be that fingerprinting typically happens without user awareness, can't easily be turned on/off and can't be "cleared" the way cookies can. A more friendly way of implementing that functionality would be user-controlled cookies or headers that could send preferences across sites, in a way that users are aware of and can control. For example, though I'm not sure it's widely adopted, users can set their language preferences in a header sent to websites, so that the site can customize the language to the user's preferred language on the very first visit. It's a challenge to explain to users in a way where they can get customized advantages without privacy surprises though. For example, browsers have moved away from allowing cookies to be set and accessed by different origins.
> 
> Section 2.1.1. talks about fingerprinting to identify a user.  This section seems to focus on the negative aspects of identifying a user, while the positive aspects seem less covered.  A decade ago the web was largely a read only space, but more recently the web has become a much more personalized experience.  IMHO this is becoming a common expectation.
> 
> My question is: do we have practical ways to do this *without* fingerprinting?
> 
> Many large scale communications systems grow to incorporate the ability to identify a user to the recipient.  For example, when sending a letter, it is common to write your name on the back.  Telephones often have a 'caller display' feature, email messages tell you who the message is from.
> 
> As a user I would sometimes consent to letting some or all websites know some details about me, often who I am, so that my web experience can be personalized.  You mention sending a header, but I wonder how practical this is?  For some apps we are working on, the server will send back a "User:" header after it has established the identity.  But it seems more of a challenge to do this from the client side (though perhaps I don't know every method available here), so I would welcome some guidance.  I have heard of some folks using a browser extension and the HTTP "From:" header, but this suffers from the weakness that browser extensions are rarely installed, and additionally "From:" can only take an email address, which can be restrictive in a web environment.
> 
> So, I wonder, would it also be possible to present what the alternatives could be, if developers and users have genuine motives, and want to avoid fingerprinting?

Would such a list of alternatives have a significant audience? If so, I'd be happy to help provide one, although I think it's orthogonal to the questions of what other W3C Working Groups should be doing regarding browser fingerprinting mitigations in their own specs.

In part, I'm not sure it's necessary because many of these technologies are very well-known and widely-used. For these user-controlled use cases, it's typically going to be much easier to use a known, documented technology rather than a browser fingerprinting script. A very non-exhaustive list:

* cookies: http://tools.ietf.org/html/rfc6265
* authentication/authorization headers: http://tools.ietf.org/html/rfc7235
* HTTP content negotiation: https://developer.mozilla.org/en-US/docs/Web/HTTP/Content_negotiation
* OAuth: http://oauth.net/
* OpenID: http://openid.net/
* Persona/BrowserID: https://login.persona.org/
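
As a rough illustration of the content negotiation item, here is a minimal Python sketch of how a server might honor a user's Accept-Language header on the very first visit, with no fingerprinting involved. The function names (parse_accept_language, choose_language) and the fallback default are hypothetical, and the matching is deliberately simplified (exact tag, then primary subtag); it is not a full implementation of HTTP language negotiation.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, quality) pairs, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so tags with equal quality keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header: str, supported: list[str], default: str = "en") -> str:
    """Pick the best supported language, or the (assumed) default."""
    by_lower = {lang.lower(): lang for lang in supported}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # e.g. "fr-ch" falls back to "fr"
        if primary in by_lower:
            return by_lower[primary]
    return default


if __name__ == "__main__":
    print(choose_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "fr"]))
```

The preference here is something the user sets in the browser and can change or remove, which is the contrast with a fingerprint.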

In general, I think there are very good reasons to maintain the origin model. If I want to share my profile of information (or some subset of it) with a site, I can explicitly choose to authorize that using one of these technologies. Sharing an identifier across origins trivially enables tracking without any opportunity for transparency or control.
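
To make the contrast concrete, here is a minimal Python sketch of a preference cookie that stays within one site, using the standard library's http.cookies module. The helper names (build_preference_cookie, read_preference) and the 30-day lifetime are assumptions for illustration only. Leaving out the Domain attribute keeps the cookie host-only, so it is sent back only to the host that set it, and the user can inspect or clear it.

```python
from http.cookies import SimpleCookie
from typing import Optional

THIRTY_DAYS = 60 * 60 * 24 * 30  # assumed lifetime, in seconds


def build_preference_cookie(name: str, value: str) -> str:
    """Return a Set-Cookie header value for a host-only preference cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = "/"
    morsel["max-age"] = THIRTY_DAYS
    morsel["secure"] = True      # only sent over HTTPS
    morsel["httponly"] = True    # not readable from page scripts
    morsel["samesite"] = "Lax"   # not attached to most cross-site requests
    # No Domain attribute is set, so the cookie is not shared with other hosts.
    return morsel.OutputString()


def read_preference(cookie_header: str, name: str) -> Optional[str]:
    """Read one preference back from a request's Cookie header, if present."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(name)
    return morsel.value if morsel is not None else None


if __name__ == "__main__":
    print(build_preference_cookie("lang", "fr"))
    print(read_preference("lang=fr; theme=dark", "lang"))
```

A site would only set such a cookie after the user has explicitly chosen to share the preference, which is the transparency and control that a cross-origin identifier lacks.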

Hope this helps,
Nick
Received on Tuesday, 18 August 2015 06:55:19 UTC