'persona', indicating 'private browsing mode' over the net from David Singer on 2015-02-28 (public-privacy@w3.org from January to March 2015)

From: David Singer <singer@apple.com>
Date: Fri, 27 Feb 2015 16:28:26 -0800
To: "public-privacy (W3C mailing list)" <public-privacy@w3.org>
Message-id: <1612E408-8241-49B2-9BA0-102D8EE2B983@apple.com>
This is basically a mildly-edited re-statement of the ideas, taking into account some of the discussion. I was asked during this week’s call to re-post a summary.


* * * * * * * * * * * *


The problem: quite a few browsers today have what they call “private browsing mode” or the like.  In this mode all local state that is accumulated is discarded at the end of the private browsing mode session (when the mode is turned off). After turning it off, the local machine has, ideally, no trace at all of what was done in the private mode. The discard includes browsing history, cookies, local storage etc.  I think that browsers can/do initialize the private session from the user’s current state when they start private mode.

Advantage: if it’s a shared computer, you don’t leave any trace.

So, private browsing sort of looks like this, in terms of state: two private sessions are started and then ended. These sessions are initialized from the base state, which is not updated while the private sessions are in progress.


                                                               +[private 2] - - -
                             +[private 1] - - -                |
                             |                                 |
[base state] - - - - - - - - + . . . . . . . . . . . . - - - - + . . . . . . . . . . . . - - - - - - - - - -
Time ->

This means that private browsing still ‘works’ on the web; cookies flow, referer headers are sent, and so on, all as normal.  The important aspect of this is whether a trace is left in the ‘permanent history’.

Problem statement: the servers are completely unaware of this mode, and so any history etc. THEY keep is still visible.

Proposal:

The servers have various means to work out who this is, and attach history (these means include cookies, fingerprinting and so on). As noted above, we don’t seek to break normal browsing by refusing to accept storage etc. (e.g. of cookies), so a simple ‘binary’ signal in an HTTP header “I am trying to be private here” doesn’t help, as the server won’t know from request to request whether this is part of the same session or not.

Hence, the idea to introduce a header that identifies which ‘private session’ the user is in. Since, in fact, this can be used for purposes other than private browsing, and it’s logically possible for the browser to have multiple windows open, or separate sessions, or to return to a private session, we thought this was essentially an indication of which ‘aspect’ of the user was being presented here: their persona.  So, we needed a session (persona) identifier.  Both to make it easy to generate, and to make it possible to transfer a private session from one device to another, we took the easy route of suggesting that UUIDs are a suitable identification tool.

Here is the original suggestion I sent.  Note that the server is being asked to segregate state, not to stop keeping state. This is about the aspect of privacy which is respecting the right context to ‘say’ something: ‘why did you say that?’ not ‘why did you know/remember that?’. One of the problems with today’s net is not only that servers see and remember too much (not addressed here), but that they have absolutely no sense of when it’s appropriate, or not, to reveal what they know (that is addressed here).

* * * * *

The user-agent can send an optional HTTP header ‘Persona:’ whose value is a suitable machine-generatable distinct identifier (e.g. a UUID). If the header is absent, the user is operating under their default (unlabeled) persona, which is distinct from all the identified personas, which in turn are also distinct from each other.  A user and their user-agent may return to a persona at any time, or continue using a persona for any length of time. A persona identifier is expected to be universally unique, not contextualized to the current user-agent or device.
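
As an illustration of the header described above, here is a minimal Python sketch of what a user-agent might do. The function names new_persona_id and request_headers, and the ExampleBrowser/1.0 string, are made up for this example; only the ‘Persona’ header name and the use of a UUID come from the suggestion itself, and a random (version 4) UUID is just one possible choice.

```python
import uuid
from typing import Dict, Optional


def new_persona_id() -> str:
    """Mint a fresh persona identifier (assumption: a random version-4 UUID)."""
    return str(uuid.uuid4())


def request_headers(persona: Optional[str] = None) -> Dict[str, str]:
    """Build the headers for one request.

    Omitting the Persona header means the user is operating under
    their default (unlabeled) persona.
    """
    headers = {"User-Agent": "ExampleBrowser/1.0"}  # hypothetical user-agent string
    if persona is not None:
        headers["Persona"] = persona
    return headers


# A private session mints one identifier and sends it on every request,
# so the server can tell that the requests belong to the same session.
private_session = new_persona_id()
first_request = request_headers(private_session)
second_request = request_headers(private_session)
default_request = request_headers()  # no Persona header at all
```

Because the identifier is just a string, the same value could be carried to another device to continue the persona there.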

Servers respecting this are requested to ensure that the labeled personas leave no trace or influence on each other or on the unlabeled persona.  For example, activity under one persona should not affect the ads shown under a different persona; any history records that the user can see should be distinct for each persona; and so on. (It’s OK for your unlabeled persona to be reflected in labeled ones, but optional; if servers wish, they can initialize a named persona from the default, un-named one, when they first see it.)
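
To make the segregation request concrete, here is a rough server-side sketch in Python. Everything in it (the PersonaStore class, its methods, and the seed_from_default flag) is hypothetical and only meant to show the shape of the idea: state is keyed by the persona identifier, the missing header maps to the default persona, and a labeled persona may optionally be initialized from the default one when first seen.

```python
from typing import Dict, List, Optional


class PersonaStore:
    """Per-persona history kept by a hypothetical server."""

    def __init__(self, seed_from_default: bool = False) -> None:
        # The key None stands for the default (unlabeled) persona.
        self._history: Dict[Optional[str], List[str]] = {}
        self.seed_from_default = seed_from_default

    def _bucket(self, headers: Dict[str, str]) -> List[str]:
        # Simplification: a real server would match the header name
        # case-insensitively.
        persona = headers.get("Persona")
        if persona not in self._history:
            seed: List[str] = []
            if persona is not None and self.seed_from_default:
                # Optional: copy the default persona's state on first sight.
                seed = self._history.get(None, [])
            self._history[persona] = list(seed)
        return self._history[persona]

    def record(self, headers: Dict[str, str], item: str) -> None:
        """Store an item under the persona named in the request."""
        self._bucket(headers).append(item)

    def history(self, headers: Dict[str, str]) -> List[str]:
        """Return only what was recorded under the requesting persona."""
        return list(self._bucket(headers))


store = PersonaStore()
store.record({}, "news")
store.record({"Persona": "persona-a"}, "gift shop")
store.record({"Persona": "persona-b"}, "clinic")
```

In this sketch nothing recorded under one labeled persona is visible under another or under the default; the server still keeps all of it.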

Server implementers may choose how long they retain records relating to separate personas, just as they do for today’s default persona.

This is NOT a request to stop tracking or keeping records; that is an orthogonal question that is covered by activities such as do-not-track, cookie directives, and so on. This is about giving users control of their privacy by controlling what gets linked to what, and exposed when.

It may be that it is not particularly necessary or valuable to have a machine-readable means of discovering whether servers support this feature.  Any support that they provide is an improvement on today’s experience, where servers are unaware that users are trying to be private. Claims of support for this feature are probably better conveyed in advertising or other human-readable ways. On the other hand, machine-readable claims of support have two advantages: the browser can filter or warn about sites that don’t claim to respect it, and while not respecting it probably would not be actionable, claiming to respect it and then not doing so would be lying to users, which might be.

This feature might also be valuable for shared terminals; for example, in libraries, airline lounges, internet cafes and the like, a new persona can be minted each time the terminal is unlocked for a new session.  Libraries might tie the persona to the library card, so users returning get re-linked to their online history and so on. It might also be a lightweight replacement for logging in, for browsers on shared devices: a browser might have a simple way of saying which family member it is right now (e.g. a pull-down menu).
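
The shared-terminal case could look roughly like the following Python sketch. The SharedTerminal class and its unlock method are invented for illustration, and keeping a card-to-persona table on the terminal side is only one assumed way a library might tie a persona to a card.

```python
import uuid
from typing import Dict, Optional


class SharedTerminal:
    """Hypothetical kiosk that picks a persona each time it is unlocked."""

    def __init__(self) -> None:
        # Assumption: the library remembers which persona goes with which card.
        self._card_personas: Dict[str, str] = {}
        self.current_persona: Optional[str] = None

    def unlock(self, library_card: Optional[str] = None) -> str:
        """Start a session and return the persona identifier it will send."""
        if library_card is None:
            # Anonymous visitor: mint a brand-new persona for this session.
            self.current_persona = str(uuid.uuid4())
        else:
            # Returning card holder: re-use the persona tied to the card,
            # minting one the first time the card is seen.
            if library_card not in self._card_personas:
                self._card_personas[library_card] = str(uuid.uuid4())
            self.current_persona = self._card_personas[library_card]
        return self.current_persona


terminal = SharedTerminal()
walk_in = terminal.unlock()                  # fresh persona
member = terminal.unlock("card-123")         # persona tied to the card
member_again = terminal.unlock("card-123")   # same persona as before
```

The family-member pull-down would work the same way, with the menu entry playing the role of the card.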

* * * *

I think it’s interesting in a number of respects:

a) it’s an improvement on the status quo, where servers are completely unaware of any attempt to be private

b) it’s not asking for *secrecy* at all; servers are at liberty to remember as much as before; there are very few privacy proposals that don’t slide into trying to be secret, and this is one. Privacy is also about where information is exposed, what it is linked to, and so on.

c) it recognizes that privacy is not a binary state — it’s not an either-or (you have it or you don’t); it’s a spectrum, and it’s about perception and control and exposure as much as it is about recording and so on.


* * * * * * *

What are some of the potential downsides?

1) It doesn’t treat servers as adversaries, and if they are, in fact, ‘hostile’, it might be giving them a clue: ‘look here, someone is doing something under the covers’

2) Using a UUID for the persona has advantages (they are not contextualized by the ‘main’ persona that the server knows or guesses, and they can be shared across the user’s devices), but it also provides a very explicit key, ‘this is (this aspect of) me’, which again, for adversarial servers, might be an issue


Note that there is no attempt to claim “this isn’t me, this is someone else” so linking personas is fine, if the server can work out they are the same person (e.g. by cookie or other means).


David Singer
Manager, Software Standards, Apple Inc.
Received on Saturday, 28 February 2015 00:28:51 UTC