- From: イアンフェッティ <ifette@google.com>
- Date: Tue, 3 Mar 2015 14:40:07 -0800
- To: David Singer <singer@apple.com>
- Cc: Joseph Lorenzo Hall <joe@cdt.org>, "public-privacy (W3C mailing list)" <public-privacy@w3.org>
- Message-ID: <CAF4kx8dD2B9s-DBHNv8irOZqv0oTqAE98_EPtCGrcm74y0n9jw@mail.gmail.com>
reply inline

2015-03-03 13:28 GMT-08:00 David Singer <singer@apple.com>:

>
> > On Mar 3, 2015, at 12:40 , Joseph Lorenzo Hall <joe@cdt.org> wrote:
> >
> > On Mon, Mar 2, 2015 at 2:22 PM, Ian Fette (イアンフェッティ) <ifette@google.com> wrote:
> >> Do you really want the same ID being sent to all sites? On the one hand
> >> we're already spewing IP addresses everywhere and this can be used to do
> >> retargeting and/or various data combination across sites, but now if you've
> >> got a stable identifier (over the life of the browsing session, which could
> >> be long) that actually seems like quite a privacy hit to me.
> >
> > This is a really great point that I don't think we've seen raised yet
> > in this discussion. David (Singer): would origin-scoped identifiers
> > solve this problem or is the shared persona identifier a feature in
> > your opinion?
>
> Hi, sorry, got behind-hand
>
> I wondered about it, of course.
>
> The problems with scoped identifiers are (at least):
> a) defining what they are scoped by. ‘The user you think it is from some
> other information, if any’ is not very good standards-writing.
>

Well, it could be origin-scoped :)

> b) if it’s scoped by the machine, you can’t carry on searching for your
> SO’s birthday present from your phone (on the go) to your laptop (at home)
>

What else would it be scoped by? If you have a named profile for the user that's not transient, why do you need any of this? Named profiles in Chrome (and other browsers AFAIK) keep separate cookie jars, and I'm not really sure what this buys us over separate cookie jars. Asking e.g. ads servers to keep data separate (even when it's coming from the same IP and fingerprintable data) based on a different "persona" seems like a bit of a DNT-sized task :) Asking other sites to build new infrastructure based on personas seems a lot more complicated than saying "we'll keep the cookies separate for different personas" and letting people carry on.
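For concreteness, here is one way the origin-scoped idea above could be sketched. This is only an illustration under stated assumptions, not something proposed in the thread: it assumes the user agent holds a random per-persona secret and derives a separate identifier for each origin with HMAC-SHA256. The function names `new_persona_secret` and `origin_scoped_id` are hypothetical.

```python
import hashlib
import hmac
import secrets
import uuid


def new_persona_secret() -> bytes:
    """Mint a random secret for one persona (assumed to stay inside the user agent)."""
    return secrets.token_bytes(32)


def origin_scoped_id(persona_secret: bytes, origin: str) -> uuid.UUID:
    """Derive a stable identifier for one (persona, origin) pair.

    The same secret and origin always give the same value; two origins
    get different values, so the identifiers alone do not link them.
    """
    digest = hmac.new(persona_secret, origin.encode("utf-8"), hashlib.sha256).digest()
    # Take 16 bytes of the MAC and stamp the UUID version/variant bits on them.
    return uuid.UUID(bytes=digest[:16], version=4)


# Hypothetical usage: one persona, two origins.
secret = new_persona_secret()
shop_id = origin_scoped_id(secret, "https://shop.example")
ads_id = origin_scoped_id(secret, "https://ads.example")
```

Note that this only stops the identifier itself from acting as a cross-site key; IP address and fingerprinting can still link requests. If the secret were synced between a user's devices, the same per-origin identifiers would be reproduced on each of them, which would bear on problem (b).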
>
> I get it that the UUID offers ‘perfect’ identification (well, a claim to
> be the same persona of the same person), but that only becomes a problem if
> you were trying not to reveal in the first place.
>
> There is a conflict between asking the network “please respect the
> contexts and boundaries of this aspect of me” and “I’m trying to be
> anonymous”. Indeed, it seems silly to say them both at the same time.
>
>
> I’m not wedded to UUIDs, of course, if problem (a) can be solved.
>
> >
> >> I've also not really seen any notion of "multiple distinct browser sessions"
> >> take off. Incognito / private mode enjoys some nontrivial use, but I'm still
> >> amazed at how few people know it exists. The ability to have multiple
> >> distinct profiles exists in Chrome and other browsers, but as much as we as
> >> an industry try to push the notion, I can't say I've ever personally seen
> >> anyone at an airport or cafe (aka not a Google or Apple office) actually
> >> using this. I think the UI / change aversion / inertia present harder
> >> problems than the technical problem of isolation within a profile.
> >
> > I've heard grumblings that the notion of sessions altogether are
> > getting a bit stale in terms of how people use browser UAs... that's a
> > bit depressing to me (I use one browser locked down and then open
> > things that need full cookies, JS, etc. in another browser that scrubs
> > stuff on close (session end)). But I suspect Ian is very correct that
> > making the distinction between nominal/private/persona interaction
> > modes to users is going to be very very hard.
>
> Perhaps. But trivially if you turn on ‘private browsing but not
> anonymous/secret’ then minting a new persona for each session is easy.
> Browsers could also allow you to open not just a ‘new private window’ but
> also ‘a new healthcare window’ or a ‘new birthday-shopping window’ where
> you have ‘saved personas’ called ‘healthcare’ and ‘birthday-shopping’.
>

For private browsing it's equally easy to start with a clean/empty cookie jar. As for the "new private window vs new healthcare window vs new birthday-shopping window" I guess my question is, if we can't even get users to use separate work and personal profiles today (on the UX side), how do we get them to use an arbitrarily large set of profiles?

>
> >
> > best, Joe
> >
> >> My $0.02
> >>
> >> 2015-02-27 16:28 GMT-08:00 David Singer <singer@apple.com>:
> >>
> >>> This is basically a mildly-edited re-statement of the ideas, taking into
> >>> account some of the discussion. I was asked to re-post a summary, in the
> >>> discussion this week at the call.
> >>>
> >>>
> >>> * * * * * * * * * * * *
> >>>
> >>>
> >>> The problem: quite a few browsers today have what they call “private
> >>> browsing mode” or the like. In this mode all local state that is
> >>> accumulated is discarded at the end of the private browsing mode session
> >>> (when the mode is turned off). After turning it off, the local machine has,
> >>> ideally, no trace at all of what was done in the private mode. The discard
> >>> includes browsing history, cookies, local storage etc. I think that
> >>> browsers can/do initialize the private session from the user’s current state
> >>> when they start private mode.
> >>>
> >>> Advantage: if it’s a shared computer, you don’t leave any trace.
> >>>
> >>> So, private browsing sort-of-looks like this, in terms of state: two
> >>> private sessions are started and then ended. These sessions are initialized
> >>> from the base state, which is not updated while the private sessions are in
> >>> process.
> >>>
> >>>
> >>>
> >>>                                                               +[private 2] - - -
> >>>                              +[private 1] - - -               |
> >>>                              |                                |
> >>> [base state] - - - - - - - - + . . . . . . . . . . . . - - - -+ . . . . . . . . . . . .- - - - - - - - - -
> >>> Time ->
> >>>
> >>> This means that private browsing still ‘works’ on the web; cookies flow,
> >>> referer headers, and so on, all as normal. The important aspect of this is
> >>> whether a trace is left on the ‘permanent history’.
> >>>
> >>> Problem statement: the servers are completely unaware of this mode, and so
> >>> any history etc. THEY keep is still visible.
> >>>
> >>> Proposal:
> >>>
> >>> The servers have various means to work out who this is, and attach history
> >>> (these means include cookies, fingerprinting and so on). As noted above, we
> >>> don’t seek to break normal browsing by refusing to accept storage etc. (e.g.
> >>> of cookies), so a simple ‘binary’ signal in an HTTP header “I am trying to
> >>> be private here” doesn’t help, as the server won’t know from request to
> >>> request whether this is part of the same session or not.
> >>>
> >>> Hence, the idea to introduce a header that identifies which ‘private
> >>> session’ the user is in. Since, in fact, this can be used for other purposes
> >>> than private browsing, and it’s logically possible for the browser to have
> >>> multiple windows open, or separate sessions, or to return to a private
> >>> session, we thought this was essentially an indication of what ‘aspect’ of
> >>> the user that was being presented here, their persona. So, we needed a
> >>> session — persona — identifier. Both to make it easy to generate, and to
> >>> make it possible to transfer a private session from one device to another,
> >>> we took the easy route of suggesting that UUIDs are a suitable
> >>> identification tool.
> >>>
> >>> Here is the original suggestion I sent. Note that the server is being
> >>> asked to segregate state, not to stop keeping state. This is about the
> >>> aspect of privacy which is respecting the right context to ‘say’ something:
> >>> ‘why did you say that?’ not ‘why did you know/remember that?’.
> >>> One of the problems with today’s net is not only that servers see and
> >>> remember too much (not addressed here), but they have absolutely no sense
> >>> of when it’s appropriate, or not, to reveal what they know (that is
> >>> addressed here).
> >>>
> >>> * * * * *
> >>>
> >>> The user-agent can send an optional HTTP header ‘Persona:’ whose value is
> >>> a suitable machine-generatable distinct identifier (e.g. a UUID). If the
> >>> header is absent, the user is operating under their default (unlabeled)
> >>> persona, which is distinct from all the identified personas, which in turn
> >>> are also distinct from each other. A user and their user-agent may return
> >>> to a persona at any time, or continue using a persona for any length of
> >>> time. A persona identifier is expected to be universally unique, not
> >>> contextualized to the current user-agent or device.
> >>>
> >>> Servers respecting this are requested to ensure that the labeled personas
> >>> leave no trace or influence on each other or on the unlabeled persona. For
> >>> example, activity under one persona should not affect the ads shown under a
> >>> different persona; any history records that the user can see should be
> >>> distinct for each persona; and so on. (It’s OK for your unlabeled persona to
> >>> be reflected in labeled ones, but optional; if servers wish, they can
> >>> initialize a named persona from the default, un-named one, when they first
> >>> see it.)
> >>>
> >>> Server implementers may choose how long they retain records relating to
> >>> separate personas, just as they do for today’s default persona.
> >>>
> >>> This is NOT a request to stop tracking or keeping records; that is an
> >>> orthogonal question that is covered by activities such as do-not-track,
> >>> cookie directives, and so on. This is about giving users control of their
> >>> privacy by controlling what gets linked to what, and exposed when.
> >>>
> >>> It may be that it is not particularly necessary or valuable to have a
> >>> machine-readable means of discovery over whether servers support this
> >>> feature. Any support that they provide is an improvement on today’s
> >>> experience, where servers are unaware that users are trying to be private.
> >>> Claims of support for this feature are probably better conveyed in
> >>> advertising or other human-readable ways. On the other hand,
> >>> machine-readable claims of support have two advantages: the browser can
> >>> filter or warn about sites that don’t claim to respect it, and while not
> >>> respecting it probably would not be actionable, claiming to and then not
> >>> doing it would be lying to users, which might be.
> >>>
> >>> This feature might also be valuable for shared terminals; for example, in
> >>> libraries, airline lounges, internet cafes and the like, a new persona can
> >>> be minted each time the terminal is unlocked for a new session. Libraries
> >>> might tie the persona to the library card, so users returning get re-linked
> >>> to their online history and so on. It might also be a lightweight
> >>> replacement of logging-in, for browsers on shared devices — a browser might
> >>> have a simple way of saying which family member it is right now (e.g. a
> >>> pull-down menu).
> >>>
> >>> * * * *
> >>>
> >>> I think it’s interesting in a number of respects:
> >>>
> >>> a) it’s an improvement on the status quo, where servers are completely
> >>> unaware of any attempt to be private
> >>>
> >>> b) it’s not asking for *secrecy* at all; servers are at liberty to
> >>> remember as much as before; there are very few privacy proposals that don’t
> >>> slide into trying to be secret, and this is one. Privacy is also about where
> >>> information is exposed, what it is linked to, and so on.
> >>>
> >>> c) it recognizes that privacy is not a binary state — it’s not an
> >>> either-or (you have it or you don’t); it’s a spectrum, and it’s about
> >>> perception and control and exposure as much as it is about recording and so
> >>> on.
> >>>
> >>>
> >>> * * * * * * *
> >>>
> >>> What are some of the potential downsides?
> >>>
> >>> 1) It doesn’t treat servers as adversaries, and if they are, in fact,
> >>> ‘hostile’ might be giving them a clue ‘look here, someone is doing something
> >>> under the covers’
> >>>
> >>> 2) using a UUID for the persona has advantages — they are not
> >>> contextualized by the ‘main’ persona that the server knows or guesses, and
> >>> they can be shared across the user’s devices — but also provides a very
> >>> explicit key ‘this is (this aspect of) me’, which again, for adversarial
> >>> servers, might be an issue
> >>>
> >>>
> >>> Note that there is no attempt to claim “this isn’t me, this is someone
> >>> else” so linking personas is fine, if the server can work out they are the
> >>> same person (e.g. by cookie or other means).
> >>>
> >>>
> >>> David Singer
> >>> Manager, Software Standards, Apple Inc.
> >>>
> >>>
> >>
> >
> >
> >
> > --
> > Joseph Lorenzo Hall
> > Chief Technologist
> > Center for Democracy & Technology
> > 1634 I ST NW STE 1100
> > Washington DC 20006-4011
> > (p) 202-407-8825
> > (f) 202-637-0968
> > joe@cdt.org
> > PGP: https://josephhall.org/gpg-key
> > fingerprint: 3CA2 8D7B 9F6D DBD3 4B10 1607 5F86 6987 40A9 A871
>
> David Singer
> Manager, Software Standards, Apple Inc.
>
>
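
To make the ‘Persona:’ header proposal quoted above easier to reason about, here is a rough toy model of both sides. It is a sketch under assumptions, not an implementation of any real browser or server: the user agent mints a UUID per persona and omits the header for the default persona, and the server keeps a separate history per (user, persona) pair. The `user` argument stands for whatever the server already uses to recognize a visitor (e.g. a cookie), and all names (`mint_persona`, `request_headers`, `SegregatingServer`) are hypothetical.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

DEFAULT_PERSONA = None  # header absent: the default (unlabeled) persona


def mint_persona() -> str:
    """User-agent side: mint a new persona identifier as a random UUID."""
    return str(uuid.uuid4())


def request_headers(persona: Optional[str]) -> Dict[str, str]:
    """Build request headers; the Persona header is omitted for the default persona."""
    return {} if persona is None else {"Persona": persona}


class SegregatingServer:
    """Toy server that keeps one history list per (user, persona) pair."""

    def __init__(self) -> None:
        self._history: Dict[Tuple[str, Optional[str]], List[str]] = defaultdict(list)

    def handle(self, user: str, headers: Dict[str, str], page: str) -> List[str]:
        """Record a page view and return the history visible under this persona."""
        persona = headers.get("Persona", DEFAULT_PERSONA)
        key = (user, persona)
        if persona is not None and key not in self._history:
            # Optional in the proposal: seed a newly seen labeled persona
            # with a copy of the default persona's state.
            self._history[key] = list(self._history[(user, DEFAULT_PERSONA)])
        self._history[key].append(page)
        return list(self._history[key])


# Hypothetical usage: browsing under a labeled persona leaves the default one untouched.
server = SegregatingServer()
server.handle("cookie-123", request_headers(None), "/news")
gift_persona = mint_persona()
server.handle("cookie-123", request_headers(gift_persona), "/rings")
default_view = server.handle("cookie-123", request_headers(None), "/weather")
```

In this model `default_view` contains only the pages visited under the default persona, while the labeled persona's history starts from a copy of the default state and then diverges. Retention and any linking of personas are left entirely to the server, as the proposal allows.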
Received on Tuesday, 3 March 2015 22:40:36 UTC