Re: Liking Linkability from Melvin Carvalho on 2012-10-09 (public-identity@w3.org from October 2012)

From: Melvin Carvalho <melvincarvalho@gmail.com>
Date: Tue, 9 Oct 2012 17:10:52 +0200
To: Henry Story <henry.story@bblfish.net>
Cc: "public-webid@w3.org" <public-webid@w3.org>, "public-identity@w3.org" <public-identity@w3.org>, public-privacy@w3.org, "public-philoweb@w3.org" <public-philoweb@w3.org>
Message-ID: <CAKaEYh+x+xrR3z7pRz-7CYeXiUmPir30FZqq3kmDAGj73ZcXMg@mail.gmail.com>
On 6 October 2012 15:49, Henry Story <henry.story@bblfish.net> wrote:

>
> Notions of unlinkability of identities have recently been deployed
> in ways that, I would like to argue, are often much too simplistic,
> and in fact harmful to wider issues of privacy on the web.
>

It seems to me that there are three phases of the web:

1. Unlinkability -- this was essentially web 1.0 and provided anonymity

2. Pseudo-anonymity -- this was essentially web 2.0 and provided user
logins but also led to walled gardens and data silos

3. Linkability -- perhaps this is the great unsolved problem of web 3.0 and
will provide data portability


>
> I would like to show this in two stages:
>  1. That linkability of identity is essential to electronic privacy
>     on the web
>  2. Show an example of an argument by Harry Halpin relating to
>  linkability, and by pulling it apart show how careful one has
>  to be with taking such arguments at face value
>
> Because privacy is the context in which the linkability or non linkability
> of identities is important, I would like to start with a simple working
> definition of what constitutes privacy with the following minimal
> criterion [0] that I think everyone can agree on:
>
> "A communication between two people is private if the only people
>  who are party to the conversation are the two people in question.
>  One can easily generalise to groups: a conversation between groups
>  of people is private (to the group) if the only people who can
>  participate/read the information are members of that group"
>
> Note that this does not deal with issues of people who were privy to
> the conversation later leaking information voluntarily. We cannot
> technically legislate good behaviour, though we can make it possible
> for people to express context. [1]
>
>
> 1. On the importance of linkability of identities to privacy
> ============================================================
>
> A. Issues of Centralisation
> ---------------------------
>
> We can illustrate this with the following thought experiment, which
> I put to Ben Laurie recently [0].
>
> First imagine that we all are on one big social network, where
> all of our home pages are at the same URL. Nobody could link
> to our profile page in any meaningful way. The bigger the network
> the more different people that one URL could refer to. People
> that were part of the network could log in, and once logged in
> communicate with others in their unlinkable channels.
>
> But this would not necessarily give users of the network privacy:
> simply because the network owner would be party to the conversation
> between any two people or any group of people. Conversations
> that do not wish the network owner to be party to the conversation
> cannot work within that framework.
>
> At the level of our planet it is clear that there will always be a
> huge number of agents that cannot for legal or other reasons allow one
> global network owner to be party to all their conversations. We are
> therefore socio-logically forced into the social web.
>
> B. Linkability and the Social Web
> ---------------------------------
>
> Secondly imagine that we now all have Freedom Boxes [4], where
> each of us has full control over the box, its software, and the
> data on it. (We take this extreme individualistic case to emphasise
> the contrast, not because we fail to acknowledge the usefulness of
> the many intermediate cases.) Now we want to create a
> distributed social network - the social web - where each of us can
> publish information and through access control rules limit who can
> access each resource. We would like to limit access to groups such
> as:
>
>   - friends
>   - friends of friends
>   - family
>   - business colleagues
>   - ...
>
> Limiting access means that, when a resource is accessed, we need to
> determine who is accessing it. For this we need a global identifier
> so that we can check, with the information available to us, whether
> the referent of that identifier is indeed a member of one of those
> groups. We can't have a local identifier, for that would require
> that the person we were dealing with had an account on our private
> box - which will be extremely unlikely. We therefore need a way
> to identify - pseudonymously if need be - agents in a global space.
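
A rough sketch of the group check described above, assuming each agent is
named by a global WebID-style URI. The URIs, the KNOWS graph and the
function names are all made up for illustration; none of them come from
the thread or from any spec.

```python
from typing import Dict, Set

# Hypothetical "knows" graph keyed by global identifiers (assumed URIs).
KNOWS: Dict[str, Set[str]] = {
    "https://me.example/#me": {"https://alice.example/#me", "https://bob.example/#me"},
    "https://alice.example/#me": {"https://carol.example/#me"},
    "https://bob.example/#me": {"https://me.example/#me"},
}


def friends(owner: str) -> Set[str]:
    """Agents the owner links to directly."""
    return set(KNOWS.get(owner, set()))


def friends_of_friends(owner: str) -> Set[str]:
    """Direct friends plus everyone those friends link to, minus the owner."""
    direct = friends(owner)
    reachable = set(direct)
    for friend in direct:
        reachable |= KNOWS.get(friend, set())
    reachable.discard(owner)
    return reachable


def may_access(owner: str, visitor: str, group: str) -> bool:
    """Check whether the visitor's global identifier falls in the named group."""
    resolvers = {"friends": friends, "friends of friends": friends_of_friends}
    return visitor in resolvers[group](owner)
```

The second hop in friends_of_friends is the part that needs linkable
identifiers: the owner has to follow a friend's identifier to that
friend's own list. With purely local account names there would be
nothing to follow.
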
>
> Take the following example. Imagine you come to the WebID TPAC
> meeting [6] and I take a picture of everyone present. I would like
> to first restrict access to the picture to only those members who
> were present. Clearly if I only used local identifiers, I would have
> to get each one of you to first create an account on my machine. But
> how would I then know that the accounts created on the FBox correspond
> to the people who were at the party? It would be much easier if we
> could create a party members group and publish it like this:
>
>    http://www.w3.org/2005/Incubator/webid/team.n3
>
> Then I could drag and drop this group on the access control panel
> of my FBox admin console to restrict access to only those members.
> This shows how through linkability I can restrict access and
> increase privacy by making it possible to link identities in a distributed
> web. It would be quite possible furthermore for the above team.n3
> resource to be protected by access control.
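
To make the drag-and-drop step concrete, here is a minimal sketch of what
the access control panel might record. The group URL is the one quoted
above; the member WebIDs, the resource path and the deny-by-default rule
are my own assumptions, and the real contents of team.n3 are not
reproduced. In practice the member list would be fetched from the group
document and parsed rather than hard-coded.

```python
from typing import Dict, FrozenSet

TEAM_GROUP = "http://www.w3.org/2005/Incubator/webid/team.n3"

# Placeholder members standing in for whatever the group document lists.
GROUPS: Dict[str, FrozenSet[str]] = {
    TEAM_GROUP: frozenset({
        "https://alice.example/profile#me",
        "https://bob.example/card#i",
    }),
}

# Hypothetical rule created by dropping the group onto the picture.
ACL: Dict[str, str] = {"/photos/tpac-2012.jpg": TEAM_GROUP}


def can_read(resource: str, webid: str) -> bool:
    """Allow access only if the WebID is listed in the group guarding the resource."""
    group_url = ACL.get(resource)
    if group_url is None:
        return False  # no rule: deny by default (an assumption of this sketch)
    return webid in GROUPS.get(group_url, frozenset())
```

Note that the rule stores only the group's URL, so the group can be
maintained elsewhere and still be used on my box.
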
>
>
> 2. Example of how Unlinkability can be used to spread FUD
> =========================================================
>
>
> So here I would like to show how fears about linkability can
> then bring intelligent people like Harry Halpin to make some seemingly
> plausible arguments. Here is an example [2] of Harry arguing against
> W3C WebID CG's http://webid.info/spec/
>
> [[
>  Please look up "unlinkability" (which is why I kept referencing the
>  aforementioned IETF doc [sic [3] below it is a draft] which I saw
>  referenced earlier but whose main point seemed missed). Then explain
>  how WebID provides unlinkability.
>
>  Looking at the spec - to me, WebID doesn't as it still requires
>  publishing your public key at a URI and then having the relying party go
>  to your identity provider (i.e. your personal homepage in most cases,
>  i.e. what it is that hosts your key) in order to verify your cert, which
>  must provide that URI in the SAN in the cert. Thus,  WebID does not
>  provide unlinkability. There's some waving of hands about guards and
>  access control, but that would not mediate the above point, as the HTTP
>  GET to the URI for the key is enough to provide the "link".
>
>  In comparison, BrowserID provides better privacy in terms of
>  unlinkability by having the browser in between the identity provider and
>  the relying party, so the relying party doesn't have to ping the
>  identity provider for identity-related transactions. That definitely
>  helps provide unlinkability in terms of the identity provider not
>  needing to knowing every time the user goes to a relying party.
> ]]
>
> If I can rephrase, the point seems to be the following: a WebID
> verification requires that the site you are authenticating to (the
> Relying Party) verify your identity by dereferencing (let me add:
> anonymously) your profile page, which might publicly contain no more
> than your public key. This is the yellow box in the picture here:
>
>   http://www.w3.org/2005/Incubator/webid/spec/#the-webid-protocol
>
> The leakage of information then would not be towards the Relying Party -
> the site you are logging into - because that site is the one you just
> wilfully sent a proof of your identity to. The leakage of information is
> (drum roll) towards your profile page server! That server might discover
> (presumably through IP address sniffing) which sites you might be visiting.
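
A toy model of the verification step under discussion, to show exactly
what the profile server gets to see. It assumes the TLS handshake has
already proved that the client holds the private key, and it represents
keys as plain strings; the class and function names are invented for the
sketch and are not taken from the WebID spec.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Certificate:
    san_uri: str      # WebID carried in the Subject Alternative Name
    public_key: str   # stand-in for the real public key material


@dataclass
class ProfileServer:
    profiles: Dict[str, str]                        # WebID -> published key
    access_log: List[str] = field(default_factory=list)

    def get(self, webid: str, client_ip: str) -> Optional[str]:
        # The only thing the server learns is the address of whoever fetched.
        self.access_log.append(client_ip)
        return self.profiles.get(webid)


def verify_webid(cert: Certificate, server: ProfileServer, rp_ip: str) -> bool:
    """Relying Party side: dereference the WebID and compare the keys."""
    published = server.get(cert.san_uri, client_ip=rp_ip)
    return published is not None and published == cert.public_key
```

The log entry holds the Relying Party's address, not the user's. If the
Relying Party fetched the profile through Tor, the address recorded
would presumably be that of an exit relay instead.
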
>
> One reasonable answer to this problem would be for the Relying Party to
> fetch this information via Tor, which would remove the IP address
> sniffing problem.
>
> But let us develop the picture of who we are (potentially) losing
> information to. There are a number of profile server scenarios:
>
> A. Profile on My Freedom Box [4]
>
>   The FreedomBox is a personal machine that I control, running
> free software that I can inspect. Here the only person who has
> access to the Freedom Box is me. So if I discover that I logged
> in somewhere that should come as no surprise to me. I might even
> be interested in this information as a way of gathering information
> about where I logged in - and perhaps also if anything had been
> logging in somewhere AS me. (Sadly it looks like it might be
> difficult to get much good information there as things stand
> currently with WebID.)
>
> B. Profile on My Company/University Profile Server
>
> As a member of a company, I am part of a larger agency, namely the
> Company or University who is backing my identity as member of that
> institution. A profile on a University web site can mean a lot more
> than a profile on some social network, because it is in part backed
> by that institution. Of course, as members of that institution we
> are part of a larger agenthood. And so it is not clear that the
> institution and I are, in that context, that different. This is also
> why it is often legally required that one not use one's company
> identity for private business.
>
> C. A Social Network ( Google+, Facebook, ... )
>
>   It is a bit odd that people who are part of these networks, and who
> are "liking" pretty much everything on the web in a way that is clearly
> visible and is encouraged by those networks to be visible to the
> network, would have an issue with those sites perhaps knowing (if the
> RP does not use Tor or a proxy) where they are logging in. It is
> certainly not the way that OAuth, OpenID or other protocols that are in
> extremely wide use now have been developed and are used by those sites.
>
> If we then look at BrowserID [7] (now Mozilla Persona), the only real
> difference from WebID (apart from it not being decentralised until
> crypto in the browser really works) is that the certificate is updated
> at short notice - once a day - and that relying parties verify the
> signature. Nor, of course, can the relying party get many interesting
> attributes this way, and if it did then the whole of the unlinkability
> argument would collapse immediately.
>
>
> 3. Conclusion
> =============
>
> Talking about privacy is like talking about security. It is a breeding
> ground for paranoia, which tends to make it difficult to notice important
> solutions to the problem we actually have. Linkability or unlinkability
> as defined in draft-hansen-privacy-terminology-03 [3] come with
> complicated definitions, and are I suppose meant to be applied carefully.
> But the choice of "unlinkable" as a word tends to help create rhetorical
> short cuts that are apt to hide the real problems of privacy. By trying
> too hard to make things unlinkable we are moving inevitably towards a
> centralised world where all data is in big brother's hands.
>
> I want to argue that we should all *Like* Linkability. We should
> do so aware that we can protect ourselves with access control (and Tor)
> and realise that we don't need to reveal anything more than anyone knew
> beforehand in our linkable profiles.
>
> To create a Social Web we need a Linkable ( and likeable ) social web.
> We may need other technologies for running Wikileaks type set ups, but
> they clearly cannot be the basis for an architecture of privacy - even
> if they are an important element in the political landscape.
>
> Henry
>
> [0] this is from a discussion with Ben Laurie
>     http://lists.w3.org/Archives/Public/public-webid/2012Oct/att-0022/privacy-def-1.pdf
> [1] Oshani's Usage Restriction paper
>     http://dig.csail.mit.edu/2011/Papers/IEEE-Policy-httpa/paper.pdf
> [2] http://lists.w3.org/Archives/Public/public-identity/2012Oct/0036.html
> [3] https://tools.ietf.org/html/draft-hansen-privacy-terminology-03
> [4] http://www.youtube.com/watch?v=SzW25QTVWsE
> [6] http://www.w3.org/2012/10/TPAC/
> [7] A Comparison between BrowserId and WebId
>     http://security.stackexchange.com/questions/5406/what-are-the-main-advantages-and-disadvantages-of-webid-compared-to-browserid
>
>
> Social Web Architect
> http://bblfish.net/
>
>
Received on Tuesday, 9 October 2012 15:11:30 UTC