Re: Browser UI & privacy - a discussion with Ben Laurie from Henry Story on 2012-10-05 (public-webid@w3.org from October 2012)

From: Henry Story <henry.story@bblfish.net>
Date: Fri, 5 Oct 2012 18:37:40 +0200
To: Harry Halpin <hhalpin@w3.org>
Cc: Hannes Tschofenig <hannes.tschofenig@gmx.net>, Carvalho Melvin <melvincarvalho@gmail.com>, public-identity@w3.org, public-philoweb@w3.org, Ben Laurie <benl@google.com>, "public-webid@w3.org" <public-webid@w3.org>
Message-Id: <1CA29471-56AF-4BDD-8BF2-07920885870E@bblfish.net>
On 5 Oct 2012, at 17:31, Harry Halpin <hhalpin@w3.org> wrote:

> Henry,
> 
> Sorry for top-posting but I think you missed my rather friendly point. 

ok, so I assume you agree with my replies then.

> 
> Please look up "unlinkability" (which is why I kept referencing the aforementioned IETF doc, which I saw referenced earlier but whose main point seemed missed). Then explain how WebID provides unlinkability.

The fact that there is a draft spec which uses the word "unlinkability" as the title of a section on a security problem does not mean that it is right to do so, or that it has anything to do with the linkability of WebID.

Let us look at the definition

 Definition:  Unlinkability of two or more Items Of Interest (e.g.,
      subjects, messages, actions, ...) from an attacker's perspective
      means that within a particular set of information, the attacker
      cannot distinguish whether these IOIs are related or not (with a
      high enough degree of probability to be useful).

Who is the attacker here?

> Looking at the spec - to me, WebID doesn't as it still requires publishing your public key at a URI and then having the relying party go to your identity provider (i.e. your personal homepage in most cases, i.e. what it is that hosts your key) in order to verify your cert, which must provide that URI in the SAN in the cert.

yes, notice that BrowserId which you praise - now called Mozilla Persona - puts an e-mail address in the equivalent of the SAN field.

> Thus,  WebID does not provide unlinkability. There's some waving of hands about guards and access control, but that would not mediate the above point, as the HTTP GET to the URI for the key is enough to provide the "link". 

So who is the attacker here? My Personal web site, because it might know where I logged in? Or the site I am authenticating to, because it may know who I am when I am trying to login to it? 
> 
> In comparison, BrowserID provides better privacy in terms of unlinkability by having the browser in between the identity provider and the relying party, so the relying party doesn't have to ping the identity provider for identity-related transactions. That definitely helps provide unlinkability in terms of the identity provider not needing to know every time the user goes to a relying party. 

Ah yes, so you are providing security of me from my home site which you are thinking of as the attacker. Weird.

In any case there are a few answers to this for WebID:

  A. a relying party does not have to fetch the WebID profile every time (web docs have times to live)
  B. a relying party could fetch that page anonymously
  C. It would not be problematic to enhance WebID so that a certificate be verified by checking the signature of the issuer

   Let us look at C in more detail.

    One simple answer would be to sign the certificates with the private key of the web server that serves the profiles. Then just by connecting to that server, and before even making the GET request, the relying party's server could know that the certificate was signed by the right server. This may require changing all the certificates of every user if the server cert got compromised, but relying parties could then just fall back to WebID for the transition time.
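
Point A above can be sketched in a few lines. This is only an illustration, assuming a fixed time to live and a hypothetical `fetch` callable supplied by the relying party; a real client would honour the HTTP caching headers sent with the profile document. The class and parameter names are made up for the example and come from no spec.

```python
import time
from typing import Callable, Dict, Tuple
from urllib.parse import urldefrag


class ProfileCache:
    """Keeps fetched WebID profile documents for a fixed time to live."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # hypothetical fetcher: document URL -> document text
        self._ttl = ttl_seconds    # assumed fixed TTL, not taken from HTTP headers
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, webid: str) -> str:
        # The profile document is the WebID without its fragment,
        # e.g. http://bblfish.net/#hjs lives in http://bblfish.net/
        document_url, _fragment = urldefrag(webid)
        now = self._clock()
        entry = self._entries.get(document_url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh: no request reaches the profile server
        document = self._fetch(document_url)
        self._entries[document_url] = (now, document)
        return document
```

While an entry is fresh, the server hosting the profile sees no request at all, so it learns nothing about repeat logins at that relying party.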
 
> It would be interesting to see how WebID community takes the linkability requirement on board. 

  If that is the feature that would bring you over then we can add it.  In fact it falls right out of usage of TLS.

  We'll be at TPAC so we can discuss these issues there in more detail 
      http://www.w3.org/2012/10/TPAC/
   but that can also be brought up on the mailing list.

> Again, Henry, I think there's some good ideas in WebID re use of stronger credentials, as I mentioned earlier. I read the spec, provided comments, etc.  pointing this all out earlier as well. Please take comments on board and fix the spec or tweak the idea.

So I hear this requirement from you. But how would I know that the above solution is good enough?

> BTW, I don't think JS Web Crypto has anything to do with this problem in its current form 

Well, web crypto in the browser is essential for BrowserId to be able to be distributed; otherwise it can only be centralised, relying on a Mozilla lookup. 

> as are references to Frege (yes, you can conceive of multiple personae as intensions with the same extension, but I don't think that's technically relevant). I do think privacy is an important concern that cannot be dismissed by philosophically redefining privacy, although obviously there is important work here :)
> 
>    thanks,
>        harry
> 
> On 10/05/2012 04:22 PM, Henry Story wrote:
>> 
>> On 5 Oct 2012, at 15:14, Harry Halpin <hhalpin@w3.org> wrote:
>> 
>>> Thanks for bringing my thesis up.
>>> 
>>> However, I might add that the inability to support any degree of privacy/anonymity/multiple identities/unlink-ability due to a dogmatic idea over "linking" re URIs re server-to-server connections (See BrowserID for a nice solution to this) and lack of a user-interface is one of the reasons why I doubt WebID in its current form can succeed in the market.
>> 
>> Your statements Harry are completely unfounded, and clearly show that you are feeling quite threatened
>> by WebID. I know your work on the javascript cryptography group can be thought to be in the same ballpark,
>> but I think they are complementary, so take a breath and try to calm down. Don't take technology so personally :-)
>> 
>> The best thing to get over your fears is to start by defending your claims above a little bit, so that I 
>> can help answer them. Let us cut your statements up a little bit, because they are packed
>> with misunderstandings, have no foundation in fact that I know of, and are even self-contradictory.
>> You say:
>> 
>> 1) WebID cannot provide any degree of privacy/anonymity/multiple identities/unlinkability
>>  
>>    Here you have to take your pick and defend your position a little bit. You cannot have all of 
>> them and defend BrowserId (see 3 below). Please refer to spec, or to some real thing we can discuss. 
>> 
>> 2) Dogmatic idea over linking
>> 
>>     If you read my exchange with Ben Laurie, I was arguing that WebID provides an important option in the spectrum of identity
>> solutions, from anonymous, to cookie based authentication, .... WebID does provide a global Identity, the user can choose and
>> control. So here it is not that much different from OpenID. Do they also have a dogmatic idea over linking?
>> 
>> 3) BrowserId is a nice solution
>> 
>>     BrowserId is a solution that is very close to WebID in many ways. See the comparison I put up on Stack Exchange:
>>    http://security.stackexchange.com/questions/5406/what-are-the-main-advantages-and-disadvantages-of-webid-compared-to-browserid
>>    It uses e-mail identifiers which are URIs. So there whatever problem you attribute to WebID it will have too. Furthermore it could use URL based identifiers like WebID too. The disadvantage of BrowserId is therefore clear: it can be used to spam you.
>> 
>> At present, and until the crypto in the browser work you are working on - BrowserID is a centralised solution. 
>> With crypto in the browser it is still a much heavier solution than TLS, because of the well known web principle of simplicity [ http://www.w3.org/DesignIssues/Principles.html ] : You are trying to replace TLS with a Turing Complete language called JavaScript. With TLS we know exactly what the complexity of the code is, and how long it will take to evaluate. In your solution you want the user to download JavaScript code and run it, in order to get the same effect and a "better" UI, where better is in the eyes of the web site holder, not of the user of the web site. Furthermore unless you really do the UI well - and what is the chance of browser vendors doing that if they cannot do the more fundamental transparency first - this solution will be extremely prone to phishing.
>> 
>>   In any case for many applications that are non browser based, eg: robots sending information to other robots using RESTful communication (GET, PUT, POST + Linked Data)  the javascript authentication is way more complicated than is necessary.
>> 
>> 4. Lack of user interface
>> 
>>     The user interface  issues are no worse than with cookie based authentication currently. If you look at the video at 
>> http://webid.info/ you will see that most browsers get it at least pretty good as far as selecting the certificate goes. The problem is just as with cookies, in letting the user know he is logged in. This transparency seems to be a requirement in EU law, at least according to one interpretation by a legal scholar who we were discussing in this thread. 
>> 
>> The issues that remain in the browser don't  invalidate the technology. Otherwise what point would  there be in doing anything at the W3C? You are mixing levels of technology here Harry. 
>> 
>>> I think lots of people have expressed this problem  and the WebID community has never modified their spec to enable these use-cases, and thus WebID is only appropriate to people who want to use RDF,
>> 
>> Ah you really would like a non RDF solution.
>> 
>> Well we have proposed a GRDDL solution, but we are waiting for people who are interested in working on that.
>> 
>> Here you can see it for yourself, we have a section waiting for people from other formats to join:
>>     http://www.w3.org/2005/Incubator/webid/spec/#in-portable-contacts-format-using-grddl
>> 
>> 
>>> don't mind the "self-signed cert" user interface,
>> 
>> I think you are confusing the user interface for when you go to a website with Self Signed Certs
>> from the potentially self signed cert of a client certificate. WebID does not make a requirement for
>> web servers to run with self signed certificates. They can use CA signed certificates, as most of us
>> in fact do at present. We can't wait for the IETF DANE protocol ( RFC6698 ) to get deployed 
>> in browsers as that will indeed be an opportunity to make it much easier for web servers to have 
>> self signed certs that don't show up an ugly error message. 
>> 
>> 
>> But again there is no damage to client certificates having self signed certificates.
>> You can try this out by making yourself a certificate at
>> 
>>   http://my-profile.eu/
>> 
>> and logging in to, say, the foafssl server
>> 
>>   https://foafssl.org/srv/idp?rs=http%3A%2F%2Fbblfish.net%2F
>> 
>> 
>>> and want their public info on a web-page to link all their "identities" together.
>> 
>> What nonsense! Who told you that we want to link all our identities together? 
>> Where in the spec does it say so? Here is a short url for you to look it up:
>>  
>>      http://webid.info/spec/
>> 
>> All you need to make public is your *public* key. You already made that public by accessing
>> the web site with cryptography. All the rest of the information can be protected using the
>> well known HTTP access control. In fact the diagram in the spec clearly shows how 
>> one can protect information
>> 
>>    http://www.w3.org/2005/Incubator/webid/spec/img/WebIdGraph.jpg
>> 
>> 
>>> That is some group of people, I agree, but it's far from a magic bullet solution to identity.
>> 
>> There is no silver bullet in technology. But you are in no position to judge anything on WebID
>> for the moment, given that you don't seem to have done anything more than listen to rumours
>> of what it is about.
>> 
>>> 
>>>  I highly doubt bringing up philosophy will actually help here unless you can clarify what you mean re privacy, anonymity, multiple identities. There was some work by the IETF in this direction that seemed going in the right directions: 
>>> 
>>> https://tools.ietf.org/html/draft-hansen-privacy-terminology-03
>> 
>> Yes, we were just discussing that. Perhaps if you had bothered reading some of the answers in this
>> thread before blurting out a response you'd know about it. See
>> 
>>    http://lists.w3.org/Archives/Public/public-philoweb/2012Oct/0016.html
>> 
>>> 
>>> I also think this discussion should be confined to its proper mailing list.  For example, if it simply becomes FOAF+SSL folks championing the wonders of RDF, then perhaps the discussion should remove other mailing lists than WebID.
>> 
>> We are discussing on the Identity mailing lists, I think that is quite appropriate. 
>> 
>>> If it's a philosophical discussion, then I'd keep it on philoweb. Or an identity discussion that's not dogmatic, keep on public-identity. This is basic etiquette.
>> 
>> I think you have written one mail to the PhiloWeb mailing list since its start. We are discussing issues
>> of reference, privacy, and a number of other issues that are relevant to the group. You are clearly
>> feeling a bit unsettled today, as otherwise you'd have noticed that we were discussing Frege, and
>> intensionality and extensionality in relation to the web, ( see below in the message you quoted) and
>> furthermore in relation to draft-hansen-privacy-terminology-03 which you suggested we read.
>> 
>> 
>> 
>>> 
>>>    cheers,
>>>        harry
>>> 
>>> 
>>> On 10/04/2012 09:24 PM, Henry Story wrote:
>>>> [resent as the image was too big and so stripped from the mailing
>>>>  list, making one part of the text incomprehensible ]
>>>> 
>>>> On 4 Oct 2012, at 17:10, Hannes Tschofenig <hannes.tschofenig@gmx.net> wrote:
>>>> 
>>>>> Hi Melvin, 
>>>>> 
>>>>> On Oct 4, 2012, at 4:49 PM, Melvin Carvalho wrote:
>>>>> 
>>>>>> I think the aim is to have an identity system that is universal.  The web is predicated on the principle that an identifier in one system (eg a browser) will be portable to any other system (eg a search engine) and vice versa.  The same principle applied to identity would allow things to scale globally.  This has, for example, the benefit of allowing users to take their data, or reputation footprint with them across the web.  I think there is a focus on WebID because it is the only identity system to date (although yadis/openid 1.0 came close) that easily allows this.  I think many would be happy to use another system if it was global like WebID, rather than another limited context silo.
>>>>> 
>>>>> I think there is a lot of confusion about the difference between identifier and identity. You also seem to confuse them. 
>>>>> 
>>>>> Here is the difference: 
>>>>> 
>>>>>   $ Identifier:   A data object that represents a specific identity of
>>>>>      a protocol entity or individual.  See [RFC4949].
>>>>> 
>>>>> Example: a NAI is an identifier 
>>>>> 
>>>>>   $ Identity:   Any subset of an individual's attributes that
>>>>>      identifies the individual within a given context.  Individuals
>>>>>      usually have multiple identities for use in different contexts.
>>>>> 
>>>>> Example: the stuff you have at your Facebook account
>>>> 
>>>> This is a well-known distinction in philosophy. You can refer to things in two ways:
>>>>  - with names ( identifiers ) 
>>>>  - with existential variables ( anonymous names if you want ), and attaching a description to that
>>>>    thing that identifies it uniquely among all other things
>>>> 
>>>> So for example Bertrand Russell considered that "The Present King of France" in "The Present King of France is Bald" was
>>>> not acting like a proper name, but as an existential variable with a definite description. That is in 
>>>> mathematical logic he translated that phrase to:
>>>> 
>>>>    ∃x[PKoF(x) & ∀y[PKoF(y) → y=x] & B(x)]
>>>> 
>>>> See http://en.wikipedia.org/wiki/Definite_description
>>>> Harry Halpin goes into this in his Philosophy of the Web Thesis
>>>>   http://journal.webscience.org/324/
>>>> http://www.ibiblio.org/hhalpin/homepage/thesis/
>>>> 
>>>> So yes we know this, and understand this very well. The Semantic Web is an outgrowth of 
>>>> Fregean logic, tied to the Web through URIs, and with some of the best logicians 
>>>> in the world  having worked on its design. This is our bread and butter.
>>>> 
>>>> In fact in WebID we are using this to our advantage. What we do is we use 
>>>> a URI - a universal identifier - to identify a person, in such a way that it is
>>>> tied to a definite description as "the agent ID that knows the private key of public
>>>> key Key".
>>>> 
>>>> [ image available at:
>>>>   http://www.w3.org/wiki/images/4/49/X509-Sense-and-Reference.jpg ]
>>>> 
>>>> The text in the document named "http://bblfish.net/" says:
>>>> 
>>>> <#hjs> foaf:name "Henry Story";
>>>>             cert:key [ a cert:RsaPublicKey; cert:modulus ... ; cert:exponent ... ]
>>>> 
>>>> 
>>>> So in the above the Identifier is "http://bblfish.net/#hjs" which refers to <http://bblfish.net/#hjs> 
>>>> (me) which you can recognise as the knower of the private key
>>>> published on the http://bblfish.net/ web page (in RDFa, in this case)
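
The check described above can be sketched roughly as follows. The TLS handshake has already shown that the client knows the private key for the public key in its certificate; what remains for the relying party is to see whether the profile document at the WebID lists that same public key. This is a minimal sketch: parsing the certificate and the RDF is left out, and `RsaPublicKey` and `claim_matches_profile` are hypothetical names, not part of any spec or library.

```python
from typing import Iterable, NamedTuple


class RsaPublicKey(NamedTuple):
    """Mirrors the cert:modulus and cert:exponent values of a cert:RsaPublicKey."""
    modulus: int
    exponent: int


def claim_matches_profile(cert_key: RsaPublicKey,
                          profile_keys: Iterable[RsaPublicKey]) -> bool:
    """Return True if the certificate's key is one of the keys the profile lists.

    cert_key:     the public key taken from the client certificate (assumed
                  to have been extracted already).
    profile_keys: the keys found via cert:key for the WebID in its profile
                  document (assumed to have been parsed already).
    """
    return any(cert_key == key for key in profile_keys)
```

A profile may list several keys, for instance one per device, which is why the comparison runs over a collection rather than a single value.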
>>>> 
>>>>> 
>>>>> To illustrate the impact for protocols let me try to explain this with OpenID Connect. 
>>>>> 
>>>>> OpenID Connect currently uses SWD (Simple Web Discovery) to use a number of identifiers to discover the identity provider, see http://openid.net/specs/openid-connect-discovery-1_0.html 
>>>>> 
>>>>> The identifier will also have a role when the resource owner authenticates to the identity provider. The identifier may also be shared with the relying party for authorization decisions. 
>>>>> 
>>>>> Then, there is the question of how you extract attributes from the identity provider and to make them available to the relying party.
>>>> 
>>>> In WebID that is easy for public info: you use HTTP GET.
>>>> Otherwise you put protected info into protected resources, link to them from the WebID profile, 
>>>> and apply WebID recursively to the people requesting information about that resource. Ie: you
>>>> protect the resources containing information that needs protecting.
>>>> 
>>>> This makes it possible to describe people and their relations extremely richly,
>>>> and it allows one to be very fine grained in who one allows access to information.
>>>> 
>>>> 
>>>>> There, very few standards exist (this is the step that follows OAuth). The reason for the lack of standards is not that it isn't possible to standardize these protocols but there are just too many applications. A social network is different from a system that uploads data from a smart meter. Facebook, for example, uses their social graph and other services use their own proprietary "APIs" as well. 
>>>> 
>>>> Yes, I know people keep saying it's impossible, and then we have trouble showing them - 
>>>> since the impossible cannot be seen.
>>>> 
>>>> Btw in WebID we use
>>>> 
>>>> The one well-known API: HTTP.
>>>> A semantic/logic model: RDF and mappings from syntax to that model - which
>>>> is based on Relations which I think Bertrand Russell showed to be pretty much all you needed.
>>>> 
>>>> Then it is a question of working together and developing vocabularies that metastabilise.
>>>> (More on that in a future video). 
>>>> 
>>>>> 
>>>>> This is the identity issue. 
>>>>> 
>>>>> You are mixing all these topics together. This makes it quite difficult to figure out what currently deployed systems do not provide. 
>>>>> 
>>>>> Ciao
>>>>> Hannes
>>>>> 
>>>> 
>>>> Social Web Architect
>>>> http://bblfish.net/
>>>> 
>>> 
>> 
>> Social Web Architect
>> http://bblfish.net/
>> 
> 

Social Web Architect
http://bblfish.net/
Attachments

application/pkcs7-signature attachment: smime.p7s
Received on Friday, 5 October 2012 16:38:15 UTC