
Re: Should WebIDs denote people or accounts?

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 19 May 2014 09:23:18 -0400
Message-ID: <537A05C6.3030205@w3.org>
To: Kingsley Idehen <kidehen@openlinksw.com>, public-webid@w3.org
This is a long message, and I don't have time to write a full reply 
right now, so I'll just respond to a few key points.

On 05/19/2014 08:04 AM, Kingsley Idehen wrote:
> On 5/18/14 8:31 PM, Sandro Hawke wrote:
>>> How do you know that two IRIs denote the same thing without an 
>>> owl:sameAs relation? Or without participation in an IFP based 
>>> relation? How do you arrive at such conclusions?
>>> If a WebID doesn't resolve to an Identity Card (or Profile Document) 
>>> comprised of owl:sameAs or IFP based relations, how can you claim 
>>> coreference? You only know that two or more IRIs denote the same 
>>> thing by way of discernible and comprehensible relations.
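[Editorial note: the IFP-based coreference being debated here can be sketched as follows — a toy, stdlib-only illustration, not anyone's actual implementation; all identifiers are invented.]

```python
# Toy sketch: how an inverse functional property (IFP) such as foaf:mbox
# lets a reasoner conclude that two IRIs denote the same thing.
# All IRIs and the triple representation here are invented for illustration.

FOAF_MBOX = "foaf:mbox"  # declared an owl:InverseFunctionalProperty in FOAF

triples = {
    ("ex:webid-A", FOAF_MBOX, "mailto:me@example.org"),
    ("ex:webid-B", FOAF_MBOX, "mailto:me@example.org"),
}

def infer_same_as(triples, ifp):
    """If two subjects share the same object of an IFP, infer owl:sameAs."""
    by_object = {}
    for s, p, o in triples:
        if p == ifp:
            by_object.setdefault(o, set()).add(s)
    inferred = set()
    for subjects in by_object.values():
        for a in subjects:
            for b in subjects:
                if a != b:
                    inferred.add((a, "owl:sameAs", b))
    return inferred

inferred = infer_same_as(triples, FOAF_MBOX)
# Both directions of owl:sameAs follow purely from the shared mailbox;
# no explicit owl:sameAs statement was ever published.
```

This is the "discernible and comprehensible relations" point in miniature: coreference here is derived from a relation, not asserted directly.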
>> You're putting the burden of proof in the wrong place.
> An Identity Card holds Identity claims.
> Verifying the claims in an Identity Card is handled by an ACL or 
> Policy system. One that's capable of making sense of the claims and 
> then applying them to ACL and Policy tests en route to determining Trust.
> A WebID is like your Passport Number.
> A WebID-Profile is like your Passport.
> The WebID-TLS protocol is a protocol used by the Passport Issuer (this 
> entity has a Trust relationship with Immigration Services).
>> You (and the rest of the WebID community, including me until about 
>> 5 days ago) model the world in such a way that if your access-control 
>> reasoner ever got hold of some forbidden knowledge (the perfectly 
>> correct fact that two of my WebIDs co-refer) it would do the wrong thing.
> Please don't speak for us (OpenLink Software) as I know that simply 
> isn't the case with our ACL engine. You see, you are making the same 
> old mistakes that tend to permeate these efforts. As I told you, we 
> actually start our implementations from the point of vulnerability. 
> You get to understand the point of vulnerability when you understand 
> the concepts behind a specification.
> I spent a lot of time drumming home the fact that we needed a 
> conceptual guide for WebID so that we simply wouldn't end up here, 
> i.e., with you coming along and assuming everyone has implemented 
> exactly the same thing.
> If we spent more time performing interoperability tests of 
> implementations, others would have come to realize these issues 
> too, and factored that into their work.
> As far as I know, we are the only ones performing serious WebID-TLS 
> based ACL testing against ourselves. Thus, you really need to factor 
> that into your implementation assumptions re. ACLs, which for all 
> intents and purposes aren't generally standardized, as far as I know.

I'm sorry for suggesting that OpenLink's software was in any way 
insecure or poorly designed.    Knowing you and your company I'm 
confident that's not the case.   I was being careless in my argument, 
conflating two different system designs (as explained below).   I'll try 
to be much more careful in the future.

>> That sounds to me like a fundamentally flawed design for an access 
>> control system.
> The ACL system is yet another component, distinct from WebID, 
> WebID-Profile Documents, and WebID-TLS. These are not the same thing; 
> they are loosely coupled. You can have many different authentication 
> protocols and ACL systems working with WebIDs. In fact, that's how 
> things will pan out in the future.
>> I don't have to show exactly how it's going to get hold of that data. 
>> Rather, to show the system is reasonably secure, you have to show 
>> it's vanishingly unlikely that the reasoner ever could come across 
>> that data.
> You don't publish what you don't want to be manhandled. The problem is 
> that all the systems today overreach without understanding the 
> implications of said actions.
> I don't hide my Email Address because:
> 1. I sign my emails
> 2. I have sophisticated mail filtering schemes that basically leverage 
> the power of RDF.
>>>> What you're talking about is whether a machine might be able to 
>>>> figure out that truth.
>>> No, I am saying that you determine the truth from the relations that 
>>> represent the claim.
>>>> If I have two different WebIDs that denote me, and you grant access 
>>>> to one of them, it's true a machine might not immediately figure 
>>>> out that that other one also denotes me and should be granted equal 
>>>> access.  But if it ever did, it would be correct in doing so. 
>>> Only if it applied inference and reasoning to specific kinds of 
>>> relations. It can't just jump to such conclusions. You don't do that 
>>> in the real world, so why does it somehow have to be the case in the 
>>> digital realm?
>> It's not out of the question someone might state the same 
>> foaf:homepage for both their WebIDs, or any of a variety of other 
>> true facts.
> Human beings make mistakes. You can't model for eradicating Human 
> mistakes. What you can do is make systems that reduce the probability 
> of said mistakes. Our systems minimize the amount of personally 
> identifiable information that goes into a profile document. We take an 
> ultra conservative approach bearing in mind that folks make mistakes 
> when they don't fully understand the implications of their actions.
>> If they did that, and it resulted in an access violation, I'd point 
>> the finger of blame at the design of the system (using WebIDs to 
>> denote people), not the user who provided that true data.
> A WebID-TLS based authentication service should be able to distinguish 
> between a homepage and a WebID. If it can't do that, then the 
> implementation is at fault, not the WebID, WebID-Profile, WebID-TLS 
> specs.
>>>> And I'm betting, with machines getting access to more and more data 
>>>> all the time, and doing more and more reasoning with it, it would 
>>>> figure that out pretty soon.
>>> Email addresses are ample for reconciling coreferences. Thus, if an 
>>> email address is the object of an appropriate relation, then 
>>> coreference can be discerned and applied where relevant etc..
>>>> It sounds like you're proposing building an authorization 
>>>> infrastructure that relies on machines not doing exactly what we're 
>>>> trying to get them to do everywhere else.  Sounds a bit like trying 
>>>> to hold back a river with your hands.
>>> Quite the contrary, I am saying there is a method to all of this, in 
>>> the context of WebID, WebID-Profile, WebID-TLS, and ACLs etc.. These 
>>> items are loosely coupled, and nothing we've discussed so far makes a 
>>> defensible case for now catapulting a WebID from an HTTP URI that 
>>> denotes an Agent to one that denotes an Account. We don't have this 
>>> kind of problem at all.
>> You keep saying that, but you haven't explained how we can be assured 
>> that facts stated with regard to one of my WebIDs will never end up 
>> correctly -- but harmfully -- applied to one of my other WebIDs.
> I have, and I repeat:
> 1. owl:sameAs claims are signed by way of reified statements that 
> include relations that incorporate signature
> 2. signed claims by way of incorporation of the multiple WebIDs in 
> Cert. SAN or via inlined claims using data: extension
> 3. not reasoning on owl:sameAs or IFP relations.
> Today, I believe #3 is the norm. We support 1-3 in our products. In 
> addition, we can factor the Cert. Issuer and many other factors into 
> our ACL processing.
> If you had seen all the time we've spent on actual ACL testing, you 
> would come to realize how we have factored these issues and more 
> into our actual implementation of an RDF based ACL engine that's 
> capable of working with WebID-TLS.
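[Editorial note: the policy in points 1-3 above — only act on owl:sameAs claims that are appropriately signed, and otherwise don't reason on them at all — can be sketched as a toy filter. `signature_ok` is a hypothetical stand-in for real signature verification; all identifiers are invented.]

```python
# Toy sketch of a signature-gated sameAs policy: an ACL engine that
# ignores owl:sameAs claims unless they are signed by the profile author.
# Everything here (tuple shape, names, the signature check) is illustrative.

def usable_same_as(claims, profile_author, signature_ok):
    """Keep only the owl:sameAs claims the engine is configured to trust.

    claims: iterable of (subject, predicate, object, signer) tuples.
    signature_ok: callable(signer) -> bool, standing in for verifying a
    signature against the profile author's key.
    """
    trusted = []
    for s, p, o, signer in claims:
        if p != "owl:sameAs":
            continue
        if signer == profile_author and signature_ok(signer):
            trusted.append((s, p, o))
        # Unsigned or third-party sameAs claims are simply not reasoned
        # on (point 3 above): they carry no weight in access decisions.
    return trusted

claims = [
    ("ex:webid-A", "owl:sameAs", "ex:webid-B", "ex:profile-author"),
    ("ex:webid-A", "owl:sameAs", "ex:webid-C", "ex:third-party"),
]
trusted = usable_same_as(claims, "ex:profile-author", lambda signer: True)
# Only the author-signed claim survives; the third-party claim is inert.
```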
>>>>>> To avoid that undesired fate, I think you need WebIDs to denote 
>>>>>> personas.
>>>>> No, a persona is derived from the claims that coalesce around an 
>>>>> identifier. A persona is a form of identification. A collection of 
>>>>> RDF claims give you a persona.
>>>>>>    As I mentioned, those personas might be software agents, but 
>>>>>> they are clearly not people.
>>>>> WebIDs denote Agents. An Agent could be a Person, Organization, or 
>>>>> Machine (soft or hard). You can make identification oriented 
>>>>> claims in a Profile Document using RDF based on a WebID.
>>>> The question is, what kind of triples are being written with WebIDs,
>>> None.
>>> A WebID is an HTTP URI that denotes an Agent.
>>> Basically,
>>> ## Turtle Start ##
>>> <#WebID>
>>>     a <#HttpURI> ;
>>>     <#denotes> [ a foaf:Agent ] .
>>> <#HttpURI>
>>>     a <#Identifier> .
>>> ## Turtle End ##
>> Personally I don't find this kind of content useful.
> I am just explaining what I understand a WebID to be i.e., an HTTP URI 
> that denotes an Agent.
>> I prefer to keep Turtle for showing the actual data that would be in 
>> a running system.
> No, not in this case, hence the example. I am drilling down to the 
> foundation of the assertion. If we claim that a WebID denotes an 
> Agent, then we can express that fact in Turtle or any other RDF notation.
>> Like the triples which use WebIDs to guide your access control 
>> system. If I added the foaf:homepage triples I mentioned, and your 
>> system did OWL RL (for example) wouldn't it grant access to the wrong 
>> WebID (in addition to the right one)?
> See my earlier comments about reasoning which has always been 
> controlled in our products. An ACL engine can't just infer coreference 
> without any kind of configurable inference modality. That's an exploit 
> that compromises the system, period.
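[Editorial note: the scenario in this exchange — an ACL keyed on WebIDs, plus an OWL RL style prp-ifp rule over foaf:homepage, which FOAF does declare inverse-functional — can be made concrete with a toy model; all IRIs are invented.]

```python
# Toy sketch of the disputed scenario: access is granted to one WebID,
# both WebIDs state the same foaf:homepage, and the outcome depends on
# whether the engine reasons over inverse functional properties.

HOMEPAGE = "foaf:homepage"  # inverse-functional in the FOAF vocabulary

data = {
    ("ex:webid-1", HOMEPAGE, "http://example.org/home"),
    ("ex:webid-2", HOMEPAGE, "http://example.org/home"),
}
acl_whitelist = {"ex:webid-1"}  # access was granted to webid-1 only

def equivalents(webid, triples, ifp):
    """prp-ifp, informally: subjects sharing an IFP object co-refer."""
    objs = {o for s, p, o in triples if s == webid and p == ifp}
    same = {webid}
    for s, p, o in triples:
        if p == ifp and o in objs:
            same.add(s)
    return same

def allowed(webid, reason_over_ifps):
    candidates = (equivalents(webid, data, HOMEPAGE)
                  if reason_over_ifps else {webid})
    return bool(candidates & acl_whitelist)

leak = allowed("ex:webid-2", reason_over_ifps=True)    # access follows the inferred sameAs
no_leak = allowed("ex:webid-2", reason_over_ifps=False)  # no inference, no propagation
```

Sandro's position is that the first outcome is *correct* inference over true data, so the modelling is at fault; Kingsley's is that an engine performing it without a configured inference modality is itself the exploit.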

This is the core of the issue, and it may just be a point on which we 
have to agree to disagree.

I think systems should be designed so that giving them more correct 
information will do no harm other than possibly cause performance problems.

I might be able to make a principled argument for this preference of 
mine, but it's kind of a separate issue.

> BTW -- This issue has been raised and discussed over the years re. 
> WebID and WebID-TLS (from the days of FOAF+SSL).
>>>> and what happens when machines figure out all my WebIDs denote me? 
>>> Now, we have a WebID-Profile document which describes what a WebID 
>>> denotes. That document is comprised of claims which may or may not 
>>> indicate co-reference via owl:sameAs and/or IFP based relations 
>>> (e.g., foaf:mbox). None of this means a WebID denotes an Account.
>> I'm not saying it DOES denote an account, just that it SHOULD, in 
>> order to get the persona-separation that people demand.
> The Persona separation is already in place. You don't seem to want to 
> accept that fact. I say that because your claim is only true if we now 
> conflate a WebID (Identifier) and the WebID-Profile (document) 
> combined with all RDF claims as being processed as gospel. An 
> "Account" is another type of thing. A "Persona" is what's discerned 
> from the claims in an Identity Card or Profile Document.
> I can make an ACL rule in our system that decides you are only trusted 
> if your homepage is referenced in at least one blog post, or a blog post 
> that is associated with the tag "#WebID" or whatever. The ACL system 
> processes relations expressed using RDF. It doesn't have any 
> hard-coded behavior and it has the option to override certain relation 
> semantics.
> All things being equal, you will see a live online shopping system 
> based on WebID, WebID-Profile, WebID-TLS, and our ACL system. I would 
> be happy to see you break it.
>> It seems clear to me that using WebIDs to denote people is an 
>> actively dangerous and harmful design.
> Using an HTTP URI to denote an Agent is an actively dangerous and 
> harmful design?
>> Either it should be fixed or WebIDs should be scrapped.    Or, of 
>> course, you can show how I'm wrong.
> How can I show you that you are wrong when you don't seem to be 
> willing to make an actual example, i.e., negate an existing WebID-TLS 
> based system by getting access to a protected resource? Would you be 
> ready to try something as practical as that, bearing in mind current 
> ACL systems aren't even supposed to support owl:sameAs and IFP 
> relations based reasoning by default?
>>> The fact that tools can figure out that an IFP based relation with a 
>>> mailto: scheme URI as its object is a way to triangulate coreference 
>>> still has no bearing on the case for a WebID denoting an Account.
>>>> Are you really being careful with every triple you write using 
>>>> WebIDs to make sure it will still be exactly what you want to say 
>>>> when a reasoner adds more triples just like it using my other WebIDs?
>>> Absolutely!!
>>> Even when dealing with owl:sameAs, we implement a verifier that 
>>> won't invoke any reasoning and inference unless those statements are 
>>> signed by the WebID-Profile document author, or unless those claims 
>>> are part of the certificate (e.g., multiple WebIDs in the SAN, or 
>>> using the Data extension to embed Turtle notation based RDF claims 
>>> in the certificate).
>>>> It sounds to me like you are not.   It sounds to me like you're 
>>>> just assuming that certain valid inferences will never be made.
>>> Of course not, as per comment above.
>> You're saying the inferences will never be made because the reasoners 
>> will never get hold of the data that would support the conclusion 
>> that both my WebIDs denote the same person?
> I am saying even if it did, subject to modality, it wouldn't necessarily 
> perform the inference and reasoning in question (i.e., on the claims 
> expressed in the RDF statements it receives). To us, good design 
> includes understanding that, more often than not, stuff actually goes 
> wrong. I am enthusiastic about open standards but very pessimistic in 
> regards to actual design and code implementation.
>>   I don't think systems should ever be built on assumptions like 
>> that.  It's not just insecure, but it forces us to carefully limit 
>> the flow of information between systems which trust each other and 
>> operate on behalf of the same persona.
> Again, that isn't what I am saying. I am saying: claims that are 
> *logically truthful* aren't *necessarily factual* in the context of an 
> ACL system. 

To my understanding, "logically truthful" and "necessarily factual" are 
the same thing, so let me rephrase.

What I think you're saying is: some claims that are logically truthful 
will never be present in the ACL system's knowledge base.

> An ACL system operator should be the final arbiter as to what's 
> factual. Thus, owl:sameAs and IFP relations aren't gospel, they too 
> are claims which may or may not have sway in regards to actual Trust 
> determination.

In my view, a system should always "trust" information that it knows to 
be "true".

I'm uncomfortable with a design where a system has to consider some 
triples to be untrusted, even though they are true.  I think that's what 
you're proposing.

I know systems do sometimes have to disregard triples they know to be 
true for performance reasons, but I'd rather keep that to being just 
about performance, not about security.

> I believe in "context fluidity" and "context lenses" above all else. 
> As far as I know, that's how the real-world tends to work too.

Combining information from different contexts is very hard, and one of 
the great things about RDF is you don't have to do it nearly as much.

So, to summarize where I think we are:

* We agree it's important to have the functionality where a person can 
log in different ways and get access to different things
* You say those are different WebIDs and different Personas; I say those 
are different Accounts
* You keep them distinct by limiting inference
* I think it's better to keep them distinct by having them be distinct 
resources in the RDF modelling
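[Editorial note: the last bullet — keeping personas distinct as distinct resources rather than by limiting inference — can be sketched as a toy model; all identifiers are invented.]

```python
# Toy sketch of account-based modelling: the ACL grant names account
# resources, not the person. A later (correct) inference that both
# accounts have the same holder then does nothing to merge permissions.

person_of = {"ex:account-1": "ex:alice", "ex:account-2": "ex:alice"}
acl = {"ex:resource": {"ex:account-1"}}  # the grant names an account, not a person

def account_allowed(account, resource):
    return account in acl.get(resource, set())

# The truthful fact that both accounts have the same holder is present...
assert person_of["ex:account-1"] == person_of["ex:account-2"]

# ...but it is not an owl:sameAs between the accounts themselves, so the
# grant does not propagate, and no inference needs to be suppressed.
ok_1 = account_allowed("ex:account-1", "ex:resource")  # granted
ok_2 = account_allowed("ex:account-2", "ex:resource")  # not granted
```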

        -- Sandro

Received on Monday, 19 May 2014 13:23:27 UTC
