Re: Named graphs etc from Patrick Stickler on 2004-03-15 (www-archive@w3.org from March 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 15 Mar 2004 10:54:57 +0200
To: "ext Pat Hayes" <phayes@ihmc.us>
Cc: "ext Chris Bizer" <chris@bizer.de>, <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>
Message-Id: <6F30C5E7-765E-11D8-A56E-000A95EAFCEA@nokia.com>
On Mar 12, 2004, at 18:59, ext Pat Hayes wrote:

>
>> On Mar 11, 2004, at 19:39, ext Pat Hayes wrote:
>>
>>>
>>>> On Mar 10, 2004, at 15:09, ext Chris Bizer wrote:
>>>
>>> <snip>
>>>
>>>>
>>>>>>
>>>>>> Hmmm....  couldn't one view the insertion of graph qualification
>>>>>> statements specifying assertion and authentication as being
>>>>>> equivalent to a "speech act", the graph being the utterance?
>>>>>>
>>>>>
>>>>> Also hmmm ... and I think we should forward this question to Pat.
>>>
>>> See earlier message on 'web acts'. It does seem odd to me to say 
>>> that a graph can perform an act such as asserting. Suppose a graph 
>>> slanders me: can I sue the graph for damages? We have to get genuine 
>>> agents into the picture somehow.
>>
>> No, no, no, Pat.
>>
>> The graph is not being positied as a sentient entity.
>>
>> Rather, the owner/creator/publisher of the graph uses special
>> vocabulary within statements in the graph to assert/sign the
>> graph -- in a way that such qualificatations can be authenticated
>> to a resonable degree.
>
> I fail to see how the use of vocabulary IN a graph can POSSIBLY 
> constitute a signature or warrant. Anyone can write anything into a 
> graph.

Yes and no. If the signature includes a checksum of some sort by which
the contents of the graph can be (to some degree) verified, then it
becomes harder to create fraudulent graphs -- and those 
agents/publishers
which have much to lose from fraud (e.g. banking services) will invest
more time/effort in checksumming than others.

So, a recieving agent can validate/verify a graph to different degrees.

It may simply take the statement about the authority at face value
and believe it.

It may take the signature and verify it using some PKI machinery.

It may submit the entire graph to the specified authority for
verification, which can be checksum based, per a checksum in
the signature, or even compared against some existing knowledge
base, etc.

Different agents will choose/need to use different degrees of
validation. But the qualification machinery (the vocabulary) is
consistent across all cases.


>  I know your idea is that some kind of correlation between the 'owner' 
> of the graph and a reference in the graph to that owner is what does 
> the trick: and I agree.
>
> Jeremy suggests we allow
>
> ex:graphname rdfx:assertedBy ex:thing .
>
> I want us to essentially punt on what sort of thing exactly ex:thing 
> is (its a 'web agent') and on exactly how to tell if a triple like 
> this is being published by the person it says it is being published 
> by. I think this is going to be a nasty thing to get sorted out and 
> might take a long time and we should let others do it.

Well, I "punt" by simply defining an RDF class trix:Authority
which vaguely defines a class of "things" which may assert
graphs.

To what extent members of that class intersects with members of
other classes can remain an open question -- and for the most
part, an irrelevant one.

One could think of a trix:Authority as that entity that has
direct liability for the claims being made. If that entity is
an agent (human, machine, whatever) acting on behalf of another
entity, then there are legal mechanisms for transferring
liability along to the ultimate culprit (i.e. the human manager
of a web agent, etc.).

But I agree, the less said the better (for now).

>
> We could of course offer the pragmatic advice that if ex:thing is a 
> graph or a web resource, then the agent is understood to be the owner 
> of that resource. But this is a work-around, seems to me, rather than 
> a principled way to handle this issue.  (Can't you just hear the 
> debates this will produce? People are still arguing about using 
> homepage URIs to identify people.)
>
> I would like us to punt on that aspect of the whole matter, and just 
> assume that there is some externally-provided way to determine if the 
> agent doing the publishing is the one referred to in the graph, which 
> is all that really matters. Having ex:thing be the URI of the graph or 
> document is one way, but there might be others.

I think it's useful to have at least some vague definition
of what the object of assertion statements are -- rather than
being completely silent, as I think we would get as much
criticism for saying nothing than for saying too much.

At least defining the class trix:Authority gives folks a term
to use in their debates ;-)

>
>> By "self-asserting", I simply meant that the bits needed to
>> determine if a graph is asserted are in the graph.
>
> But they aren't: you have to know the URI of the graph as well as the 
> graph itself, in any case.

True. Point taken.

But no other graph is necessary.

In fact, the authoritative/official URI denoting the graph could be 
included
in the encrypted signature -- as an additional means of validating the
bootstrapping statements.

>>
>>
>> Insofar as this latter question is concerned, I don't see one
>> graph specifying the assertiveness of another graph as practical.
>
> Well, I disagree. It is practical because if first-party references 
> can be made safe, then so can third-part ones; and it seems to me to 
> be extremely useful as a tool for providing warrants and so on.

Agreed. I misspoke. Sorry.

What I meant was that I failed to see as practical a system
in which *every* assertion was essentially third-party.

Without being able to terminate those assertion chains at graphs
which have within themselves the terminal, bootstrapping statements
such that the agent need not look to yet another graph to determine
the assertiveness/authenticity/trustworthiness of that graph, you
simply go on forever and ever without ever properly grounding your
trust model.

That's what I meant.

Yes, absolutely, third party assertions are useful, but not
sufficient in themselves.

>
>>>>
>>>> Restraining the boostrapping machinery to each graph prevents
>>>> folks from speaking on behalf of others.
>>>
>>> You don't speak on behalf of others by using their words to make an 
>>> assertion that they havn't made. If you SAY that they have made an 
>>> assertion that they havn't in fact made, or if you pretend to be 
>>> them, then you are lying: and we need to be able to check up on 
>>> liars and detect the lies quickly and reliably.
>>
>> How? If the publisher of a graph says nothing about whether the graph
>> is asserted or not, how can anyone disagree with me if I say it is?
>
> People can say whatever they like. Why should anyone believe them, is 
> the question. Ultimately, the only firm authority for a claim that A 
> asserted something is an actual assertion by A. If we can check an 
> assertion by A to the effect that A asserts a first-person graph, then 
> we can just as easily, using the same mechanism, check an assertion by 
> A that A asserts a third-person graph. Asserting doesn't have to have 
> an implicit 'this graph' in it in order to be checkable.

True. And my recent counter examples to Chris' reflect this.

The point was that you *have* to have at some point a first-person
assertion or else your trust model is not grounded and is just
floating in space with nothing but guesses and uncertainty at
its periphery.

>
>> Having to rely on other (potentially infinite number of) other graphs
>> to determine the assertiveness of one particular graph seems to
>> introduce an horrifically inefficient and burdensome bootstrapping
>> mechanism.
>
> Nobody is proposing that. The only way to check whether any graph is 
> asserted is to confirm who said it. You, the reader who is trying to 
> figure out who is asserting what,  have to be able to trace a triple 
> of the form "A asserts..." back to a graph authored by A (whatever 
> exactly "authored" means). I think we agree on this, by the way. The 
> only thing that we disagree on it whether or not those three dots have 
> to refer to the graph that contains that triple, and I see no good 
> reason for that restriction. It doesn't provide a graph-ish way to 
> check true assertion unless you can check graph authorship, in any 
> case.

As I've said elsewhere, ultimately one has to rely on some special
extra-RDF mechanism to terminate such inter-graph assertion chains.

You say all you have to be able to do is confirm that "A asserts ..."
but if the only machinery you have are RDF statements and the RDF
MT, you can *never* get there, because either (a) you have some
graph which has a statement saying it is asserted, but you can't
interpret that statement as true because you don't know (yet) that
the graph is asserted (i.e. a catch 22 situation) or (b) you have
one graph that has a statement that asserts that another graph is
asserted, but for the first graph, you have the catch 22 situation;
thus, having only named graphs, RDF statements, and the RDF MT,
you can *never* terminate your assertion chains.

At some point, you *have* to add in some essential bootstrapping
mechanism by which for a given graph, you can determine if that
graph is or is not asserted, since the RDF MT won't tell you.

You can either do that in a fully non-RDF way by using e.g.
syntactic machinery as part of the document interchange level.

Or you can (as I propose) use some special vocabulary and a
special test on statements in the graph using that vocabulary.

The two approches are actually quite similar. Both are really
"syntactic" tests. But one is based on the serialization syntax
and the other based on the graph syntax.

>
>> Restricting the machinery to each specific graph alone, either by
>> some fancy semantics or by pushing it out to the syntactic layer
>> (i.e. XML attribute values, etc.) seems the only reasonable approach
>> to me.
>>
>> The moment you have to start chasing chains of bootstrapping 
>> statements
>> in graph after graph to get a final determination regarding one
>> particular graph, is the moment you loose more folks now unsure about
>> deploying RDF.
>>
>> KISS please!
>
> I agree about KISS, but inserting self-referential constructions which 
> break (put severe strain on) the semantics and have to be handled by 
> an OWL-incompatible new layer of processing doesn't seem KISSish to 
> me.

I've have yet to see an example that shows that the "bootstrapping 
interpretation"
I propose for authenticating graphs is OWL-incompatible. In fact, I 
assert that
it is not. Every statement relevant to that bootstrapping 
interpretation/test
remains true and valid per both the RDF and OWL MTs.

It appears that you see dragons that don't exist and which I've never 
proposed
to exist.

If you like, please take any of the examples I've provided, and show 
how OWL
breaks.

> And in any case, if you introduce the rdfx:assertedBy property, how 
> are you going to stop people from using it in third-party ways? The 
> RDF specs essentially say that they can if they want to.

We don't have to. We can remove the cardinality constraint and allow
it to be used to express third party assertions. No problem. That has
no impact on the bootstrapping machinery, since such third party
statements will be in some other graph(s).

>
>>>>> 2. Publishing an unasserted graph on the Web wouldn't be a speech 
>>>>> act.
>>>>
>>>> It wouldn't necessarily be.
>>>>
>>>> If no explicit statement is made within the graph that the graph
>>>> is asserted, then it is not (necessarily) a speech act.
>>>>
>>>> I.e. it is the act of using particular machinery to explicitly
>>>> indicate that a graph is asserted that constitutes the speech
>>>> act.
>>>>
>>>> If it were explicitly stated in the graph that the graph was
>>>> *not* asserted, then it would simply be a document (quoted
>>>> statements).
>>>>
>>>> Some agent may, for whatever reason, still wish to treat those
>>>> statements as asserted, but that is then contrary to the explicitly
>>>> expressed intended purpose of those statements by the publisher
>>>> of the graph (e.g. taking it out of context, etc.).
>>>>
>>>> Similar to my saying: "The following is false: 'sugar always tastes
>>>> bitter'" and you treating that as if I has actually asserted
>>>> that "sugar always tastes bitter".
>>>
>>> I bet what will happen is this (whatever we say about it :-) . There 
>>> will be a way to explicitly non-assert, like quoting; and there will 
>>> be a way to be absolutely and iron-clad clear about asserting and 
>>> who is doing the asserting, checkable by secure signatures.  And 
>>> then there will be cheap-and-cheerful publication which is not 
>>> marked in any way in particular but is widely accepted for many 
>>> useful purposes as being asserted as a kind of happy default that 
>>> enables smart search engines, etc., to get their stuff done when no 
>>> serious $$ depends on the result. What we need to do is to suggest 
>>> how to do the former without being so tight-assed that we try to 
>>> legislate the latter out of existence, because that would be like 
>>> ordering the tide to stop rising. I think that a way forward is to 
>>> leave the status quo to do the cheap-and-cheerful, but adding a way 
>>> to be more secure in the former style when required. To do that I 
>>> think we need a way to provide an external-to-RDF way to ultimately 
>>> warrant the checkable assertion forms, since that way of proceeding 
>>> will almost certainly require tort-law applicable ways of tracing 
>>> the legal agents who are making the iron-clad assertions (promises, 
>>> contracts, etc.). And once we have that, there is no harm in 
>>> allowing 'external' assertion of content, and quite a lot of utility 
>>> in allowing it.
>>
>>
>> Um. Er. That is *precisely* what I have been proposing...
>
> Well, OK, good. But if there is an external-to-RDF way of doing the 
> warranting of assertion, then I don't see the need for a syntactic 
> criterion like self-reference in a graph to signal the assertion. You 
> still need to check the warrant: anyone can forge a self-reference; so 
> the graphical 'sign' of assertion isn't any use anyway.

I look at it like this: we have some key information that needs to be
provided about the graph -- whether it's asserted, its authority, a
verifiable signature, etc.

Most of that information is useful after the verification process to
RDF/OWL applications, particularly those maintaining graph membership
information of statements, so if we have to put that information
somewhere, let's put it in the graph itself -- especially since we
can then use legacy RDF/OWL tools.

Interpretation/testing of this special information is going to require
some extra semantics/operations not defined by RDF or OWL, no matter
where that information is stored.

Defining the interpretation/testing of that special information,
expressed as statements in the graph, need not intersect nor impact
the RDF or OWL MTs.

Warranting the graph, and the assertions contained therein, will be
an extra-RDF operation based on that special information, and doesn't
matter where that information is expressed, so let it be in the
graph itself.

>
>> If no explicit statements are made about assertion, for legacy reasons
>> or out of sheer laziness, presume it is asserted (but keep at arms 
>> length,
>> per se, insofar as liability/accountability is concerned).
>>
>> If there is an explicit statement that the graph is asserted, it's 
>> asserted.
>
> Well, we have to say, if it is *entailed* that it is asserted, then 
> its asserted.  But that's OK, and might be useful ("I hereby assert 
> all the ontologies in this class of ontologies:...." )
>
> The only place we differ is that you want this to be restricted to 
> assertions of a graph within that graph.

No. I propose no such restriction. Third party assertions are OK. As 
long as
we have the machinery to ground our model in first party 
assertions/authentication.

So, presumably then, we don't differ on any key point.

> I see no good reason for this restriction, and if we abandon it then 
> (1) we don't need to torture the semantics (2) a number of (seems to 
> me) useful techniques are enabled, such as securely asserting content 
> currently expressed in vanilla RDF, and allowing secure assertion 
> behind firewalls to refer to content in the public domain (3) nothing 
> in the named-graph model breaks.
>
>> If there is an explicit statement that the graph is not asserted, 
>> it's not asserted.
>
> Again, say entails; but yes.
>
>>
>> Seems like that is a pretty cheap-and-cheerful approach, yet one that
>> provides the key bits for authentication/trust/accountabililty for
>> those who care...
>
> Right. (We agree?? What happened????)

We succeeded in communicating through a misunderstanding where my push 
for
a bootstrapping mechanism for first party assertion/authentication was
understood as an opposition against third party assertion; whereas, I'm
fine with both; but felt the former was being missed entirely in the
examples reflecting third party assertion.

;-)

>
>>
>>>
>>>>
>>>> --
>>>>
>>>> In essence, a graph explicitly marked as unasserted is a quoting
>>>> mechanism.
>>>
>>> Right. We could maybe do that, as Mark suggests, just by using the 
>>> XML media type (though that seems a bit kludgey to me.)
>>
>> Yuck. And what about all the other serializations/expressions of RDF 
>> graphs? So we'd
>> end up with double the number of MIME types... no thanks.
>
> I agree. Just thought Id mention it to see if it flew.

Crash and burn ;-)

>
>>>> A graph with no explicit marking for assertion is simply an
>>>> ambiguous "utterance" which may or may not reflect the asserted
>>>> beliefs of the source.
>>>
>>> but which is good enough for many purposes, right? Particularly if 
>>> 99% of the RDF on the planet is deployed this way with the general 
>>> understanding that it is intended to be read and used.
>>
>> Right. See above. Again, I'm talking about what is needed as a 
>> foundation
>> for authentication/trust/accountability, not whether any arbitrary set
>> of statements might or might not be useful for some application.
>>
>>>
>>>>
>>>> A graph with explicit marking for assertion equates to an
>>>> utterance within a context of "I assert/swear/affirm/believe/...
>>>> that ...", which can serve as the basis for accountability
>>>> (and of course, trust).
>>>
>>> Right, but only if the thing doing the swearing and affirming is 
>>> something a bit more solid than a graph. It has to be the kind of 
>>> thing that a court can put a restraining order on, or you or I can 
>>> sue.
>>
>> It *is*. That thing doing the assertion is the entity associated with 
>> a
>> signature or identified by a URI as the explicit authority, etc. 
>> I.e., if
>> I am asserting a graph, that signature and URI will ascribe those 
>> statements
>> to me.
>
> Is that really true? That signatures can identify legal agents? I'd 
> like to get an opinion on that from someone in law enforcement before 
> committing to it. But my proposal is that we punt on this issue.

Well, we could say that digital signatures are in may ways more reliable
than real signatures, though admittedly, both can be contested.

We can punt on the issue insofar as we need not mandate what kind of 
signature
is used nor how reliable the authentication machinery used has to be -- 
only
that there are certain bits that one has to have to achieve a minimal 
solution,
and some realizations of some of those bits may be better in some 
implementations
than others -- but the bits will be there nonetheless in all 
implementations of
the model, and their function/significance/relationships will be the 
same in
all cases.

Cheers,

Patrick

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Monday, 15 March 2004 03:57:09 UTC