Re: Named graphs etc from Pat Hayes on 2004-03-16 (www-archive@w3.org from March 2004)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 15 Mar 2004 20:06:35 -0600
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: "ext Chris Bizer" <chris@bizer.de>, <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>
Message-Id: <p06001f05bc7c0a8fc25b@[10.0.100.76]>
>On Mar 12, 2004, at 18:59, ext Pat Hayes wrote:
>
>>
>>>On Mar 11, 2004, at 19:39, ext Pat Hayes wrote:
>>>
>>>>
>>>>>On Mar 10, 2004, at 15:09, ext Chris Bizer wrote:
>>>>
>>>><snip>
>>>>
>>>>>
>>>>>>>
>>>>>>>Hmmm....  couldn't one view the insertion of graph qualification
>>>>>>>statements specifying assertion and authentication as being
>>>>>>>equivalent to a "speech act", the graph being the utterance?
>>>>>>>
>>>>>>
>>>>>>Also hmmm ... and I think we should forward this question to Pat.
>>>>
>>>>See earlier message on 'web acts'. It does seem odd to me to say 
>>>>that a graph can perform an act such as asserting. Suppose a 
>>>>graph slanders me: can I sue the graph for damages? We have to 
>>>>get genuine agents into the picture somehow.
>>>
>>>No, no, no, Pat.
>>>
>>>The graph is not being posited as a sentient entity.
>>>
>>>Rather, the owner/creator/publisher of the graph uses special
>>>vocabulary within statements in the graph to assert/sign the
>>>graph -- in a way that such qualifications can be authenticated
>>>to a reasonable degree.
>>
>>I fail to see how the use of vocabulary IN a graph can POSSIBLY 
>>constitute a signature or warrant. Anyone can write anything into a 
>>graph.
>
>Yes and no. If the signature includes a checksum of some sort by which
>the contents of the graph can be (to some degree) verified, then it
>becomes harder to create fraudulent graphs -- and those agents/publishers
>which have much to lose from fraud (e.g. banking services) will invest
>more time/effort in checksumming than others.

We are looking at different ends of the arrow. I'm not worried about 
making sure that the reference to the asserted graph is OK. I agree 
that can be checked in various ways. I'm talking about how we check 
the reference to the agent who is supposed to be asserting the graph.
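
A minimal sketch of the two ends of the arrow, in Python. It is an 
illustration under assumptions, not anything proposed in this thread: 
the triple serialization is a naive sort-and-join (blank nodes are 
ignored), the keys are hypothetical, and HMAC with a shared key is 
only a stand-in for a real signature scheme.

```python
import hashlib
import hmac

def graph_digest(triples):
    """Digest of a graph's content: hash the sorted, serialized triples."""
    lines = sorted(" ".join(t) + " ." for t in triples)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

def content_ok(triples, claimed_digest):
    """One end of the arrow: is this the graph the statement refers to?"""
    return hmac.compare_digest(graph_digest(triples), claimed_digest)

def agent_ok(message, tag, agent_key):
    """Other end: was this produced by the holder of agent_key?
    HMAC is a stand-in here; a real system would use public-key signatures."""
    expected = hmac.new(agent_key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

graph = [("ex:graphname", "rdfx:assertedBy", "ex:thing")]
digest = graph_digest(graph)

agent_key = b"real-agent-secret"   # hypothetical key of the claimed agent
forger_key = b"forger-secret"      # hypothetical key of someone else
good_tag = hmac.new(agent_key, digest.encode(), hashlib.sha256).hexdigest()
forged_tag = hmac.new(forger_key, digest.encode(), hashlib.sha256).hexdigest()

print(content_ok(graph, digest))                          # content check
print(agent_ok(digest.encode(), good_tag, agent_key))     # real agent
print(agent_ok(digest.encode(), forged_tag, agent_key))   # forger
```

The forger's graph passes the content check, since the digest really 
does match the triples; only the agent check separates the two cases.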

>
>So, a receiving agent can validate/verify a graph to different degrees.
>
>It may simply take the statement about the authority at face value
>and believe it.

That sounds like a VERY poor idea. Think of the mindset of spammers. 
Suppose one could generate, that easily and that rapidly, things that 
looked just like purchase orders to be processed by software, and ask 
yourself how long before there were so many of them that you wouldn't 
be able to find the real purchase orders in the world-wide pile of 
rubbish.

>
>It may take the signature and verify it using some PKI machinery.
>
>It may submit the entire graph to the specified authority for
>verification, which can be checksum based, per a checksum in
>the signature, or even compared against some existing knowledge
>base, etc.
>
>Different agents will choose/need to use different degrees of
>validation. But the qualification machinery (the vocabulary) is
>consistent across all cases.
>
>>  I know your idea is that some kind of correlation between the 
>>'owner' of the graph and a reference in the graph to that owner is 
>>what does the trick: and I agree.
>>
>>Jeremy suggests we allow
>>
>>ex:graphname rdfx:assertedBy ex:thing .
>>
>>I want us to essentially punt on what sort of thing exactly 
>>ex:thing is (it's a 'web agent') and on exactly how to tell if a 
>>triple like this is being published by the person it says it is 
>>being published by. I think this is going to be a nasty thing to 
>>get sorted out and might take a long time and we should let others 
>>do it.
>
>Well, I "punt" by simply defining an RDF class trix:Authority
>which vaguely defines a class of "things" which may assert
>graphs.
>
>To what extent members of that class intersect with members of
>other classes can remain an open question -- and for the most
>part, an irrelevant one.
>
>One could think of a trix:Authority as that entity that has
>direct liability for the claims being made. If that entity is
>an agent (human, machine, whatever) acting on behalf of another
>entity, then there are legal mechanisms for transferring
>liability along to the ultimate culprit (i.e. the human manager
>of a web agent, etc.).
>
>But I agree, the less said the better (for now).

OK, just as long as we then do NOT claim that a graph containing this 
vocabulary is thereby automagically authenticated as in any way 
authoritative, just because of the vocabulary it uses.

>
>>
>>We could of course offer the pragmatic advice that if ex:thing is a 
>>graph or a web resource, then the agent is understood to be the 
>>owner of that resource. But this is a work-around, seems to me, 
>>rather than a principled way to handle this issue.  (Can't you just 
>>hear the debates this will produce? People are still arguing about 
>>using homepage URIs to identify people.)
>>
>>I would like us to punt on that aspect of the whole matter, and 
>>just assume that there is some externally-provided way to determine 
>>if the agent doing the publishing is the one referred to in the 
>>graph, which is all that really matters. Having ex:thing be the URI 
>>of the graph or document is one way, but there might be others.
>
>I think it's useful to have at least some vague definition
>of what the object of assertion statements is -- rather than
>being completely silent, as I think we would get as much
>criticism for saying nothing as for saying too much.
>
>At least defining the class trix:Authority gives folks a term
>to use in their debates ;-)
>
>>
>>>By "self-asserting", I simply meant that the bits needed to
>>>determine if a graph is asserted are in the graph.
>>
>>But they aren't: you have to know the URI of the graph as well as 
>>the graph itself, in any case.
>
>True. Point taken.
>
>But no other graph is necessary.
>
>In fact, the authoritative/official URI denoting the graph could be included
>in the encrypted signature -- as an additional means of validating the
>bootstrapping statements.
>
>>>
>>>
>>>Insofar as this latter question is concerned, I don't see one
>>>graph specifying the assertiveness of another graph as practical.
>>
>>Well, I disagree. It is practical because if first-party references 
>>can be made safe, then so can third-party ones; and it seems to me 
>>to be extremely useful as a tool for providing warrants and so on.
>
>Agreed. I misspoke. Sorry.
>
>What I meant was that I failed to see as practical a system
>in which *every* assertion was essentially third-party.

I wasn't intending to propose that. If a graph can assert (by virtue 
of being asserted by a signed agent and saying that it asserts by 
that agent) then it can assert itself or can assert something else, 
either way. The checking arises from the coincidence between the real 
agent and the claimed agency of the assertion. What is asserted can 
be anything.

>
>Without being able to terminate those assertion chains at graphs
>which have within themselves the terminal, bootstrapping statements
>such that the agent need not look to yet another graph to determine
>the assertiveness/authenticity/trustworthiness of that graph

But you can't get that assurance from the graph alone. We MUST have 
some way to check that the agent is real: otherwise I can publish 
graphs which assert that you assert things that you don't even know 
about. And that's where the termination happens, at the signed 
confirmation of the real agent coinciding with the claimed agent. 
That has nothing to do with the graph being first- or third-person 
relative to what is asserted

>, you
>simply go on forever and ever without ever properly grounding your
>trust model.
>
>That's what I meant.
>
>Yes, absolutely, third party assertions are useful, but not
>sufficient in themselves.

We agree.

>
>>
>>>>>
>>>>>Restraining the bootstrapping machinery to each graph prevents
>>>>>folks from speaking on behalf of others.
>>>>
>>>>You don't speak on behalf of others by using their words to make 
>>>>an assertion that they haven't made. If you SAY that they have 
>>>>made an assertion that they haven't in fact made, or if you 
>>>>pretend to be them, then you are lying: and we need to be able to 
>>>>check up on liars and detect the lies quickly and reliably.
>>>
>>>How? If the publisher of a graph says nothing about whether the graph
>>>is asserted or not, how can anyone disagree with me if I say it is?
>>
>>People can say whatever they like. Why should anyone believe them, 
>>is the question. Ultimately, the only firm authority for a claim 
>>that A asserted something is an actual assertion by A. If we can 
>>check an assertion by A to the effect that A asserts a first-person 
>>graph, then we can just as easily, using the same mechanism, check 
>>an assertion by A that A asserts a third-person graph. Asserting 
>>doesn't have to have an implicit 'this graph' in it in order to be 
>>checkable.
>
>True. And my recent counter examples to Chris' reflect this.
>
>The point was that you *have* to have at some point a first-person
>assertion or else your trust model is not grounded and is just
>floating in space with nothing but guesses and uncertainty at
>its periphery.

The grounding comes from a connection between the claimed agent and 
the actual agent of the graph, not from whether the graph asserts 
itself or some other graph. If G is signed by Bill and says that Bill 
asserts H, then whether H is the same as G is irrelevant: Bill 
asserts H. If H = G then that is fine, and if G=/= H that is fine 
also.
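
The point can be put as a small Python sketch. This is an assumption- 
laden illustration: `graphs_asserted` is a hypothetical helper, and 
`verified_signer` is taken to come from some external signature check 
on the graph that contains the triples. Nothing in the code verifies 
it.

```python
ASSERTED_BY = "rdfx:assertedBy"

def graphs_asserted(triples, verified_signer):
    """Names of the graphs that verified_signer asserts via these triples.

    Only statements whose claimed agent coincides with the externally
    verified signer count. Whether the named graph is the containing
    graph or another one plays no role.
    """
    return {s for (s, p, o) in triples
            if p == ASSERTED_BY and o == verified_signer}

# G, signed by Bill, says that Bill asserts G (first-person).
first_person = [("ex:G", ASSERTED_BY, "ex:Bill")]
# G, signed by Bill, says that Bill asserts H (third-person).
third_person = [("ex:H", ASSERTED_BY, "ex:Bill")]
# G, signed by Bill, claims that Patrick asserts H: agents do not coincide.
mismatch = [("ex:H", ASSERTED_BY, "ex:Patrick")]

print(graphs_asserted(first_person, "ex:Bill"))  # {'ex:G'}
print(graphs_asserted(third_person, "ex:Bill"))  # {'ex:H'}
print(graphs_asserted(mismatch, "ex:Bill"))      # set()
```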

>
>>
>>>Having to rely on a (potentially infinite) number of other graphs
>>>to determine the assertiveness of one particular graph seems to
>>>introduce a horrifically inefficient and burdensome bootstrapping
>>>mechanism.
>>
>>Nobody is proposing that. The only way to check whether any graph 
>>is asserted is to confirm who said it. You, the reader who is 
>>trying to figure out who is asserting what, have to be able to 
>>trace a triple of the form "A asserts..." back to a graph authored 
>>by A (whatever exactly "authored" means). I think we agree on this, 
>>by the way. The only thing that we disagree on is whether or not 
>>those three dots have to refer to the graph that contains that 
>>triple, and I see no good reason for that restriction. It doesn't 
>>provide a graph-ish way to check true assertion unless you can 
>>check graph authorship, in any case.
>
>As I've said elsewhere, ultimately one has to rely on some special
>extra-RDF mechanism to terminate such inter-graph assertion chains.
>
>You say all you have to be able to do is confirm that "A asserts ..."
>but if the only machinery you have are RDF statements and the RDF
>MT, you can *never* get there

Indeed. But we can extend the MT to give you a real place to 
terminate. I thought that was what you wanted me in on the project to 
do :-)

>, because either (a) you have some
>graph which has a statement saying it is asserted, but you can't
>interpret that statement as true because you don't know (yet) that
>the graph is asserted (i.e. a catch 22 situation) or (b) you have
>one graph that has a statement that asserts that another graph is
>asserted, but for the first graph, you have the catch 22 situation;
>thus, having only named graphs, RDF statements, and the RDF MT,
>you can *never* terminate your assertion chains.

Right, I know. But the same point applies even if the graphs refer to 
themselves. Self-reference doesn't stop the regress, it just puts it 
into a loop.
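
A rough Python sketch of why the loop does not help. The names are 
hypothetical: `asserted_in` maps each graph to the graph containing 
the statement that it is asserted, and `externally_verified` stands 
for whatever extra-RDF check confirms a graph's real author.

```python
def grounded(graph, asserted_in, externally_verified, seen=None):
    """True if the chain of 'is asserted in' links starting at graph
    reaches a graph whose author has been externally confirmed."""
    seen = set() if seen is None else seen
    if graph in externally_verified:
        return True            # real termination point
    if graph in seen:
        return False           # came back around: a loop, not a ground
    seen.add(graph)
    source = asserted_in.get(graph)
    if source is None:
        return False           # chain runs out with nothing confirmed
    return grounded(source, asserted_in, externally_verified, seen)

self_ref = {"ex:G": "ex:G"}                   # G says that G is asserted
chain = {"ex:A": "ex:B", "ex:B": "ex:C"}      # A vouched for in B, B in C

print(grounded("ex:G", self_ref, set()))      # False
print(grounded("ex:A", chain, set()))         # False
print(grounded("ex:A", chain, {"ex:C"}))      # True
print(grounded("ex:G", self_ref, {"ex:G"}))   # True
```

In this sketch the self-referential graph is grounded only once it 
appears in `externally_verified`, exactly like any other graph.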

>
>At some point, you *have* to add in some essential bootstrapping
>mechanism by which for a given graph, you can determine if that
>graph is or is not asserted, since the RDF MT won't tell you.

We can extend the MT so that it does tell you, though.

>You can either do that in a fully non-RDF way by using e.g.
>syntactic machinery as part of the document interchange level.
>
>Or you can (as I propose) use some special vocabulary and a
>special test on statements in the graph using that vocabulary.

Unless you extend the meaning (=MT) of those statements, you don't 
get the job done, as you have just explained. Making a syntactic loop 
doesn't create a semantic anchoring in the real world.

>
>The two approaches are actually quite similar. Both are really
>"syntactic" tests. But one is based on the serialization syntax
>and the other based on the graph syntax.

I was assuming that the serialization syntax is already grounded in 
something. But let's leave that debate alone, I've conceded the 
vocabulary point. Now I want to give this vocabulary a decently 
grounded MT.

>
>>
>>>Restricting the machinery to each specific graph alone, either by
>>>some fancy semantics or by pushing it out to the syntactic layer
>>>(i.e. XML attribute values, etc.) seems the only reasonable approach
>>>to me.
>>>
>>>The moment you have to start chasing chains of bootstrapping statements
>>>in graph after graph to get a final determination regarding one
>>>particular graph, is the moment you lose more folks now unsure about
>>>deploying RDF.
>>>
>>>KISS please!
>>
>>I agree about KISS, but inserting self-referential constructions 
>>which break (put severe strain on) the semantics and have to be 
>>handled by an OWL-incompatible new layer of processing doesn't seem 
>>KISSish to me.
>
>I have yet to see an example that shows that the "bootstrapping
>interpretation" I propose for authenticating graphs is OWL-incompatible.
>In fact, I assert that it is not. Every statement relevant to that
>bootstrapping interpretation/test remains true and valid per both the
>RDF and OWL MTs.
>
>It appears that you see dragons that don't exist and which I've never proposed
>to exist.
>
>If you like, please take any of the examples I've provided, and show how OWL
>breaks.

Well it was that layer of preprocessing stuff that seemed 
problematic, for the reasons I suggested. Suppose, to take a very 
simple example, you have OWL statements that a class C has 
cardinality one and that ex:thisURI and ex:thatURI are both in it and 
that ex:thisURI is the name of a graph, and that ex:thatURI is 
asserted. It follows that the graph is asserted, but you won't know 
that by inspecting the URIs unless you are very OWL-savvy. Now 
suppose that the graph doesn't have the cardinality info in it but you 
discover it a month later. Now make the reasoning arbitrarily more 
complicated.
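
The example can be sketched in Python as follows. This does no OWL 
reasoning: it assumes some reasoner has already derived, from the 
cardinality-one class, that ex:thisURI and ex:thatURI denote the same 
thing, and feeds that in as a pair. The helper names are hypothetical.

```python
def find(parent, x):
    """Representative of x's identity class (simple union-find lookup)."""
    while parent.get(x, x) != x:
        x = parent[x]
    return x

def union(parent, a, b):
    """Merge the identity classes of a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[ra] = rb

def is_asserted(name, asserted_names, same_pairs):
    """Is name, or anything known to be identical to it, marked asserted?"""
    parent = {}
    for a, b in same_pairs:
        union(parent, a, b)
    root = find(parent, name)
    return any(find(parent, n) == root for n in asserted_names)

asserted_names = {"ex:thatURI"}     # the statement "ex:thatURI is asserted"
graph_name = "ex:thisURI"           # the URI that actually names the graph

# Inspecting the URIs alone: the graph does not look asserted.
print(graph_name in asserted_names)                      # False
# Before the cardinality information is discovered.
print(is_asserted(graph_name, asserted_names, []))       # False
# After it is discovered and the identity has been derived.
derived = [("ex:thisURI", "ex:thatURI")]
print(is_asserted(graph_name, asserted_names, derived))  # True
```

The same graph changes status depending on which other facts the 
checker happens to hold at the time.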

>>And in any case, if you introduce the rdfx:assertedBy property, how 
>>are you going to stop people from using it in third-party ways? The 
>>RDF specs essentially say that they can if they want to.
>
>We don't have to. We can remove the cardinality constraint and allow
>it to be used to express third party assertions. No problem. That has
>no impact on the bootstrapping machinery, since such third party
>statements will be in some other graph(s).
>
>>
>>>>>>2. Publishing an unasserted graph on the Web wouldn't be a speech act.
>>>>>
>>>>>It wouldn't necessarily be.
>>>>>
>>>>>If no explicit statement is made within the graph that the graph
>>>>>is asserted, then it is not (necessarily) a speech act.
>>>>>
>>>>>I.e. it is the act of using particular machinery to explicitly
>>>>>indicate that a graph is asserted that constitutes the speech
>>>>>act.
>>>>>
>>>>>If it were explicitly stated in the graph that the graph was
>>>>>*not* asserted, then it would simply be a document (quoted
>>>>>statements).
>>>>>
>>>>>Some agent may, for whatever reason, still wish to treat those
>>>>>statements as asserted, but that is then contrary to the explicitly
>>>>>expressed intended purpose of those statements by the publisher
>>>>>of the graph (e.g. taking it out of context, etc.).
>>>>>
>>>>>Similar to my saying: "The following is false: 'sugar always tastes
>>>>>bitter'" and you treating that as if I had actually asserted
>>>>>that "sugar always tastes bitter".
>>>>
>>>>I bet what will happen is this (whatever we say about it :-) . 
>>>>There will be a way to explicitly non-assert, like quoting; and 
>>>>there will be a way to be absolutely and iron-clad clear about 
>>>>asserting and who is doing the asserting, checkable by secure 
>>>>signatures.  And then there will be cheap-and-cheerful 
>>>>publication which is not marked in any way in particular but is 
>>>>widely accepted for many useful purposes as being asserted as a 
>>>>kind of happy default that enables smart search engines, etc., to 
>>>>get their stuff done when no serious $$ depends on the result. 
>>>>What we need to do is to suggest how to do the former without 
>>>>being so tight-assed that we try to legislate the latter out of 
>>>>existence, because that would be like ordering the tide to stop 
>>>>rising. I think that a way forward is to leave the status quo to 
>>>>do the cheap-and-cheerful, but adding a way to be more secure in 
>>>>the former style when required. To do that I think we need a way 
>>>>to provide an external-to-RDF way to ultimately warrant the 
>>>>checkable assertion forms, since that way of proceeding will 
>>>>almost certainly require tort-law applicable ways of tracing the 
>>>>legal agents who are making the iron-clad assertions (promises, 
>>>>contracts, etc.). And once we have that, there is no harm in 
>>>>allowing 'external' assertion of content, and quite a lot of 
>>>>utility in allowing it.
>>>
>>>
>>>Um. Er. That is *precisely* what I have been proposing...
>>
>>Well, OK, good. But if there is an external-to-RDF way of doing the 
>>warranting of assertion, then I don't see the need for a syntactic 
>>criterion like self-reference in a graph to signal the assertion. 
>>You still need to check the warrant: anyone can forge a 
>>self-reference; so the graphical 'sign' of assertion isn't any use 
>>anyway.
>
>I look at it like this: we have some key information that needs to be
>provided about the graph -- whether it's asserted, its authority, a
>verifiable signature, etc.
>
>Most of that information is useful after the verification process to
>RDF/OWL applications, particularly those maintaining graph membership
>information of statements, so if we have to put that information
>somewhere, let's put it in the graph itself -- especially since we
>can then use legacy RDF/OWL tools.
>
>Interpretation/testing of this special information is going to require
>some extra semantics/operations not defined by RDF or OWL, no matter
>where that information is stored.
>
>Defining the interpretation/testing of that special information,
>expressed as statements in the graph, need not intersect nor impact
>the RDF or OWL MTs.

The issue is how to STOP it being involved with those MTs. I don't 
see how that would be possible.

But OK, let's stop quarreling and agree that we need to do an MT job 
on this stuff. I'll try to do one, OK?

Pat
-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 15 March 2004 21:06:44 UTC