Re: Proposed draft RDF Graph vocabulary from Pat Hayes on 2004-03-16 (www-archive@w3.org from March 2004)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 16 Mar 2004 17:53:24 -0600
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, <chris@bizer.de>
Message-Id: <p06001f10bc7d3b03c212@[10.0.100.76]>
>On Mar 16, 2004, at 03:31, ext Pat Hayes wrote:
>
>>>>>     rdfg:Authority
>>>>>        a rdfs:Class ;
>>>>>        rdfs:comment "An authority, or origin, of a graph." .
>>>>>
>>>>>We don't say more about what an rdfg:Authority actually is. We only
>>>>>(vaguely)
>>>>>define that such a class of entities exist which have a particular role
>>>>>with regards to trust.
>>>>
>>>>"For example, a person or company."
>>>
>>>Sounds reasonable.
>>
>>BUt we need to be able to refer to the authority associated with a 
>>particular graph, in the MT.
>
>See the rdfg:authority and rdfg:assertedBy properties, which associate
>a member of the class rdfg:Authority with a graph.

That is all just RDF formalism. The MT is supposed to be what says 
what this formal stuff means.

>
>>formally we can say that the interpretation is relative to an 
>>'authority mapping' which interprets URIs denoting authorities to 
>>actual authorities, or some such trick. But I need real authorities 
>>in the MT somewhere.
>
>The 'real' authorities would be whatever are denoted by the URIs. So given
>
>    some:graph rdfg:authority some:authority .
>
>then whatever is denoted by some:authority is the actual authority of the
>graph denoted by some:graph.
>
>Right?

Wrong. I'll try to explain this again :-). Nothing is simply denoted 
by any RDF graph, or by any URI because of an RDF graph. Denotation 
is always relative to an interpretation. In one interpretation, it 
denotes one thing, in another interpretation, another. As far as the 
MT is concerned, all URIs could denote ascii strings.

Now, we can specify in the semantic conditions that some URI must 
denote some actual things (agents, documents, whatever) in *any* 
legal interpretation (because we say so, just like we insist that 
typed literals denote datatype values): but to do this, we need some 
way to refer to those real things that is given outside of the RDF 
formalism itself (otherwise we are just being circular). So no amount 
of RDF vocabulary is going to provide this semantic anchoring, since 
until we get it anchored its just meaningless formalism.

>>
>
>>>>Isn't a signature at the end of the day a byte-sequence, so having a
>>>>property
>>>>rdfg:signatureBytes with domain rdfg:Signature and range
>>>>xsd:hexEncodedByteSequence (I need to look up the correct name).
>>>
>>>I was actually considering defining rdfg:Signature as an rdfs:Datatype,
>>>but decided that would just distract too many folks...
>>>
>>>But particular subtypes of rdfg:Signature could certainly be modeled
>>>as datatypes.
>>>
>>>It could, though, be modelled as e.g. a structured object denoted
>>>by a bnode with various characteristics defined by properties.
>>
>>It can still be in the datatype class, though. (I'm siding with 
>>Jeremy here, but I think I agree with Patrick:)
>
>Actually, I think we're all in agreement here as to the optimal nature
>of signature values in practice -- I simply chose to make the vocabulary
>agnostic for the sake of genericity, since it is not critical to the
>key points of our presentation.
>
>It doesn't matter really how the signature is expressed, only that
>it serves it function. The class rdfg:Signature only aims to reflect
>that function, not any particular form. Particular forms can be
>mandated for instances of particular subclasses of rdfg:Signature,
>including datatypes.
>
>>>>Save space and delete one -
>>>
>>>It seems to me that choosing only one leaves a gap in the
>>>expressivity -- like having ex:wife but not ex:husband, etc.
>>>It's mostly an issue of esthetics for me.
>>
>>For the record, OWL faced similar choices and decided to go with 
>>one in each case as a design decision. Users are free to define the 
>>inverses if they want them.
>>
>>>
>>>>I slightly prefer subGraphOf to subsumes - it is
>>>>clearer what one is talking about.
>>>
>>>I agree. But I didn't really like rdfg:superGraphOf ;-)
>>>
>>>If we go with only one such property, then we can use rdfg:subGraphOf.
>>>If we stick with both, then the rdfg:subsumes/subsumedBy seems more
>>>balanced.
>>>
>>>Pat, Chris, any preferences.
>>
>>OK, I'm lost. I have no idea what you guys are trying to do here, 
>>or what the point of all this vocabulary is supposed to be.
>>
>>Take subgraph. Whether or not graph A is a subgraph of graph B is 
>>just a fact: it either is or it isn't, and you can check the actual 
>>graphs to find out.  Now, what exactly is added by being able to 
>>say this in RDF? Suppose that A is in fact not a subgraph of B but 
>>you assert
>>
>>A rdfg:subGraphOf B .
>>
>>This is false.  Should it count as an inconsistency, or is it just 
>>plain false? Is it an inconsistency relative to a graph naming 
>>(like a D-inconsistency)? That would make a kind of sense...
>
>>
>>Similarly signatures. Suppose a graph lies about its real 
>>signature: it is signed by ex:joe, in fact, but it asserts that it 
>>is signed by ex:bill. Is this inconsistent? If so, the semantics 
>>has to somehow refer to the actual signature. (How?). If not, the 
>>semantics hasn't got to do anything, and all this vocabulary is 
>>just user vocabulary (so why are we worrying about it?).
>
>I don't think we're introducing any new problems here.

Not new problems, but we are asking our old tools to do new kinds of jobs.

>Claims/statements about graphs or authorities or signatures are just
>claims, and they may be false, or lead to disagreements or contradictions,
>but testing of those claims or resolution of disagreements or contradictions
>will always be an extra-RDF activity.

Well, but if the claims are about the RDF itself (as in reification) 
or about the process of publishing/asserting RDF, then one can 
reasonably expect that the RDF specs say something a little more 
about how the description relates to the reality described. I thought 
that was what we would be trying to do. IN cases like subGraph, we 
are talking about the syntax of RDF 'expressions' in RDF, so there is 
a special way to be 'wrong' which is syntax-checkable. I think we 
need to face up to our responsibilities to say more about this kind 
of a case.

>
>To test if A rdfg:subGraphOf B is true or not requires comparing both
>graphs -- which is (I believe) not defined by the RDF MT as an explicit
>test by which one determines consistency.

Not by the bare MT, no. But I wonder if in this case it OUGHT to be, 
in an appropriately extended RDFG MT. Think of how we extended the MT 
for datatypes, and imagine doing that kind of a number on the graphs. 
It is do-able, don't get me wrong.

>To test if ex:bill is or is not the valid authority for an asserted/published
>graph will involve PKI machinery or the like, which clearly is an extra-RDF
>process.
>
>To test if ex:sky ex:is ex:blue will require some extra-RDF determination.
>
>Again, I fail to see how we are introducing any new problems.

Distinction. Finding out whether something is true is one thing. 
Making the question, is it true? be well-defined, is another. Do we 
want the question, "does this graph assert anything?" to be even 
well-defined? Does the semantics explain what it means, in some 
sense? It can do that without providing a means to test it. Or it can 
even punt on this issue, and just say 'asserted' is a property and I 
have no idea what it means (which is what it says right now, in 
effect). I'd be disappointed if we did that.

>
>>OK, that is all just semantics. But what is the *utility* of being 
>>able to say this stuff in RDF? No amount of RDF vocabulary is going 
>>to actually get any asserting done or provide any signature 
>>security all by itself.  This seems to me to be like deciding how 
>>to talk about locks and keys, but not having any actual locks and 
>>keys.
>
>We are allowing folks to make claims in RDF about those locks and keys

Only if we specify that the vocabulary actually talks about real 
locks and keys, not just things that are called 'lock' and 'key' in 
the vocabulary (which might be pieces of cheese and small blue pills 
respectively.) The bare MT doesn't provide any attachment of a 
vocabulary to real things. We need to add that if we want it done.

>-- and
>those claims are then interpreted per the RDF MT, even if the testing of those
>claims require extra-RDF machinery (as outlined above).
>
>It is useful to be able to reason about those claims of authority, signatures,
>trust, etc. perhaps prior to employing any extra-RDF tests -- e.g. 
>to determine
>whether we even care to apply any tests, or which test might be the best
>to try first.
>
>It's putting as much of the trust/authentication machinery into RDF space

Unless we anchor the vocabulary, its not putting any machinery into 
RDF space. Consider the Herbrand lemma: if an RDF graph has any 
interpretation, then it has an interpretation over a universe of 
string literals. There aren't any locks or keys in a Herbrand 
universe.

>so that we can use as much of the RDF machinery as we can, which also
>reduces the amount of additional extra-RDF machinery that is needed to
>achieve a complete solution.
>
>Some of this is put into RDF because it's useful to do RDF-like things
>to it. Some of this is put into RDF simply to avoid having to put it
>elsewhere (e.g. in the XML syntax), even if one is likely never to
>do RDF-like things with it.
>
>>
>>>>>     rdfg:authority
>>>>>        a rdf:Property ;
>>>>>        rdfs:comment "The object is the authority, or origin, of the
>>>>>subject graph." ;
>>>>>        rdfs:domain rdfg:Graph ;
>>>>>        rdfs:range rdfg:Authority .
>>>>>
>>>>>This property simply associates an authority with a particular graph.
>>>>>It does
>>>>>not assert anything. This can be used to clearly indicate the origin of
>>>>>the
>>>>>graph without that origin/entity making any actual claims (e.g. the end
>>>>>result being akin to quoting, if not otherwise asserted by that
>>>>>authority
>>>>>elsewhere).
>>>>
>>>>Not yet convinced here.
>>>
>>>I would like for there to be a way to publish a graph, and even
>>>sign it with an explicit authority, yet not actually assert the
>>>graph
>>
>>I agree. Publication and (real, secure) assertion ought to be orthogonal.
>>
>>>  -- or perhaps leave it open to conditional assertion using
>>>a special vocabulary, per my recent "ErrorState" example.
>>>
>>>And if you don't have an authority specified, you have no idea
>>>with/for whom to authenticate the signature.
>>>
>>>Our three pieces to the puzzle are:
>>>
>>>1. Authority
>>>2. Signature
>>>3. Assertiveness
>>>
>>>The first is provided by rdfq:authority, either directly
>>>or via rdfg:assertedBy per the rdfs:subPropertyOf closure.
>>>
>>>The second is provided by rdfq:signature.
>>>
>>>The third is IMO optional, and provided by rdfq:assertedBy.
>>
>>None of these can be PROVIDED by any RDF. RDF can only DESCRIBE 
>>things. I agree about the distinctions, but if we are going to 
>>actually use RDF to DO this stuff, then we need to semantically 
>>anchor the special vocabulary in something real, like real 
>>signatures, real agents who can assert things, and so on.
>
>Let me rephrase. The statement (potential claim) about assertiveness
>is provided by the rdfq:assertedBy property.
>
>Whether the graph is or is not considered as actually asserted
>by some agent depends on that special "bootstrapping interpretation"
>which is, agreed, outside the scope of the RDF MT.

Well, OK. BUt then I really have NOTHING to do here (which may be a 
relief to all concerned :-). You don't need anything done to the MT: 
you are just inventing a user vocabulary and saying things in English 
about what its supposed to mean. Which is OK, let me hasten to add, 
but I thought we were supposed to be doing something more than that 
(Like, suggest a way to bootstrap trust and provenance.)

>
>>>>
>>>>>
>>>>>Statements using any of the above vocabulary are fully valid and
>>>>>compatible with both the RDF and OWL MTs irrespective of the
>>>>>special bootstrapping interpretation/test necessary for determination
>>>>>of (terminal) assertion and authentication.
>>>>>
>>>>
>>>>Yes bootstrapping is not very special.
>>>
>>>It's just a means to terminate the trust/assertion chains expressed
>>>in the various graphs since the RDF MT can't do that for us.
>>
>>I think that only a suitable MT can possibly do that for you. But 
>>it has to be able to talk about the real stuff.
>
>For any graph where
>
>    ?graph ( ?graph rdfg:assertedBy ?authority . )
>
>then the graph denoted by ?graph is to be taken as asserted,
>by the authority denoted by ?authority.
>
>That, in a nutshell, is the bootstrapping interpretation for
>assertiveness.

That isnt good enough, though, seems to me. All the semantic weight 
here is on the ?authority being actually the real authority, and we 
don't say what that means (never mind how to check it: we ought to at 
least explain what it is supposed to mean.) . On the other hand, the 
self-reference of the assertion plays no useful role: it doesnt help 
with authentication, and theres no semantic reason for it:

?graph1 (?graph2 assertedBy ?authority)

is just as easy to check, is more general, more useful and uses 
exactly the same semantics.

>
>It's extra-RDF in that, the RDF MT doesn't view the above triple
>as a triple if the graph is not asserted

I really don't see how to make this coherent. A graph is a set of 
triples, and there is that triple in it. Is it in the set or not? If 
not, what's it doing in the syntax?

>  -- but the triple is
>what is asserting the graph.

No, the ?authority is asserting the graph. It has to be the 
authority: if a graph gets itself asserted just by being 
self-referential then I can write an assertion spammer.

>  Hence the appropriateness of the
>term "bootstrapping".
>
>The special MT needs to capture how the graph "asserts itself
>by its bootstraps".

I don't want us to propose any way for any graph to assert itself 
without reference to some asserting agent. Its the agent that has to 
do the asserting, not the graph.

>
>This is distinct from third-party assertion where one has
>a graph that is deemed (by whatever means) to already be
>asserted, and which has a statement which asserts another
>graph.
>
>The bootstrapping assertion of one graph an then result
>in the chain-assertion of several other graphs which
>otherwise would never be taken as asserted due to the
>absence of any bootstrapping statements.
>
>Authentication is similar:
>
>For any graph where
>
>    ?graph ( ?graph rdfg:authority ?authority .
>             ?graph rdfg:signature ?signature . )
>
>(where rdfg:authority may be inferred from rdfg:assertedBy)
>
>then the graph denoted by ?graph is to be taken as authentic,
>if the specified signature denoted by ?signature is valid
>for the authority denoted by ?authority, per some extra-RDF
>validation test.
>
>Again, this requires an extra-RDF interpretation since the graph
>in question may not be otherwise taken to be asserted, per the RDF MT,
>but still needs to be interpreted in terms of the relevant triples in
>order to obtain and test the signature against the authority.
>
>In all cases, these special interpretations are bound to specific
>predicates, so hopefully that helps capture what's needed in the MT.
>
>>
>>>
>>>We may even wish to find some other term than 'bootstrapping'
>>>(perhaps 'anchoring') to get further away from the phase/process
>>>connotations that bootstrapping drags into the mix...
>>
>>Yes, I like anchoring. Or, another term used for this is 
>>'grounding'. The meanings of this stuff have to be grounded in the 
>>real world of agency, assertion and trust somehow.
>
>I'm fine with you choosing some other term(s), so long as the
>end result is the same.

I think it isn't. The point of anchoring or grounding is that you 
don't get to bootstrap. The authentication chain has to end in an 
authority, not in a self-referential loop.

Pat

>
>I use the term bootstrapping because I'm a software engineer and
>that is what appears to be happening -- particularly in the case
>of assertion -- but 'anchoring' or 'grounding' may very
>well be more appropriate.
>
>Patrick
>
>--
>
>Patrick Stickler
>Nokia, Finland
>patrick.stickler@nokia.com


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 16 March 2004 18:53:28 UTC