Re: Named graphs etc from Pat Hayes on 2004-03-11 (www-archive@w3.org from March 2004)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 11 Mar 2004 14:00:12 -0600
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, <chris@bizer.de>
Message-Id: <p06001f02bc76639d0728@[10.0.100.76]>
>On Mar 10, 2004, at 18:22, ext Pat Hayes wrote:
>
>>>Either there should be a bootstrapping vocabulary (easier to introduce
>>>since it's disjunct from the existing specs) or some attribute on the
>>><rdf:RDF> element to explicitly define assertion.
>>
>>Or something. I really don't see how this can possibly be done just 
>>by using vocabulary, however. (What gets that vocabulary asserted?? 
>>If we stipulate this semantically by saying that the vocabulary is 
>>self-asserting, then we are going outside the RDF spec in any case.)
>
>Of course.
>
>But adding an XML attribute to indicate assertion is *also* going
>outside the RDF spec.

Well, it does something else, but it doesn't actually break the 
current spec (does it??)

>Anything we do here will be outside the RDF spec. The goal, IMO, is
>to do it as little outside the spec as possible, and in a way that
>allows us to employ as much of the RDF machinery as possible.

We agree on the principle, but not I think on what counts as 'as 
little as possible'.

>So, having a bootstrapping vocabulary which has a special interpretation
>(a pre-interpretation not defined by the RDF specs) but which, after
>such a pre-interpretation phase, remains fully valid in terms of the
>RDF specs seems like a big win.
>
>Think of it as a special validation test for a graph which tests
>assertion and authentication.
>
>This is similar to how OWL provides for various forms of validation
>testing over and above what is defined by the RDF or RDFS specifications.
>Neither RDF nor RDFS have a clue about cardinality, yet an agent
>that understands the OWL vocabulary terms for cardinality constraints
>can test a graph to see if it is valid per those constraints.

But all RDF reasoning is still valid in OWL. OWL just adds more 
information. If I follow your idea (I'm not sure I do) then it has a 
different quality.

>Agents that are savvy about the bootstrapping vocabulary and its
>special semantics can apply their tests prior to accepting a graph.
>If the graph is determined to be asserted and authentic per the
>bootstrapping statements within that particular graph, then they
>continue with full RDF/OWL interpretation of the graph.
>
>Other non-savvy agents simply don't know what those vocabulary
>terms mean, and their presence in a graph interpreted by such
>an agent is thus innocuous.

Suppose that they, using valid RDF/OWL reasoning, are able to draw an 
innocuous conclusion that would have affected the pre-processing if 
it had been around when that pre-processing was done. What happens?
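
A minimal sketch of the ordering problem raised here, in Python. 
Everything named in it is an assumption for illustration: the property 
boot:assertedBy, the precheck function, and the choice of the RDFS 
subproperty rule (rdfs7) as the "valid reasoning" step. It is not the 
bootstrapping vocabulary actually proposed in this thread.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical bootstrapping property; not defined in any spec.
ASSERTED_BY = "boot:assertedBy"
SUBPROP = "rdfs:subPropertyOf"


def precheck(triples: Set[Triple], graph_name: str) -> bool:
    """Purely syntactic pre-processing test: does the graph say of itself
    that it is asserted by someone?"""
    return any(s == graph_name and p == ASSERTED_BY for (s, p, o) in triples)


def rdfs7_closure(triples: Set[Triple]) -> Set[Triple]:
    """Apply rule rdfs7 to a fixpoint:
    (p subPropertyOf q) and (s p o) entail (s q o)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subprops = {(s, o) for (s, p, o) in closed if p == SUBPROP}
        for (s, p, o) in list(closed):
            for (sub, sup) in subprops:
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed


# The raw graph never uses boot:assertedBy on itself directly.
raw = {
    ("ex:g", "ex:claimedBy", "ex:alice"),
    ("ex:claimedBy", SUBPROP, ASSERTED_BY),
}

before = precheck(raw, "ex:g")                 # check run before any reasoning
after = precheck(rdfs7_closure(raw), "ex:g")   # same check after valid inference
```

Under these assumptions the pre-processing test gives one answer on 
the raw triples and the opposite answer once ordinary RDFS inference 
has run, which is the situation the question above asks about.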

>Is this not exactly how RDF is meant to be extensible?
>
>>>
>>>The tricky part (or maybe it's easy) and what I'd like Pat to comment
>>>on is how hard (or even possible) it is to constrain the interpretation
>>>of a particular property to the graph in which the statement occurs.
>>
>>That can be done,
>
>That's encouraging to hear ;-)
>
>>though it requires tweaking the MT in unconventional ways. I did 
>>this for the OWL imports vocabulary but the WG decided that it was 
>>too complicated and 'weird' to put in the spec, so they gave it an 
>>'operational'  spec outside the MT.
>>
>
>OK. Perhaps the way to do this would be to define a similar 'operational'
>spec, detailing the bootstrapping interpretation alone, and not 
>having to touch
>the RDF or OWL MTs at all.
>
>???
>
>>But I really don't think we should do this. Allowing a graph to 
>>assert another graph is potentially very useful and natural.
>
>I just have a hard time seeing this as "a good thing". I can
>think of numerous use cases where I would want to be sure that
>*no* other graph can affect the origin/nature/authenticity of
>my graphs insofar as what I authoritatively state to be so for
>my graphs.

I agree, and I'm not suggesting that. But you can't stop others from 
referring to the content in your graphs, nor should you want to, I 
suggest (it's often very handy). If you give your graph a name, then 
their use of that name (as opposed to YOUR name) only refers to that 
content.

>
>I think it's a lot safer/simpler to have graphs be "self qualifying".

Well, it's certainly not simpler, and I don't think it's any safer, see 
reply to earlier message.

>
>>Don't think of 'being asserted' as a property of graphs: think of 
>>it as a relationship between something and a graph. If I assert 
>>your graph, I haven't done anything to you or your graph except 
>>borrow your RDF. I could have gotten the same effect by importing 
>>your graph into a blank one of mine and then asserting it.
>
>True. And that's the way RDF works today. But if we are to move beyond that,
>to authentication, trust, accountability, etc. we have to have something
>a little more robust and which preserves the integrity of each graph, until
>such time as the graph loses its identity (by merge/modification/etc.).

But we won't get anything robust enough to hang trust on by tweaking 
with references WITHIN graphs. It's too easy to tweak content, and too 
hard to stop it ramifying in unexpected ways through inference 
engines.  We will need to anchor trust in something that identifies 
the agent outside the RDF content itself. I thought a simple trick 
would do it, but I now think that we should just punt on this, and 
say that it will get done somehow. All we need to do is to provide 
the place to plug it into the semantics.

My suggested plug-in is, if a graph claims that ex:agent asserts foo, 
then provided that this graph can be securely traced to ex:agent, 
then indeed it is the case that foo is asserted by ex:agent. How to 
do the secure tracing, we leave for others to specify. Maybe a future 
TAG group will invent a way to do that, or something. At least we 
have focussed attention on what needs to be done (securely trace a 
graph to its web-agent 'owner').
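
A minimal sketch of this plug-in rule, in Python. The names here are 
assumptions for illustration only: the property ex:asserts, the 
NamedGraph type, and the traced_to callback, which stands in for the 
secure-tracing mechanism that is deliberately left unspecified above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical property name; the thread does not fix a vocabulary.
ASSERTS = "ex:asserts"


@dataclass(frozen=True)
class NamedGraph:
    name: str
    triples: FrozenSet[Triple]


def asserted_by(
    graph: NamedGraph,
    agent: str,
    traced_to: Callable[[NamedGraph, str], bool],
) -> Set[str]:
    """Return what `agent` asserts according to `graph`.

    The claims in the graph only count if the graph itself can be securely
    traced to the agent; how that is done is the pluggable part.
    """
    if not traced_to(graph, agent):
        return set()
    return {o for (s, p, o) in graph.triples if s == agent and p == ASSERTS}


g = NamedGraph("ex:g1", frozenset({("ex:agent", ASSERTS, "ex:foo")}))

# Stand-in tracers: a real one would check a signature or similar evidence.
trusted = asserted_by(g, "ex:agent", lambda graph, agent: True)
untrusted = asserted_by(g, "ex:agent", lambda graph, agent: False)
```

The point of the design is that the semantics only needs the slot 
(traced_to); nothing inside the RDF content decides whether the slot 
is filled.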

The only real difference between us at this point is that you want 
the self-referring graph to be the anchor, and I want the reference 
to the agent by the agent to be the anchor. Yours is easier to specify 
(now), but I think that this is misleading as it's not really 
specified because 'publisher' of a graph isn't fully specified; and 
yours is more 'syntactic' since graph self-reference can be 
recognized easily. But I don't trust it; it seems ad-hoc; and mine 
doesn't involve any RDF syntactic or semantic tweaking. Also I don't 
like the way that yours depends so much on graph identity, which we 
haven't really yet thought through well enough to rely on to this 
extent, IMO.

>>>It's a little more complicated than an XML attribute, but has
>>>that great advantage of being immediately compatible with all
>>>RDF serializations
>>
>>It might be syntactically compatible, but it plays hell with the 
>>semantics (though we could fudge round that in an acceptable way, 
>>to be honest).
>
>I had hoped so ;-)

Yeh, well, I'm not so sure about that any more.

>
>>And it still requires that the published documents be edited:
>
>Not necessarily. Legacy RDF, and those not using the new machinery,
>can still be presumed to be asserted by agents -- with some agents
>simply being more picky/demanding and only accepting graphs using
>the new machinery.
>
>But true, if someone has a particular graph that they want to
>qualify with the new machinery, then they need to add the bootstrapping
>statements to the graph.
>
>But hey, that's not really such a big deal, since many/most graphs are
>"living documents" anyway.
>
>>whereas allowing A to assert B means that we can get existing 
>>graphs asserted without changing them at all.
>
>Yes, but IMO at far too high a cost, due to the potential confusion
>that can arise and extra work to resolve it when trying to chase down
>authentication/trust trails about which graph asserts which graph
>which then asserts another graph, etc.

We can short this out by saying that assertion isn't transitive, 
which is intuitively correct in any case. Real trust engines will 
have to be able to root their assertion-confidence in some way, so as 
to be able to trace an ultimate agent.
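
A minimal sketch of non-transitive assertion, in Python. The property 
name ex:asserts, the dictionary used as a graph store, and the 
function names are all assumptions for illustration; the anchored 
graph is taken as already traced to the agent by some outside means.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical property name; the thread does not fix a vocabulary.
ASSERTS = "ex:asserts"


def asserted_graphs(agent: str, anchored: Set[Triple]) -> Set[str]:
    """Names of graphs the agent asserts, read only from the agent's own
    securely anchored graph."""
    return {o for (s, p, o) in anchored if s == agent and p == ASSERTS}


def asserted_triples(
    agent: str, anchored: Set[Triple], store: Dict[str, Set[Triple]]
) -> Set[Triple]:
    """Triples the agent is committed to: exactly one hop from the anchor.

    Assertion statements found inside the asserted graphs are carried along
    as content but are not followed, so assertion is not transitive.
    """
    result: Set[Triple] = set()
    for name in asserted_graphs(agent, anchored):
        result |= store.get(name, set())
    return result


store = {
    "ex:B": {("ex:a", "ex:p", "ex:b"), ("ex:other", ASSERTS, "ex:C")},
    "ex:C": {("ex:x", "ex:y", "ex:z")},
}
anchored = {("ex:agent", ASSERTS, "ex:B")}

committed = asserted_triples("ex:agent", anchored, store)
```

Here ex:agent is committed to the content of ex:B, including B's 
statement that someone else asserts ex:C, but not to the content of 
ex:C itself.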

>Agents will always have the liberty to disregard, or guess about,
>the intentions of the author of a graph, but I think it will be
>critical to maintain a clear distinction between the intentions
>of the author and the opinions or guesses of third parties.

Sure, we do this very simply, by having every description of agent 
actions referring to the agent.

>
>E.g. if I explicitly state that my graph is not asserted, and someone
>else says it is, and that results in my being sued for slander, I
>can simply say, sorry, I never asserted that. And the evidence
>regarding my original intent for the graph would be explicit and
>unambiguous.

Right, exactly: but how is that possible when the only connection 
between the graph and you is one of denotation? Anyone can refer to 
anything.

>
>Also, since RDF does not provide for the preservation of graph
>membership, such an approach would *require* that all RDF stores
>be updated to preserve such membership,

But that is completely unfeasible. We can't require this for the 
entire SWeb: it will never fly.

>otherwise the statements
>in one graph about another graph become useless. How can you
>know which statements are in graph A and thus asserted because
>of a statement in graph B when you don't anymore know which
>statements actually were in graph A or graph B?!

This problem arises because you have conflated 'asserted by' with 'in 
a (special kind of) graph'. Keep these ideas separate, and graphs are 
both more conventional and more useful. The point of naming is to be 
able to CONVEY assertion from one graph (securely anchored and owned 
by an agent) to another (which only needs to be securely named, and 
URIs will probably suffice for this in all the cases except those 
where you need a lawyer present in any case.)

>With the bootstrapping vocabulary, having self qualifying graphs,
>agents can test assertion and authenticity prior to syndication
>into a traditional RDF triples store, and not worry about loss
>of graph distinction in later processing/reasoning.

But it requires the secure layer to use entirely different rules from 
the rest of the Web; whereas it seems to me that the great attraction 
of a naming scheme is that it allows the security of the trust layer 
to refer to the big scruffy Web, without sacrificing security. So it 
can provide a kind of trust-superstructure that gives an untrusted 
web a kind of checkable backbone.

>  It is thus,
>a more backwards compatible approach to bootstrapping the
>authentication and trust layers which can provide a lot of utility
>even to agents that don't employ knowledge stores that preserve
>graph membership.
>
>Patrick
>
>--
>
>Patrick Stickler
>Nokia, Finland
>patrick.stickler@nokia.com


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 11 March 2004 15:00:25 UTC