Re: Named graphs etc from Patrick Stickler on 2004-03-12 (www-archive@w3.org from March 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 12 Mar 2004 12:31:46 +0200
To: "ext Pat Hayes" <phayes@ihmc.us>
Cc: <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, <chris@bizer.de>
Message-Id: <76DB10F8-7410-11D8-BD2B-000A95EAFCEA@nokia.com>
On Mar 11, 2004, at 22:00, ext Pat Hayes wrote:

>
>> On Mar 10, 2004, at 18:22, ext Pat Hayes wrote:
>>
>>>> Either there should be a bootstrapping vocabulary (easier to 
>>>> introduce
>>>> since it's disjunct from the existing specs) or some attribute on 
>>>> the
>>>> <rdf:RDF> element to explicitly define assertion.
>>>
>>> Or something. I really don't see how this can possibly be done just 
>>> by using vocabulary, however. (What gets that vocabulary asserted?? 
>>> If we stipulate this semantically by saying that the vocabulary is 
>>> self-asserting, then we are going outside the RDF spec in any case.)
>>
>> Of course.
>>
>> But adding an XML attribute to indicate assertion is *also* going
>> outside the RDF spec.
>
> Well, it does something else, but it doesn't actually break the 
> current spec (does it??)

It would break implementations compliant with the current spec as they
would have to update their parser implementations.

>
>> Anything we do here will be outside the RDF spec. The goal, IMO, is
>> to do it as little outside the spec as possible, and in an way that
>> allows us to employ as much of the RDF machinery as possible.
>
> We agree on the principle, but not I think on what counts as 'as 
> little as possible'.
>
>> So, having a bootstrapping vocabulary which has a special 
>> interpretation
>> (a pre-intepretation not defined by the RDF specs) but which, after
>> such a pre-interpretation phase, remains fully valid in terms of the
>> RDF specs seems like a big win.
>>
>> Think of it as a special validation test for a graph which tests
>> assertion and authentication.
>>
>> This is similar to how OWL provides for various forms of validation
>> testing over and above what is defined by the RDF or RDFS 
>> specifications.
>> Neither RDF nor RDFS have a clue about cardinality, yet an agent
>> that understands the OWL vocabulary terms for cardinality constraints
>> can test a graph to see if it is valid per those constraints.
>
> But all RDF reasoning is still valid in OWL. OWL just adds more 
> information. If I follow your idea (Im not sure I do) then it has a 
> different quality.

As I mentioned in an earlier response, my 'bootstrapping interpretation'
is akin to a test on the graph syntax, the explicit triples in the 
graph,
per a given vocabulary. The statements in question remain true for all
RDF/OWL interpretations, but do not in RDF/OWL function to fully provide
for those tests of authenticity/assertiveness/etc.

It's like testing if an RDF graph is "well formed" versus whether
it is semantically valid. (don't know if this helps or hurts)

>
>> Agents that are savvy about the bootstrapping vocabulary and its
>> special semantics can apply their tests prior to accepting a graph.
>> If the graph is determined to be asserted and authentic per the
>> bootstrapping statements within that particular graph, then they
>> continue with full RDF/OWL interpretation of the graph.
>>
>> Other non-savvy agents simply don't know what those vocabulary
>> terms mean, and their presence in a graph interpreted by such
>> an agent are thus innocuous.
>
> Suppose that they, using valid RDF/OWL reasoning, are able to draw an 
> innocuous conclusion that would have affected the pre-processing if it 
> had been around when that pre-processing was done. What happens?

Nothing. Because, if a new statement is inferred, that statement is in
a different graph and has no affect on the bootstrapping interpretation.

>
>> Is this not exactly how RDF is meant to be extensible?
>>
>>>>
>>>> The tricky part (or maybe it's easy) and what I'd like Pat to 
>>>> comment
>>>> on is how hard (or even possible) it is to constrain the 
>>>> interpretation
>>>> of a particular property to the graph in which the statement occurs.
>>>
>>> That can be done,
>>
>> That's encouraging to hear ;-)
>>
>>> though it requires tweaking the MT in unconventional ways. I did 
>>> this for the OWL imports vocabulary but the WG decided that it was 
>>> too complicated and 'weird' to put in the spec, so they gave it an 
>>> 'operational'  spec outside the MT.
>>>
>>
>> OK. Perhaps the way to do this would be to define a similar 
>> 'operational'
>> spec, detailing the bootstrapping interpretation alone, and not 
>> having to touch
>> the RDF or OWL MTs at all.
>>
>> ???
>>
>>> But I really don't think we should do this. Allowing a graph to 
>>> assert another graph is potentially very useful and natural.
>>
>> I just have a hard time seeing this as "a good thing". I can
>> think of numerous use cases where I would want to be sure that
>> *no* other graph can affect the origin/nature/authenticity of
>> my graphs insofar as what I authoritatively state to be so for
>> my graphs.
>
> I agree, and I'm not suggesting that. But you can't stop others from 
> referring to the content in your graphs, not should you want to< i 
> suggest (its often very handy). If you give your graph a name, then 
> their use of that name (as opposed to YOUR name) only refers to that 
> content.

OK. I concede that.

I've perhaps been over-focusing on authoritative qualification of
graphs rather than arbitrary third-party qualification of graphs,
which is also OK.

My problem with the approach where a statement in graph X determines
whether graph Y is asserted is that you lose that distinction between
authoritative and third-party, as well as start chasing potentially
endless chains of statements about assertions.

Assertion and affirmation/agreement should be distinct.

Assertion is something that the owner/publisher/creator of a graph
does. Period. And for efficiently/scalability/sanity reasons, such
assertion should IMO be done in each particular graph itself, if
it is to be done using a vocabulary at all.

>
>>
>> I think it's alot safer/simpler to have graphs be "self qualifying".
>
> Well, its certainly not simpler, and I don't think its any safer, see 
> reply to earlier message.

I think you see it as not simpler/safer because you presume the 
inclusion of
functionality (e.g. reasoning) that I do not propose.

Perhaps some of my recent comments/clarifications will help you see it 
as
truly safer/simpler.


>
>>
>>> Don't think of 'being asserted' as a property of graphs: think of it 
>>> as a relationship between something and a graph. If I assert your 
>>> graph, I havnt done anything to you or your graph except borrow your 
>>> RDF. I could have gotten the same effect by importing your graph 
>>> into a blank one of mine and then asserting it.
>>
>> True. And that's the way RDF works today. But if we are to move 
>> beyond that,
>> to authentication, trust, accountability, etc. we have to have 
>> something
>> a little more robust and which preserves the integrity of each graph, 
>> until
>> such time as the graph looses its identity (by 
>> merge/modification/etc.).
>
> But we won't get anything robust enough to hang trust on by tweaking 
> with references WITHIN graphs. Its too easy to tweak content, and too 
> hard to stop it ramifying in unexpected ways through inference 
> engines.

Again. Here is where we are having a disconnect.

I am presuming that the bootstrapping intepretation will *exclude* any
inference/reasoning, taking the graph at face value.

> We will need to anchor trust in something that identifies the agent 
> outside the RDF content itself.

We anchor trust in the explicit expression of the graph itself, based
only on that subset of the RDF MT that allows us to interpret those
statements as triples, but excluding closure rules, inference, 
reasoning,
etc. which would result in more statements, and hence, a different 
graph.

We might even be able to finesse the MT of the bootstrapping 
interpretation
to be equivalent to that subset of the RDF MT which does not create any
additional statements.

> I thought a simple trick would do it, but I now think that we should 
> just punt on this, and say that it will get done somehow. All we need 
> to do is to provide the place to plug it into the semantics.

Perhaps the simple trick will work now that inference/reasoning is out 
of scope?

>
> My suggested plug-in is, if a graph claims that ex:agent asserts foo, 
> then provided that this graph can be securely traced to ex:agent, then 
> indeed it is the case that foo is asserted by ex:agent. How to do the 
> secure tracing, we leave for others to specify.

Seems like punting entirely on the only bit that's really useful...

> Maybe a future TAG group will invent a way to do that, or something. 
> At least we have focussed attention on what needs to be done (securely 
> trace a graph to its web-agent 'owner').

Seems like that's been pretty obvious for some time (at least in the 
form
of associating statements with their authoritative source -- what we 
then
add is simply that such can be done by named graphs, but it seems we 
could
do alot more...)

>
> The only real difference between us at this point is that you want the 
> self-referring graph to be the anchor, and I want the reference to the 
> agent by the agent to be the anchor. Your is easier to specify (now), 
> but I think that this is misleading as its not really specified 
> because 'publisher' of a graph isn't fully specified; and yours is 
> more 'syntactic' since graph self-reference can be recognized easily. 
> But I don't trust it; it seems ad-hoc; and mine doesn't involve any 
> RDF syntactic or semantic tweaking. Also I don't like the way that 
> yours depends so much on graph identity, which we havn't really yet 
> thought through well enough to rely on to this extent, IMO.
>
>>>> It's a little more complicated than an XML attribute, but has
>>>> that great advantage of being immediately compatible with all
>>>> RDF serializations
>>>
>>> It might be syntactically compatible, but it plays hell with the 
>>> semantics (though we could fudge round that in an acceptable way, to 
>>> be honest).
>>
>> I had hoped so ;-)
>
> Yeh, well, Im not so sure about that any more.

Maybe you'll give it one last ponder, per my recent comments... ?

>
>>
>>> And it still requires that the published documents be edited:
>>
>> Not necessarily. Legacy RDF, and those not using the new machinery,
>> can still be presumed to be asserted by agents -- with some agents
>> simply being more picky/demanding and only accepting graphs using
>> the new machinery.
>>
>> But true, if someone has a particular graph that they want to
>> qualify with the new machinery, then they need to add the 
>> bootstrapping
>> statements to the graph.
>>
>> But hey, that's not really such a big deal, since many/most graphs are
>> "living documents" anyway.
>>
>>> whereas allowing A to assert B means that we can get existing graphs 
>>> asserted without changing them at all.
>>
>> Yes, but IMO at far too high a cost, due to the potential confusion
>> that can arise and extra work to resolve it when trying to chase down
>> authentication/trust trails about which graph asserts which graph
>> which the asserts another graph, etc.
>
> We can short this out by saying that assertion isn't transitive, which 
> is intuitively correct in any case. Real trust engines will have to be 
> able to root their assertion-confidence in some way, so as to be able 
> to trace an ultimate agent.

Right, and what I'm proposing is that they do so using a special 
interpretation
of the explicit statements of the particular graph in question. In 
essence
there is an XML/syntax like mechanism masquerading as RDF statements, 
and
the graph is tested against those special statements (which may or may 
not
be useful to subsequent RDF processing, but won't hurt in any case).

>
>> Agents will always have the liberty to disregard, or guess about,
>> the intentions of the author of a graph, but I think it will be
>> critical to maintain a clear distinction between the intentions
>> of the author and the opinions or guesses of third parties.
>
> Sure, we do this very simply, by having every description of agent 
> actions referring to the agent.
>
>>
>> E.g. if I explicitly state that my graph is not asserted, and someone
>> else says it is, and that results in my being sued for slander, I
>> can simply say, sorry, I never asserted that. And the evidence
>> regarding my original intent for the graph would be explicit and
>> unambiguous.
>
> Right, exactly: but how is that possible when the only connection 
> between the graph and you is one of denotation? Anyone can refer to 
> anything.

Because the graph would be signed in such a way as to confirm that
I was the one actually asserting those statements.

>
>>
>> Also, since RDF does not provide for the preservation of graph
>> membership, such an approach would *require* that all RDF stores
>> be updated to preserve such membership,
>
> But that is completely unfeasible. We can't require this for the 
> entire SWeb: it will never fly.

Exactly! And my proposal makes no such demands. But your proposal 
whereby
different graphs assert other graphs *must* preserve graph membership
or else *no* RDF processor could then decide if a graph is or is not
asserted or by whom, etc. because all of that has to happen *after*
syndication and hence the knowledge store has to preserve graph
membership!

However, if the tests for assertion/authenticity/etc. are done per an
initial bootstrapping interpretation prior to syndication, then even
if the knowlede store doesn't preserve graph membership of statements,
it doesn't matter, since once the graph is syndicated, there's no
need for most applications.

Thus, per your agreement above, a inter-graph vocabulary approach (one
graph asserting another) which is not constrained to a per-graph basis
will never fly because it mandates changes to every knowledge store.

Thus, both XML attributes and inter-graph vocabulary are ruled out for
practical reasons.

My proposed intra-graph vocabulary approach, however, allows existing
triples-only knowledge stores to continue to be used while still
offering agents the benefits of being able to do testing on incoming
graphs for assertion/authenticity; while also allowing for new
knowledge stores which preserve graph membership of statements and
support graph naming to also support reasoning about both authoritative
and third party statements about graphs.

Eh?

>
>> otherwise the statements
>> in one graph about another graph become useless. How can you
>> know which statements are in graph A and thus asserted because
>> of a statement in graph B when you don't anymore know which
>> statements actually were in graph A or graph B?!
>
> This problem arises because you have conflated 'asserted by' with 'in 
> a (special kind of) graph'. Keep these ideas separate, and graphs are 
> both more conventional and more use. The point of naming is to be able 
> to CONVEY assertion from one graph (securely anchored and owned by an 
> agent) to another (which only needs to be securely named, and URIs 
> will probably suffice for this in all the cases except those where you 
> need a lawyer present in any case.)

It's the "securely anchored and owned by an agent" part that seems to
be key here, and yet not addressed at all by the inter-graph
approach.

>
>> With the bootstrapping vocabulary, having self qualifying graphs,
>> agents can test assertion and authenticity prior to syndication
>> into a traditional RDF triples store, and not worry about loss
>> of graph distinction in later processing/reasoning.
>
> But it requires the secure layer to use entirely different rules from 
> the rest of the Web; whereas it seems to me that the great attraction 
> of a naming scheme is that it allows the security of the trust layer 
> to refer to the big scruffy Web, without sacrificing security. So it 
> can provide a kind of trust-superstructure that gives an untrusted web 
> a kind of checkable backbone.

If your authentication machinery is expressed elsewhere than in RDF, 
then sure.

But I though we were talking about capturing that machinery *in* RDF, in
which case, you have the "dog chasing its tail" problem that I've been
trying to point out.

If every graph has to be asserted by another graph, where do you 
terminate
that sequence?

By constraining the interpretation of bootstrapping statements to each
explicit graph, we ground the authenticity of each graph. And thus
third party statements about other graphs are *also* grounded in the
authority of the *other* graph containing those third party statements.
E.g.

:X
{
   :X trix:asserted "true"^^xsd:boolean .      -> authoritative 
assertion of :X
   :X trix:assertedBy ex:Bob .                 -> authority for :X
   :X trix:signature "..." .                   -> verifiable signature 
for :X+ex:Bob
}

:Y
{
   :X trix:asserted "true"^^xsd:boolean .      -> third-party assertion 
of :X
   :Y trix:asserted "true"^^xsd:boolean .      -> authoritative 
assertion of :Y
   :Y trix:assertedBy ex:Jane .                -> authority for :Y
   :Y trix:signature "..." .                   -> verifiable signature 
for :Y+ex:Jane
}

:Z
{
   :X trix:asserted "false"^^xsd:boolean .     -> third-party 
non-assertion of :X
   :Z trix:asserted "true"^^xsd:boolean .      -> authoritative 
assertion of :Z
   :Z trix:assertedBy ex:Bill .                -> authority for :Z
   :Z trix:signature "..." .                   -> verifiable signature 
for :Z+ex:Bill
}

Note that Bill disagrees with Bob about the assertiveness of :X, but
it's clear (a) who says what and (b) which is the authoritative claim.

Here we have an architecture which allows for the testing of 
authoritative
qualifications of graphs, including 
assertion/authentication/signature/etc.
while at the same time provides for the third-party qualification of 
graphs,
which for systems preserving graph membership of statements, together
provide the foundation for deploying trust policies based on both
authoritative and third party knowledge.

The 'boostrapping interpretation' is not an extension of the RDF/OWL 
MTs,
though should be compatible with it. And bootstrapping is *only* 
meaningful
when applied to a particular graph in terms of the bootstrapping 
interpretation.
After bootstrapping, and accepting a particular graph, an agent can 
employ
RDF/OWL reasoning to apply various other trust policies (if the system
supports preservation of graph membership).

Patrick

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Friday, 12 March 2004 05:32:04 UTC