Re: Named graphs etc from Patrick Stickler on 2004-03-16 (www-archive@w3.org from March 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Tue, 16 Mar 2004 13:17:25 +0200
To: "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>
Cc: "ext Chris Bizer" <chris@bizer.de>, <www-archive@w3.org>, "ext Pat Hayes" <phayes@ihmc.us>
Message-Id: <80C4F4D1-773B-11D8-B709-000A95EAFCEA@nokia.com>

On Mar 16, 2004, at 12:30, ext Jeremy Carroll wrote:

>
>> All we have to say is that, given certain bits of information:
>>
>> 1. The URI denoting a graph
>> 2. The URI denoting an authority
>> 3. The signature associated with a graph.
>>
>> we have what we need to authenticate that graph per that authority,
>> and check if they said what the graph expresses (regardless of
>> whether they assert it).
>>
>> If the PKI machinery cannot conclude, given the above information,
>> that the graph is authentic per that authority (for whatever reason,
>> maybe a server was down or a signature expired, etc.) then that is too
>> bad for the particular agent trying to verify a graph, but doesn't
>> invalidate the basic model.
>>
>> All that matters is that we have the identity of a graph, the identity
>> of an authority, and some signature to test their valid relationship.
>>
>
> If I have understood Pat, the MT could embed such a relationship ...
>
> thus
>
> eg:g rdfg:hasSignature "...."
>
> can either have operational semantics, RDF processors may go off and
> check this, somewhat like owl:imports, or it may have formal semantics
> i.e. it is true iff the bytes are a signature of the graph according
> to some other document. I think it would be good to do the latter - a
> point which I had been missing.

We may need both operational semantics and formal semantics -- i.e.
the operational semantics tells the processor what statements are
relevant to determine if a graph is authentic and/or asserted,
and the results of that determination feed into whether a given
graph is true or false, etc.

Eh?
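
A minimal sketch of that two-step split, assuming triples are plain
Python tuples and that rdfg:assertedBy and rdfg:signature (the names used
elsewhere in this thread) are the relevant vocabulary. The function names
and the `check` callable are hypothetical placeholders for whatever
signature test is actually used; nothing here is an agreed design.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples.
RELEVANT = {"rdfg:assertedBy", "rdfg:signature"}

def relevant_statements(graph_name, triples):
    """Operational step: pick out the statements about the graph itself."""
    return [t for t in triples if t[0] == graph_name and t[1] in RELEVANT]

def determine(graph_name, triples, check):
    """Feed the selected statements to a caller-supplied check."""
    found = {p: o for (_, p, o) in relevant_statements(graph_name, triples)}
    authority = found.get("rdfg:assertedBy")
    signature = found.get("rdfg:signature")
    if authority is None or signature is None:
        return False  # nothing to test, so no determination is possible
    return bool(check(graph_name, authority, signature))

triples = [
    ("eg:g", "rdfg:assertedBy", "eg:alice"),
    ("eg:g", "rdfg:signature", "abc123"),
    ("eg:doc", "dc:title", "Example"),
]
# The lambda stands in for a real signature test.
accepted = determine("eg:g", triples, lambda g, a, s: s == "abc123")
```

The result of `determine` is what would then feed into whether the graph
is treated as asserted; how it feeds into truth or falsity is left open.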

>
>
>>
>> Thus, insofar as the RDF/OWL MTs are concerned, some graph where
>>
>>     ?graph ( ?graph rdfg:assertedBy ?authority .
>>              ?graph rdfg:signature  ?signature . )
>>
>> will not be automagically asserted or authenticated.
>>
>
> A nice thing about signatures is that the second triple cannot be
> automagically generated (well not correctly), because the literal
> cannot be spoofed. Thus embedding it in the formal semantics would
> allow me to do
>
>   my:sig rdfs:subPropertyOf rdfg:signature
>
> and then use my:sig instead of rdfg:signature, and everything would be
> kosher and this would be robust against forged signatures (thanks to
> the crypto technology).

Right.

In fact, rdfg:signature could simply have a formal semantics
defined, and then specific subproperties could have operational
semantics defined, such that the subproperty differentiates the
particular type of signature being used and what the processor
has to do to validate it.

???
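
To make that concrete, here is a rough sketch, under stated assumptions,
of a processor that recognises any subproperty of rdfg:signature and
looks up a validation procedure per subproperty. The property
my:sha256Digest, the registry dictionaries and the function names are
invented for illustration, and a bare SHA-256 digest is only a stand-in
for a real signature scheme.

```python
# Hypothetical sketch: property names other than rdfg:signature and
# rdfs:subPropertyOf are invented for illustration.
import hashlib

# Child property -> parent property (rdfs:subPropertyOf links).
SUBPROPERTY_OF = {
    "my:sha256Digest": "rdfg:signature",
}

def is_signature_property(prop):
    """Follow rdfs:subPropertyOf links up to rdfg:signature."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == "rdfg:signature":
            return True
        seen.add(prop)
        prop = SUBPROPERTY_OF.get(prop)
    return False

def check_sha256(graph_bytes, literal):
    # Stand-in only: a bare digest is not a real signature.
    return hashlib.sha256(graph_bytes).hexdigest() == literal

# Operational semantics live here, one procedure per subproperty.
VALIDATORS = {"my:sha256Digest": check_sha256}

def validate(prop, graph_bytes, literal):
    if not is_signature_property(prop):
        raise ValueError(f"{prop} is not a kind of rdfg:signature")
    validator = VALIDATORS.get(prop)
    if validator is None:
        return None  # known to be a signature, but no procedure defined
    return validator(graph_bytes, literal)

data = b"eg:a eg:b eg:c ."
literal = hashlib.sha256(data).hexdigest()
result = validate("my:sha256Digest", data, literal)
```

Note that rdfg:signature itself has no entry in `VALIDATORS`: in this
sketch it carries only the formal meaning, and `validate` returns None
for it rather than guessing at a procedure.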

>
>
>
>> I think we agree here, but are having a disconnect of focus.
>
> That's my impression too.
>
>>
>> So, on the one hand, we have certain claims being expressed in
>> the various graphs. Some of those claims/statements provide some
>> information by which the authenticity of those claims can be
>> tested. Since we are interpreting those claims as valid/asserted
>> claims in order to actually test those claims, it is a form
>> of "bootstrapping".
>>
>> Ultimately, if the tests fail, then we reject those claims as
>> invalid or untrustworthy -- essentially as not being claims at
>> all, just noise.
>>
>> Yes?
>
> That's what I understand.

Great. Light at the end of the tunnel (let's hope the tunnel doesn't
suddenly get longer... ;-)

>
>>> Indeed. But we can extend the MT to give you a real place to
>>> terminate. I thought that was what you wanted me in on the project to
>>> do :-)
>>
>> Naahhh. We were just bored and wanted some excitement... ;-)
>
> I don't think I was particularly aiming at one thing or another .. the
> TriX paper was weak because it presented a syntax with no semantics,
> and it certainly is good to have Pat on board to help with that step -
> I am still not committed to any particular solution there though.
>
>>
>> Really, though, what we do want is *some* MT (either distinct from or
>> an extension to the RDF MT) which provides for the special
>> intra-graph interpretations needed to bootstrap the assertion and
>> authentication per statements in the graph itself.
>
> We need some thought at the semantic/MT level and some thought at the
> operational level - I am not trying to prejudge what the answer is. I
> think we can get overly concerned about the termination and grounding
> problems.

I think I'm probably pushing for more of a tangible solution than
the rest of you, due I'm sure to my practical "build it so it will work"
mentality.

Not that everything has to be fully cooked, but I still would like
to see all the ingredients in there somehow.

>
>
>>> Well it was that layer of preprocessing stuff that seemed
>>> problematic, for the reasons I suggested. Suppose, to take a very
>>> simple example, you have OWL statements that a class C has
>>> cardinality one and that ex:thisURI and ex:thatURI are both in it and
>>> that ex:thisURI is the name of a graph, and that ex:thatURI is
>>> asserted. It follows that the graph is asserted, but you won't know
>>> that by inspecting the URIs unless you are very OWL-savvy. Now
>>> suppose that the graph doesn't have the cardinality info in it but
>>> you discover it a month later. Now make the reasoning arbitrarily
>>> more complicated.
>>
>> Right. OK.
>>
>> So different agents will be able to make different determinations
>> about certain graphs depending on their ability/inability to do OWL
>> reasoning.
>>
>> But is that really breaking anything (as opposed to simply making
>> things more complicated for certain agents -- which OWL does anyway ;-)
>
> It seems to me that if we adequately articulate the bootstrapping
> problem then someone who chooses to publish their statements of
> assertion and statements of signatures in a way that is difficult to
> understand will narrow their audience. But that happens all the time -
> e.g. academics will throw insults at one another in footnotes that
> require a load of processing to understand - and part of that is to
> narrow the audience. It might even be useful - e.g. if I want to
> publish my e-mail address I can put an arbitrarily complicated OWL
> derivation in order to get it, this might act as a block on spammers
> (of course, they already know my e-mail address). Anyway, the point is
> that the vocabulary does not intrinsically need to be used in a
> special way to enable bootstrapping, but merely the more
> straightforward usage will be understood more clearly.

Hmmmm... I'm not sure I 100% agree with this, but let's pretend for the
moment that I do ;-)


>>>>
>>>> Defining the interpretation/testing of that special information,
>>>> expressed as statements in the graph, need not intersect nor impact
>>>> the RDF or OWL MTs.
>>>
>>> The issue is how to STOP it being involved with those MTs. I don't
>>> see how that would be possible.
>>
>> Well, my original idea was that agents would be able to consider
>> graphs in terms of a specialized, narrower MT than RDF/OWL which
>> was just sufficient to allow them to make determinations about
>> assertion and authenticity per the special vocabulary.
>>
>> I.e. the special MT wouldn't presume the full RDF/OWL MTs.
>>
>
> I think we can work on an orthogonal extension, which can work with
> RDFS or OWL - probably not worth the effort to consider RDF (without S).
>
>> Sort of like having a zoom lens on a camera. To test
>> assertion/authenticity, you zoom in to apply a narrow specialized MT,
>> and then for the rest of your processing (if satisfied with the tests
>> of assertion/authenticity) you zoom out to apply the wider RDF/OWL MTs.
>>
>> The statements you zoomed in on for the narrow shot are still there
>> in the wider shot, but some "special" detail may simply not be visible
>> from the wider view.
>>
>> Just a thought...
>
> I see that as a publisher's choice.

Well, it's a publisher's choice what machinery they choose to use
to indicate assertion/authenticity -- but ideally there would be
a well defined model/methodology to do so which most publishers
and agents would both use -- and that requires a reasonable
definition of how those "bootstrapping" interpretations are done.

As shown in numerous examples, a bunch of statements and the RDF
and OWL MTs don't get you there. You end up either with the
chicken/egg question (how can a graph that is not asserted contain
a statement that asserts it) or the authenticity question (how do
we know that the authority of a graph as identified in a graph
actually is the origin of the graph).

I think what we need to do is to (eventually) provide a model
that publishers will want to use because it provides useful
answers to the above two questions.
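
One way such a model might address both questions, sketched under loud
assumptions: the agent provisionally reads the graph's own
rdfg:assertedBy and rdfg:signature statements, tests them, and only then
treats the graph as asserted. The key registry, the canonical
serialisation and all function names below are hypothetical, and the
HMAC with a shared secret is merely a stand-in for the PKI machinery
discussed earlier in this thread.

```python
import hashlib
import hmac

# Hypothetical key registry: authority URI -> secret. A real deployment
# would use public-key certificates rather than shared secrets.
KEYS = {"eg:alice": b"alice-secret"}

def canonical(triples):
    """Serialise the non-signature triples in a stable order."""
    body = sorted(t for t in triples if t[1] != "rdfg:signature")
    return "\n".join(" ".join(t) for t in body).encode("utf-8")

def sign(authority, triples):
    """Compute the stand-in signature an authority would attach."""
    key = KEYS[authority]
    return hmac.new(key, canonical(triples), hashlib.sha256).hexdigest()

def bootstrap(name, triples):
    """Provisionally read the graph's own claims, then test them."""
    claims = {p: o for (s, p, o) in triples if s == name}
    authority = claims.get("rdfg:assertedBy")
    signature = claims.get("rdfg:signature")
    if authority not in KEYS or signature is None:
        return False  # cannot be tested: treat the claims as noise
    expected = sign(authority, triples)
    return hmac.compare_digest(expected, signature)

graph = [
    ("eg:g", "rdfg:assertedBy", "eg:alice"),
    ("eg:s", "eg:p", "eg:o"),
]
signed = graph + [("eg:g", "rdfg:signature", sign("eg:alice", graph))]
forged = graph + [("eg:g", "rdfg:signature", "0" * 64)]

signed_ok = bootstrap("eg:g", signed)
forged_ok = bootstrap("eg:g", forged)
```

The chicken/egg question is sidestepped here by reading the special
statements before the graph counts as asserted, and the authenticity
question by refusing any graph whose signature does not match the named
authority.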

Patrick


--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Tuesday, 16 March 2004 06:18:33 UTC