Re: #rdfms-identity-anon-resources: provenance from Graham Klyne on 2001-07-25 (w3c-rdfcore-wg@w3.org from July 2001)

From: Graham Klyne <Graham.Klyne@Baltimore.com>
Date: Wed, 25 Jul 2001 17:17:28 +0100
To: Brian McBride <bwm@hplb.hpl.hp.com>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <5.1.0.14.2.20010725163726.034270f0@joy.songbird.com>

Brian,

You said:
>To represent exactly the information received, anon resources must
>be a part of the model/abstract syntax.  Yes? No?

I think that to "represent exactly the information received" the entire 
RDF/XML document received must be represented verbatim.  I don't think 
we're going there.

In defining RDF, and in particular its semantics, we are in the process of 
identifying the significant information in a received RDF/XML document.  We 
are making choices about what constitutes "information received" (guided by 
the existing XML and RDF M&S specs).

Signatures present an interesting case, because the most trivial change 
(e.g. adding or removing a space in an XML comment) changes/invalidates the 
signature.  So if we mean to convey the exact meaning of a signature (i.e. 
that a certain key signed a certain sequence of bits) we must keep a copy 
of the bits (or at least a copy of the digest of the bits that was signed).
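
To make the fragility concrete, here is a minimal sketch. It assumes a bare SHA-256 digest over the raw bytes, with no XML canonicalization and no real signature scheme; the two document strings are hypothetical. It only shows that one extra space inside an XML comment yields a different digest, so a signature over the first digest would not verify against the second document.

```python
import hashlib

# Two hypothetical RDF/XML fragments that differ only by one space
# inside an XML comment. The RDF content is identical.
doc_a = b'<rdf:Description><!-- note --><foo:bar>foobar</foo:bar></rdf:Description>'
doc_b = b'<rdf:Description><!-- note  --><foo:bar>foobar</foo:bar></rdf:Description>'


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the exact byte sequence."""
    return hashlib.sha256(data).hexdigest()


# A signature binds a key to a digest of specific bits, so any change
# to the bits gives a digest the original signature does not cover.
digests_differ = digest(doc_a) != digest(doc_b)
print(digests_differ)
```

Keeping either the bytes or the signed digest alongside the interpreted content is what lets a recipient later show what was actually signed.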

When one takes a signed XML document and interprets it as RDF, I think one 
needs to make some choices about what it means, which will be some function of:
(a) the information conveyed by the bare RDF, and
(b) the information conveyed by the signature

So suppose I receive a statement:

(1)   _:exists-X hasProperty Y .

which is signed by Alice.  Also suppose that I apply Skolemization to 
represent this information:

(2)   skolem:12345 hasProperty Y .

where I am certain that nobody will independently use skolem:12345 to 
represent any resource.  Now I am not entitled to say that the statement 
(2) was signed by Alice, but I might legitimately say that I received some 
document signed by Alice whose content was interpreted to mean (2).
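
A rough sketch of that bookkeeping follows. Everything here is an assumption made for illustration: the `skolemize` helper, the `Interpretation` record, the `skolem:` prefix with a simple counter, and the `received_bytes` stand-in for the signed document. The point is only that the signature stays attached to a digest of the received bits, while the Skolemized triples are recorded as the recipient's interpretation rather than as something Alice signed.

```python
import hashlib
from dataclasses import dataclass
from itertools import count


def skolemize(triples, prefix="skolem:", start=12345):
    """Replace each blank node label (_:name) with a fresh constant.

    The same blank node always maps to the same constant. Uniqueness of
    the generated names beyond this call is assumed, not enforced.
    """
    counter = count(start)
    mapping = {}

    def term(t):
        if t.startswith("_:"):
            if t not in mapping:
                mapping[t] = f"{prefix}{next(counter)}"
            return mapping[t]
        return t

    return [(term(s), p, term(o)) for (s, p, o) in triples], mapping


@dataclass(frozen=True)
class Interpretation:
    """'I received a document signed by `signer`, interpreted to mean `triples`.'"""
    signer: str
    signed_digest: str  # digest of the bits that were actually signed
    triples: tuple      # the recipient's reading, not the signed content


# Hypothetical stand-in for the signed bytes carrying statement (1).
received_bytes = b"_:exists-X hasProperty Y ."
received = [("_:exists-X", "hasProperty", "Y")]

skolemized, mapping = skolemize(received)
record = Interpretation(
    signer="Alice",
    signed_digest=hashlib.sha256(received_bytes).hexdigest(),
    triples=tuple(skolemized),
)
print(record.triples)
```

The record deliberately does not claim that Alice signed the `skolem:` triple; it only ties her signature to the digest and notes how the content was read.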

When I sign a cheque, it is a piece of paper that I sign.  When the cheque 
is presented for payment, the bank interpret the signature to mean that I 
have authorized the payment of some amount to some person.  If someone 
writes a telephone number in the margin of the cheque, the exact content of 
the document is changed, but the authorization conveyed by the signature is 
not.  Thus, any signature as well as the signed document need to be 
interpreted.

A better cheque-based example might be that some banking transactions no 
longer require that the original signed document be presented to the 
payer's bank.  The signed cheque can be interpreted into some system, and 
then transferred electronically to the paying bank.  Thus, the paying bank 
receives a different document, but it conveys the same authorization 
(modulo adequate trust and security mechanisms).

So when you say:
>My point here is that the abstract model should be able to represent
>*exactly* the information that SOURCE intended to sign.  Any subsequent
>inferences are the responsibility of the recipient.

I would say that "information ... intended to sign" requires some degree of 
interpretation.  How much is a choice to be made.

#g
--

At 11:18 AM 7/24/01 +0100, Brian McBride wrote:


>pat hayes wrote:
> >
> > >I was chatting with a colleague to day and the subject of anon resources
> > >and provenance came up.
> > >
> > >Assume I receive the following rdf from some source SOURCE, possibly with a
> > >digital signature:
> > >
> > >  <rdf:Description>
> > >    <foo:bar>foobar</foo:bar>
> > >  </rdf:Description>
> > >  <rdf:Description rdf:about="http://goo">
> > >    <foo:bar>foobar</foo:bar>
> > >  </rdf:Description>
> > >
> > >If I were to represent this in my machine as:
> > >
> > >  <gensym:1234> <foo:bar> "foobar" .
> > >  <http://goo>  <foo:bar> "foobar" .
> > >
> > >this is not the information that SOURCE signed - SOURCE made no assertion
> > >about <gensym:1234>.  We'd have to be able to distinguish between the
> > >gensym'd URI and the 'real' one.
> >
> > As I said a while ago,
>
>I'm struggling slowly to understand.  My apologies if I miss the
>significance of some remarks first time around.
>
> > it matters who generates the gensym. If SOURCE
> > uses an anonymous node (existential quantifier) then indeed you are
> > not entitled to replace it with a gensym and say that it represents
> > what SOURCE said.
>
>The syntax clearly allows anonymous resources.  I think you have just
>said that a parser, which on receiving RDF/XML containing anon
>resources, assigned URI's to the anon resources when spitting out the
>triples would not be representing exactly the information it had
>as input.  Are we agreeing on this?
>
>To represent exactly the information received, anon resources must
>be a part of the model/abstract syntax.  Yes? No?
>
>
> > That is what the logic says also: the step from the
> > existential to the skolem form is *not* a valid inference, so you
> > would not be correct in saying that conclusion followed from what
> > SOURCE said. However, SOURCE could have replaced his existential with
> > a skolem constant
>
>Could have, but didn't.  I suggest the requirement here is that
>the abstract syntax/model be able to represent *exactly* the information
>conveyed in the syntax.
>
> > ( a newly generated URI, guaranteed different from
> > any other URI anywhere else) and, my claim is, that you would have
> > been told virtually the same content by SOURCE under those
> > circumstances as you would have been by SOURCE telling you his
> > existential. (Not quite *exactly* the same, since you would know, in
> > addition to the fact that the thing exists, that SOURCE was using
> > this URI to refer to it. But that information wouldnt help you make
> > any inferences about the thing that you couldnt have made from the
> > existential.) But indeed, if someone signs a document then it isnt,
> > in general, correct to transfer that signature to another document
> > that differs even in the slightest respect, so the issue of whether
> > the inferences are valid or not may be irrelevant here.
>
>My point here is that the abstract model should be able to represent
>*exactly* the information that SOURCE intended to sign.  Any subsequent
>inferences are the responsibility of the recipient.
>
>Brian

------------------------------------------------------------
Graham Klyne                    Baltimore Technologies
Strategic Research              Content Security Group
<Graham.Klyne@Baltimore.com>    <http://www.mimesweeper.com>
                                 <http://www.baltimore.com>
------------------------------------------------------------
Received on Wednesday, 25 July 2001 12:22:10 UTC