Re: RDF named graph use case and requirement from Graham Klyne on 2011-09-23 (public-prov-wg@w3.org from September 2011)

From: Graham Klyne <GK@ninebynine.org>
Date: Fri, 23 Sep 2011 07:26:34 +0100
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
CC: W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <4E7C269A.1000606@ninebynine.org>
Luc,

On 22/09/2011 07:48, Luc Moreau wrote:
 > Hi Graham,
 >
 > I disagree. The problem is not specific to named graphs at all. You can't
 > prevent people from using the same name for different things (eg stateful resource).


It's an RDF thing.

You say: "You can't prevent people from using the same name for different 
things".  In RDF, wherever you use the same name (i.e. URI or literal or bnode) 
in a graph, it *always* denotes the same resource.  That much is fixed by the 
RDF formal semantics.

...

If I say

   :a :description "description of something" .
   :b :description "description of something" .

then we have an RDF graph with 3 (not 4) nodes.  The strings denote the same 
thing (a string) in the domain of interpretation.  That's how RDF semantics works.

Thus both :a and :b have the *same* value for their :description property.
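
As a rough illustration of the node count, here is a plain-Python sketch. It
uses no RDF library; the `Literal` class and the `nodes` helper are made-up
names for this example, and the only assumption is that two literals with the
same text compare equal, mirroring the way they denote the same value:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a plain literal: equal text, equal node."""
    text: str


def nodes(triples):
    """Collect the distinct subject and object nodes of a set of triples."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


graph = {
    (":a", ":description", Literal("description of something")),
    (":b", ":description", Literal("description of something")),
}

node_set = nodes(graph)
print(len(node_set))  # 3: the nodes :a and :b plus one shared literal
```

The two literal objects collapse into a single set member, which is the
"3 (not 4)" point made above.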

If one wants to be able to treat these as two different descriptions made by 
different people, even if they have the same text, then one must choose a 
different modelling approach in RDF, such as:

   :a :description [ :hasText "description of something" ] .
   :b :description [ :hasText "description of something" ] .

or, not using bnodes:

   :a :description :adesc .
   :adesc :hasText "description of something" .
   :b :description :bdesc .
   :bdesc :hasText "description of something" .

to give an RDF graph with 5 nodes.  In this case, although the description texts 
are the same, there are two distinct description nodes in the RDF graph.
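
The same count can be checked with a small plain-Python sketch. This is only
an illustration under simple assumptions: triples are tuples, named nodes are
strings, and the literal is a tagged tuple (`TEXT`, a made-up name) so that
both uses of the text are one and the same value:

```python
def nodes(triples):
    """Collect the distinct subject and object nodes of a set of triples."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


# Hypothetical representation of the literal; both uses share one value.
TEXT = ("literal", "description of something")

graph = {
    (":a", ":description", ":adesc"),
    (":adesc", ":hasText", TEXT),
    (":b", ":description", ":bdesc"),
    (":bdesc", ":hasText", TEXT),
}

node_set = nodes(graph)
descriptions = {o for _, p, o in graph if p == ":description"}

print(len(node_set))      # 5: :a, :adesc, :b, :bdesc and the shared literal
print(len(descriptions))  # 2: the description nodes stay distinct
```

Properties such as an author or a date could now be attached to `:adesc` and
`:bdesc` separately, even though their text is the same literal.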

But we need to be clear that, as far as RDF is concerned, these are both valid 
modelling choices, and it can be important to be aware of the consequences, and 
how they interact with the semantics of RDF in general, and literals in particular.

So, going back to the original message, all I'm saying is that if the RDF group 
choose a "graphs as literals" approach, this may require us to adopt a different 
RDF modelling style for provenance than if they choose, say, an RDF dataset 
model based on the SPARQL dataset specification.
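
To make the difference concrete, here is a hedged plain-Python sketch of the
use case from the original message. It is not a model of any actual RDF or
SPARQL specification text: a "graph literal" is approximated by a frozenset
that is identified by its content alone, the named alternative by separate
names `:Pra` and `:Prb`, and the triple inside `content` as well as the
helpers `describe` and `can_infer` are invented for the example:

```python
def describe(statements):
    """Group (property, value) pairs by the node they are attached to."""
    by_node = {}
    for node, prop, value in statements:
        by_node.setdefault(node, set()).add((prop, value))
    return by_node


def can_infer(by_node, stated_by, on_date):
    """True if some single node carries both the given statedBy and onDate."""
    wanted = {("statedBy", stated_by), ("onDate", on_date)}
    return any(wanted <= props for props in by_node.values())


# Hypothetical identical triple set asserted by both observers.
content = frozenset({(":R", ":wasDerivedFrom", ":S")})

# Graphs as literals: the node is its content, so Pra and Prb coincide.
literal_style = describe([
    (content, "statedBy", "A"), (content, "onDate", "2011-09-19"),
    (content, "statedBy", "B"), (content, "onDate", "2011-09-23"),
])

# Graphs identified by distinct names, whatever their content.
named_style = describe([
    (":Pra", "statedBy", "A"), (":Pra", "onDate", "2011-09-19"),
    (":Prb", "statedBy", "B"), (":Prb", "onDate", "2011-09-23"),
])

print(can_infer(literal_style, "A", "2011-09-23"))  # True: the unwanted mix
print(can_infer(named_style, "A", "2011-09-23"))    # False
```

With content-identified graphs the four statements land on one node, so the
false combination can be read off; with distinct names it cannot.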

#g
--

On 22/09/2011 07:48, Luc Moreau wrote:
> Hi Graham,
>
> I disagree. The problem is not specific to named graphs at all. You can't prevent people from using the same name for different things (eg stateful resource).
>
> you need to organise your metadata appropriately.
>
> Given the representation you have chosen here, I find the inference valid ... though undesirable.
>
> Professor Luc Moreau
> Electronics and Computer Science
> University of Southampton
> Southampton SO17 1BJ
> United Kingdom
>
>
> On 22 Sep 2011, at 06:22, "Graham Klyne"<GK@ninebynine.org>  wrote:
>
>> Because if the graphs were not distinct, properties could not be applied to them separately.
>>
>> #g
>> --
>>
>> On 21/09/2011 22:07, Luc Moreau wrote:
>>> Hi Graham,
>>>
>>> Why is this a requirement on named graphs and not on their metadata?
>>>
>>> Didn't you want to encode
>>>
>>> (by A and on d1) or (by B and on d2)
>>>
>>> but you seem to have encoded it as
>>>
>>> (by A and on d1) and (by B and on d2)
>>>
>>> which seems to allow the inference you describe.
>>>
>>>
>>> I think you have made the case for provenance to be put in a container (see latest spec). In this example you would need two separate containers, to avoid mix and match of provenance statements.
>>>
>>> Professor Luc Moreau
>>> Electronics and Computer Science
>>> University of Southampton
>>> Southampton SO17 1BJ
>>> United Kingdom
>>>
>>>
>>> On 21 Sep 2011, at 16:51, "Graham Klyne"<GK@ninebynine.org>   wrote:
>>>
>>>> (I've also posted this summary at http://www.w3.org/2011/prov/wiki/ProvenanceRDFNamedGraph#Requirement_from_discussion_with_Andy_Seaborne)
>>>>
>>>> In a meeting with Andy Seaborne this morning, we discussed provenance requirements and RDF named graphs, in light of some options that the RDF group might be considering.
>>>>
>>>> The resulting requirement that we articulated was that for the purposes of provenance, we must be able to treat two "named" graphs with identical graph content as two distinct entities.
>>>>
>>>> ...
>>>>
>>>> The use-case is this:
>>>>
>>>> Suppose we have some resource R.
>>>>
>>>> Observer A makes a provenance assertion about R on Monday 2011-09-19, which is expressed as an RDF graph Pra
>>>>
>>>> Observer B makes a provenance assertion about R on Friday 2011-09-23, expressed as RDF graph Prb
>>>>
>>>> To express provenance about the provenance assertions, we may wish to say:
>>>>
>>>> Pra statedBy A; onDate "2011-09-19" .
>>>>
>>>> Prb statedBy B; onDate "2011-09-23" .
>>>>
>>>> It may be that the provenance assertions Pra and Prb have identical content; i.e. they are RDF graphs containing identical triple sets.  For the purposes of provenance recording, it is important that even when they express the same graphs, Pra and Prb are distinct RDF nodes.  If Pra and Prb are treated as a common RDF node, one might then infer:
>>>>
>>>> _:something statedBy A ; onDate "2011-09-23" .
>>>>
>>>> which in this scenario would be false.
>>>>
>>>> .....
>>>>
>>>> A particular consequence of this is that an RDF "named graph" specification based on graph literals (where RDF literals are self-denoting), somewhat like formulae in Notation 3, would have to be used with care.  That is, if Pra and Prb are graph literals, then Pra = Prb, and the given provenance-of-provenance statements could not be expressed as suggested above.
>>>>
>>>> (This does not preclude a graph literal approach being used, but the above use-case might need to be constructed slightly differently.)
>>>>
>>>> #g
>>>> --
>>>>
>>>
>>>
>
>
Received on Friday, 23 September 2011 07:13:34 UTC