Re: RDF named graph use case and requirement from Luc Moreau on 2011-09-23 (public-prov-wg@w3.org from September 2011)

From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
Date: Fri, 23 Sep 2011 10:22:56 +0100
To: public-prov-wg@w3.org
Message-ID: <EMEW3|cb7ee3903d5ba6ac8a377da377cda20fn8MANq08l.moreau|ecs.soton.ac.uk|4E7C4FF0>

Hi Graham,
Thanks for the clear explanation.

While in RDF the same name *always* denotes the *same* resource, it is not
necessarily the case that the resource itself remains unchanged.

I am saying that for provenance we already have to deal with resources
that change, despite their name remaining the same. Should the RDF WG opt
for representing graphs with the same assertions as a single node, it seems
to me that this does not change anything for us, from a provenance
perspective.

Cheers,
Luc


On 23/09/2011 07:26, Graham Klyne wrote:
> Luc,
>
> On 22/09/2011 07:48, Luc Moreau wrote:
> > Hi Graham,
> >
> > I disagree. The problem is not specific to named graphs at all. You
> > can't prevent people from using the same name for different things (eg
> > stateful resource).
>
>
> It's an RDF thing.
>
> You say: "You can't prevent people from using the same name for 
> different things".  In RDF, wherever you use the same name (i.e. URI 
> or literal or bnode) in a graph, it *always* denotes the same 
> resource.  That much is fixed by the RDF formal semantics.
>
> ...
>
> If I say
>
>   :a :description "description of something" .
>   :b :description "description of something" .
>
> then we have an RDF graph with 3 (not 4) nodes.  The strings denote 
> the same thing (a string) in the domain of interpretation.  That's how 
> RDF semantics works.
>
> Thus both :a and :b have the *same* value for their :description 
> property.
>
> If one wants to be able to treat these as two different descriptions 
> made by different people, even if they have the same text, then one 
> must choose a different modeling approach in RDF, such as:
>
>   :a :description [ :hasText "description of something" ] .
>   :b :description [ :hasText "description of something" ] .
>
> or, not using bnodes:
>
>   :a :description :adesc .
>   :adesc :hasText "description of something" .
>   :b :description :bdesc .
>   :bdesc :hasText "description of something" .
>
> to give an RDF graph with 5 nodes.  In this case, although the 
> description texts are the same there are two distinct description 
> nodes in the RDF graph.
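
The node counts above can be checked with a minimal sketch. It assumes a graph is modelled as a plain set of (subject, predicate, object) triples; the `Literal` type and the `nodes` helper are hypothetical names introduced only for illustration, not part of any RDF library. Equal literals compare equal, so they collapse into a single node, which is the behaviour being described.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF string literal (equal text => same node)."""
    text: str


def nodes(graph):
    """Return the set of distinct nodes (subjects and objects) in a set of triples."""
    return {t[0] for t in graph} | {t[2] for t in graph}


# First modelling choice: both resources point at the same literal value.
shared_literal = {
    (":a", ":description", Literal("description of something")),
    (":b", ":description", Literal("description of something")),
}

# Second modelling choice: each description is its own named node.
separate_descriptions = {
    (":a", ":description", ":adesc"),
    (":adesc", ":hasText", Literal("description of something")),
    (":b", ":description", ":bdesc"),
    (":bdesc", ":hasText", Literal("description of something")),
}

print(len(nodes(shared_literal)))         # 3
print(len(nodes(separate_descriptions)))  # 5
```

In the second graph the two description nodes stay distinct even though the text they carry is one shared literal.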
>
> But we need to be clear that, as far as RDF is concerned, these are 
> both valid modelling choices, and it can be important to be aware of 
> the consequences, and how they interact with the semantics of RDF in 
> general, and literals in particular.
>
> So, going back to the original message, all I'm saying is that if the 
> RDF group choose a "graphs as literals" approach, this may require us 
> to adopt a different RDF modelling style for provenance than if they 
> choose, say, an RDF dataset model based on the SPARQL dataset 
> specification.
>
> #g
> -- 
>
> On 22/09/2011 07:48, Luc Moreau wrote:
>> Hi Graham,
>>
>> I disagree. The problem is not specific to named graphs at all. You 
>> can't prevent people from using the same name for different things 
>> (eg stateful resource).
>>
>> You need to organise your metadata appropriately.
>>
>> Given the representation you have chosen here, I find the inference 
>> valid ... though undesirable.
>>
>> Professor Luc Moreau
>> Electronics and Computer Science
>> University of Southampton
>> Southampton SO17 1BJ
>> United Kingdom
>>
>>
>> On 22 Sep 2011, at 06:22, "Graham Klyne"<GK@ninebynine.org>  wrote:
>>
>>> Because if the graphs were not distinct, properties could not be 
>>> applied to them separately.
>>>
>>> #g
>>> -- 
>>>
>>> On 21/09/2011 22:07, Luc Moreau wrote:
>>>> Hi Graham,
>>>>
>>>> Why is this a requirement on named graphs and not on their metadata?
>>>>
>>>> Didn't you want to encode
>>>>
>>>> (by A and on d1) or (by B and on d2)
>>>>
>>>> but you seem to have encoded it as
>>>>
>>>> (by A and on d1) and (by B and on d2)
>>>>
>>>> which seems to allow the inference you describe.
>>>>
>>>>
>>>> I think you have made the case for provenance to be put in a 
>>>> container (see latest spec). In this example you would need 
>>>> two separate containers, to avoid mix and match of provenance 
>>>> statements.
>>>>
>>>> Professor Luc Moreau
>>>> Electronics and Computer Science
>>>> University of Southampton
>>>> Southampton SO17 1BJ
>>>> United Kingdom
>>>>
>>>>
>>>> On 21 Sep 2011, at 16:51, "Graham Klyne"<GK@ninebynine.org>   wrote:
>>>>
>>>>> (I've also posted this summary at 
>>>>> http://www.w3.org/2011/prov/wiki/ProvenanceRDFNamedGraph#Requirement_from_discussion_with_Andy_Seaborne) 
>>>>>
>>>>>
>>>>> In a meeting with Andy Seaborne this morning, we discussed 
>>>>> provenance requirements and RDF named graphs, in light of some 
>>>>> options that the RDF group might be considering.
>>>>>
>>>>> The resulting requirement that we articulated was that for the 
>>>>> purposes of provenance, we must be able to treat two "named" 
>>>>> graphs with identical graph content as two distinct entities.
>>>>>
>>>>> ...
>>>>>
>>>>> The use-case is this:
>>>>>
>>>>> Suppose we have some resource R.
>>>>>
>>>>> Observer A makes a provenance assertion about R on Monday 
>>>>> 2011-09-19, which is expressed as an RDF graph Pra
>>>>>
>>>>> Observer B makes a provenance assertion about R on Friday 
>>>>> 2011-09-23, expressed as RDF graph Prb
>>>>>
>>>>> To express provenance about the provenance assertions, we may wish 
>>>>> to say:
>>>>>
>>>>> Pra statedBy A; onDate "2011-09-19" .
>>>>>
>>>>> Prb statedBy B; onDate "2011-09-23" .
>>>>>
>>>>> It may be that the provenance assertions Pra and Prb have 
>>>>> identical content; i.e. they are RDF graphs containing identical 
>>>>> triple sets.  For the purposes of provenance recording, it is 
>>>>> important that even when they express the same graphs, Pra and Prb 
>>>>> are distinct RDF nodes.  If Pra and Prb are treated as a common 
>>>>> RDF node, one might then infer:
>>>>>
>>>>> _:something statedBy A ; onDate "2011-09-23" .
>>>>>
>>>>> which in this scenario would be false.
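
The unwanted inference can be illustrated with a small sketch, under the assumption that the provenance-of-provenance statements are held as a set of triples. The helper names `describe` and `some_node_has`, and the merged node name "G", are hypothetical and chosen only for this example.

```python
def describe(graph_names):
    """Build the provenance-of-provenance triples, given a node name for each graph."""
    return {
        (graph_names["Pra"], "statedBy", "A"),
        (graph_names["Pra"], "onDate", "2011-09-19"),
        (graph_names["Prb"], "statedBy", "B"),
        (graph_names["Prb"], "onDate", "2011-09-23"),
    }


def some_node_has(triples, stated_by, on_date):
    """True if one single node carries both the given statedBy and onDate values."""
    subjects = {s for s, _, _ in triples}
    return any(
        (s, "statedBy", stated_by) in triples and (s, "onDate", on_date) in triples
        for s in subjects
    )


# Pra and Prb kept as two distinct nodes.
distinct = describe({"Pra": "Pra", "Prb": "Prb"})

# Pra and Prb treated as one common node "G" because their content is identical.
merged = describe({"Pra": "G", "Prb": "G"})

print(some_node_has(distinct, "A", "2011-09-23"))  # False
print(some_node_has(merged, "A", "2011-09-23"))    # True
```

With distinct nodes the mixed statement cannot be derived; once the two graphs share a node, all four properties attach to it and the false combination becomes derivable.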
>>>>>
>>>>> .....
>>>>>
>>>>> A particular consequence of this is that an RDF "named graph" 
>>>>> specification based on graph literals (where RDF literals are 
>>>>> self-denoting), somewhat like formulae in Notation 3, would have 
>>>>> to be used with care.  That is, if Pra and Prb are graph literals, 
>>>>> then Pra = Prb, and the given provenance-of-provenance statements 
>>>>> could not be expressed as suggested above.
>>>>>
>>>>> (This does not preclude a graph literal approach being used, but 
>>>>> the above use-case might need to be constructed slightly 
>>>>> differently.)
>>>>>
>>>>> #g
>>>>> -- 
>>>>>
>>>>
>>>>
>>
>>
>
Received on Friday, 23 September 2011 09:24:21 UTC