Re: RDF named graph use case and requirement from Sandro Hawke on 2011-09-23 (public-rdf-prov@w3.org from September 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 23 Sep 2011 11:13:47 -0400
To: Satya Sahoo <satya.sahoo@case.edu>
Cc: Graham Klyne <GK@ninebynine.org>, public-rdf-prov@w3.org, W3C provenance WG <public-prov-wg@w3.org>, Andy Seaborne <andy.seaborne@epimorphics.com>
Message-ID: <1316790827.9119.9.camel@waldron>
On Thu, 2011-09-22 at 10:40 -0400, Satya Sahoo wrote:
> Hi Sandro and Graham,
> 
> On Thu, Sep 22, 2011 at 2:09 AM, Graham Klyne <GK@ninebynine.org>
> wrote:
>         On 21/09/2011 23:27, Sandro Hawke wrote:
>                 [cc'ing public-prov-wg, but reply-to set to
>                 public-rdf-prov, where as I
>                 understand it this discussion should proceed.]
>                 
>                 On Wed, 2011-09-21 at 16:49 +0100, Graham Klyne wrote:
>                         ..........
>  
>                         might be considering.
>                         
>                         The resulting requirement that we articulated
>                         was that for the purposes of
>                         provenance, we must be able to treat two
>                         "named" graphs with identical graph
>                         content as two distinct entities.
>                         
>                         ...
>                         
>                         The use-case is this:
>                         
>                         Suppose we have some resource R.
>                         
>                         Observer A makes a provenance assertion about
>                         R on Monday 2011-09-19, which is
>                         expressed as an RDF graph Pra
>                         
>                         Observer B makes a provenance assertion about
>                         R on Friday 2011-09-23, expressed
>                         as RDF graph Prb
>                         
>                         To express provenance about the provenance
>                         assertions, we may wish to say:
>                         
>                         Pra statedBy A; onDate "2011-09-19" .
>                         
>                         Prb statedBy B; onDate "2011-09-23" .
>                         
>                         It may be that the provenance assertions Pra
>                         and Prb have identical content;
>                         i.e. they are RDF graphs containing identical
>                         triple sets.  For the purposes of
>                         provenance recording, it is important that
>                         even when they express the same
>                         graphs, Pra and Prb are distinct RDF nodes.
>                          If Pra and Prb are treated as a
>                         common RDF node, one might then infer:
>                         
>                         _:something statedBy A ; onDate "2011-09-23" .
>                         
>                         which in this scenario would be false.
>                 
>                 Isn't this just some modeling confusion?  Pra and Prb
>                 are each an event
>                 of the authoring of a statement (ie they are
>                 "statings"), not
>                 statements, based on how you use them, giving them
>                 dates and such.
>                 Those two statings can then be connected to the same
>                 statement (graph,
>                 G1).
>                 
>                 Pra statement G1; dc:creator A; dc:date "2011-09-19".
>                 Prb statement G1; dc:creator B; dc:date "2011-09-23".
>                 
>                 Then G1 can just be a literal, or whatever we use to
>                 allow conversations
>                 about g-snaps.
>         
>         
>         Yes, this is indeed a modelling choice.  As I note later, I
>         wasn't dismissing the graph literal approach, just trying to
>         expose the need, which might be addressed in different ways.
> 
> 
> I am a bit confused here - originally Graham used Pra and Prb as "an
> RDF graph", so they can be treated as resources and we can make
> statements about them - creator and date. 

I don't agree.   RDF graphs are pure mathematical objects, like the
number 7.   It doesn't make sense to give the number 7 a date.   When he
gave it a date, it became clear to me that he had to be talking about
something else.

> Also, the "event" will be the equivalent to a PE "statementMaking" and
> the output will be Pra and Prb. 

Sorry, I'm missing some context here, but "statementMaking" sounds right
for what Pra and Prb are -- they're the event of making a statement, not
the statement (graph) itself.
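
A minimal Python sketch of that reading, assuming a g-snap can be modelled
as a frozenset of triples and a stating as a small record pointing at it.
The class name Stating, the field names, and the example triple (including
the predicate "wasDerivedFrom" and the resource "S") are illustrative
assumptions, not anything defined by either working group.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stating:
    """The event of making a statement (hypothetical model)."""
    statement: frozenset  # the g-snap: an immutable set of triples
    creator: str
    date: str

# Two observers independently produce graphs with identical triples.
g1 = frozenset({("R", "wasDerivedFrom", "S")})
g1_again = frozenset({("R", "wasDerivedFrom", "S")})

pra = Stating(statement=g1, creator="A", date="2011-09-19")
prb = Stating(statement=g1_again, creator="B", date="2011-09-23")

# The graphs are the same mathematical object, so they compare equal ...
same_graph = pra.statement == prb.statement
# ... while the statings stay distinct, each with its own creator and date.
distinct_statings = pra != prb
```

Under this model nothing lets A's creator be combined with B's date,
because the date and creator hang off the stating, not off the graph.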

> I agree with Sandro that Graham's original use of Pra and Prb maps
> them to g-snaps from RDF WG terminology. Looking at the definitions of
> g-box and g-snap, there almost seems to be a class-instance
> correspondence between them?

Hmmm, I don't see anything class/instance-like.    In programming terms,
a g-snap is like the number seven and a g-box is like a memory location
which might, at some point in time, hold a bit pattern representing the
number seven.   Or a g-snap is like the words "Back in 5 minutes" and a
g-box is like a door where you sometimes hang a sign with those words.

We need both these concepts because URLs name g-boxes but the
information we actually see and work with is packaged into g-snaps (or
serializations of g-snaps). 
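
The same distinction as a rough Python sketch, assuming a g-snap is an
immutable set of triples and a g-box is a mutable holder named by a URL.
The class GBox, its methods, the URL, and the example triple are all
hypothetical and only meant to illustrate the analogy.

```python
class GBox:
    """A mutable container, named by a URL, holding one g-snap at a time."""

    def __init__(self, url):
        self.url = url
        self._contents = frozenset()  # starts out holding the empty graph

    def update(self, snap):
        # Replace the current contents; the old g-snap itself is unchanged.
        self._contents = frozenset(snap)

    def snapshot(self):
        # What a client sees when it dereferences the URL right now.
        return self._contents


sign = frozenset({("ex:door", "ex:says", "Back in 5 minutes")})

box = GBox("http://example.org/graph")  # hypothetical URL
box.update(sign)
before = box.snapshot()

box.update(frozenset())  # the sign is taken down
after = box.snapshot()

# The box is the same box, but it held different g-snaps at different times.
held_sign_earlier = before == sign
contents_changed = before != after
```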

    -- Sandro

>         
>                 
>                         .....
>                         
>                         A particular consequence of this is that an
>                         RDF "named graph" specification
>                         based on graph literals (where RDF literals
>                         are self-denoting), somewhat like
>                         formulae in Notation 3, would have to be used
>                         with care.  That is, if Pra and
>                         Prb are graph literals, then Pra = Prb, and
>                         the given provenance-of-provenance
>                         statements could not be expressed as suggested
>                         above.
>                 
>                 It seems to me Pra and Prb are not g-snaps (RDF
>                 graphs), so there's no
>                 problem here.
>         
>         
>         Again, an interaction of modelling and design choices.  There
>         *could* be a problem, depending on what choices are made.
> 
> 
> Again, Pra and Prb seem to fit the definition of g-snap?
> 
> 
> Best,
> Satya
>  
>         
>         #g
>         --
>         
>         
>         
>                         (This does not preclude a graph literal
>                         approach being used, but the above
>                         use-case might need to be constructed slightly
>                         differently.)
>                         
>                         #g
>                         --
>                         
>                         
>                 
>                 
>         
>         
> 
>
Received on Friday, 23 September 2011 15:14:03 UTC