modeling SPARQL update - a GRAPHS challenge (test case?)

[written right after the meeting; mail blocked because the attached
photo was too large.  I've moved it to the website instead.]

After the F2F2 ended formally, those of us at MIT chatted for a few more
hours.   Thinking about Ian's unwillingness to give up the RDF
reification vocabulary made me wonder if we shouldn't try to redeem it.
It doesn't have any semantics in the current specs, so that may give us
freedom to give it some.

Here's an example of what it might look like, in the form of a specific
tricky use case.   We drew this on the whiteboard [1].  People seemed
to think it would be interesting to see how others might solve this use
case, perhaps not using reification.

Scenario:

   Assume the prefix eg is bound to "http://example.com/".  

   We have a SPARQL end point at eg:server.  At 5pm today it has the
   following dataset:

       eg:page1 { eg:subj eg:pred 1. }

   At 5:01pm, something happens (perhaps an UPDATE is done) and it now
   has a different dataset:

       eg:page1 { eg:subj eg:pred 2. }

The challenge is to express this information in a formal language.

The solution I sketched on the whiteboard addresses the challenge using
Turtle and a proposed reification vocabulary.  The reification vocab
namespace prefix is r: (and might merge into rdf:).  I'm also using h:
for this particular history, and you could replace "h:" with "_:".
Also, p: is for provenance/state change information -- perhaps more the
domain of the Provenance WG.
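
For anyone who wants to feed the snippets below to a Turtle parser,
prefix declarations along these lines would be needed.  This is only a
sketch: rdf: and eg: are as stated or standard, xs: is assumed to mean
the XML Schema namespace, and the IRIs for r:, h:, and p: are
placeholders made up for illustration, since no namespaces have been
assigned to them.

   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix xs:  <http://www.w3.org/2001/XMLSchema#> .
   @prefix eg:  <http://example.com/> .
   # placeholder IRIs (assumptions), not assigned namespaces:
   @prefix r:   <http://example.org/reification#> .
   @prefix h:   <http://example.org/history#> .
   @prefix p:   <http://example.org/provenance#> .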

   ### PART 1 -- The Graph Snapshots

   # the empty graph
   h:g0 a r:GraphSnapshot;
          r:statementList ( ).

   # our graph at t=5:00
   h:g1 a r:GraphSnapshot;
          r:statementList
              ( [ a             rdf:Statement;
                  rdf:subject   eg:subj;
                  rdf:predicate eg:pred;
                  rdf:object    1 ]
              ).

   # our graph at t=5:01
   h:g2 a r:GraphSnapshot;
          r:statementList
              ( [ a             rdf:Statement;
                  rdf:subject   eg:subj;
                  rdf:predicate eg:pred;
                  rdf:object    2 ]
              ).

   ### PART 2 - The Dataset Snapshots

   h:d1 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphList ( h:g1 ).

   [ a r:NameBinding;
     r:inDataset h:d1;
     r:graph h:g1;
     r:name "http://example.com/page1"
   ].

   h:d2 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphList ( h:g2 ).

   [ a r:NameBinding;
     r:inDataset h:d2;
     r:graph h:g2;
     r:name "http://example.com/page1"
   ].

   ### PART 3 - Connecting the SPARQL Server with Datasets
   # (we didn't talk about this specifically, during the after-meeting,
   # so this is just my strawman).

   eg:server p:datasetActivation [
         p:dataset h:d1;
         # we weren't told the start time, so we'll use the earliest
         # known time -- that says nothing about earlier times
         p:begins  "2011-10-13T17:00:00"^^xs:dateTime;
         # p:ends is up-to-but-not-including the given instant
         p:ends    "2011-10-13T17:01:00"^^xs:dateTime;
   ].
   eg:server p:datasetActivation [
         p:dataset h:d2;
         p:begins  "2011-10-13T17:01:00"^^xs:dateTime;
         # we'll leave out the ending time, as unknown.
   ].
                 
As I write this, using Turtle lists instead of circles and arrows, I
think a much simpler modeling for PART 2 would be:

   h:d1 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphBindings ( ("http://example.com/page1" h:g1) ).

   h:d2 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphBindings ( ("http://example.com/page1" h:g2) ).

or maybe:

   h:d1 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphBindings ( [ a r:NameBinding;
                            r:name "http://example.com/page1";
                            r:snap h:g1 ] ).

   h:d2 a r:Dataset;   
        r:defaultGraph h:g0;
        r:graphBindings ( [ a r:NameBinding;
                            r:name "http://example.com/page1";
                            r:snap h:g2 ] ).

Note that I think it's important to use a list instead of a repeated
property, so we know this is the ONLY graph in the dataset.
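
To make that difference concrete, here is a rough sketch of the two
shapes side by side.  The singular r:graphBinding property and the
names h:dA, h:dB, and _:cell are hypothetical, invented only for this
contrast, and the r: and h: namespace IRIs are placeholders.  The
argument also assumes lists are well-formed (one rdf:first and one
rdf:rest per cell).

   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix r:   <http://example.org/reification#> .  # placeholder
   @prefix h:   <http://example.org/history#> .      # placeholder

   # Repeated property (hypothetical r:graphBinding): some other
   # document could add a second r:graphBinding triple for h:dA, so a
   # reader can't conclude that page1 is the only named graph.
   h:dA r:graphBinding [ r:name "http://example.com/page1";
                         r:snap h:g1 ] .

   # List form: ( x ) is shorthand for a cell whose rdf:first is x and
   # whose rdf:rest is rdf:nil, so the end of the list is stated
   # explicitly in the data.
   h:dB r:graphBindings _:cell .
   _:cell rdf:first [ r:name "http://example.com/page1";
                      r:snap h:g1 ] ;
          rdf:rest  rdf:nil .

The second form is exactly what the ( ... ) syntax in the PART 2
alternatives expands to; it is written out here only to show where the
"and nothing else" comes from.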

Graph literals, of some sort, would make PART 1 nicer.  TriG with the
iri=g-snap relation would, too.   Maybe TriG with static g-box could,
too.  Someone want to try to work out the details of that?

   -- Sandro

[1] http://www.w3.org/2011/rdf-wg/IMG_20111013_161407.jpg

Received on Friday, 14 October 2011 12:46:29 UTC