how we refer to both g-boxes and g-snaps from Sandro Hawke on 2012-05-29 (public-rdf-wg@w3.org from May 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Tue, 29 May 2012 19:27:42 -0400
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Yves Raimond <Yves.Raimond@bbc.co.uk>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <1338334062.2332.115.camel@waldron>
On Wed, 2012-05-23 at 13:41 -0500, Pat Hayes wrote:
> Richard, I am confused. 
> 
> Sometimes I get the sense that you want the graph names to refer not to graphs as such, but rather to 'stateful resources' (or whatever) which have a robust identity and emit graphs when poked, a REST-inspired kind of a thing. (Cf. your responses on other threads.) At other times, however (as here) you seem to want the graph names to refer to an actual set of triples, a true Platonic RDF graph.
> 
> It really does matter which we choose, and I don't see how we can choose both (or not without a lot of new machinery to make the distinction, that we have not even discussed yet) and I don't think it is viable to just be muddled or ambiguous about it, as that is the muddle we are in already and are trying to get straight. 
> 
> For example, if the graph names refer to stateful resources, then there are two rather different ways to identify a subgraph or a larger graph. One is to speak of a subset (defined somehow) of the graph that is the current state of the stateful resource, the other is to have a relation between two resources such that one returns a subset of what the other returns, at any time. These behave differently and would need to be implemented differently. 
> 
> I have no axe to grind here. I would be quite happy if we were to declare that graph names in datasets always refer to stateful resources. I would also be happy if we decide they always refer to graphs. But I am not happy about it being ambiguous or undecided. I do feel that it is very important that we choose one story and stick to it. Which one do you want to pitch for?

I think Richard replied to this well, but since you haven't replied to
that (and shown you understand), let me answer in my own way.  I believe
I'm agreeing with Richard on the substance, but perhaps thinking about
it quite differently.

The answer is: we're being a bit tricky, so that we can have our graphs
and eat them, too, so to speak.

We'd like to be able to refer to g-snaps AND we'd also like to be able
to refer to g-boxes.  (I'm staying out of the source/resource/space/etc.
discussion for now.  I think I can live with any of the names that have
been proposed.)  We do this by defining the semantics of datasets such
that the graph names refer to g-boxes, and letting the way they are used
indicate whether/how the associated g-snaps are actually to be used.

For example, in my implementation of use case 2 (simple web provenance),
the aggregated phone book looks like this:

  :corp :hasDivision :div1, :div2, ...
  :div1 :hasFeed <div1url>.
  :div2 :hasFeed <div2url>.
  ...
  <div1url> { ... triples fetched  ... }
  <div2url> { ... triples fetched  ... }

Here, "div1url" is the working HTTP URL which HQ uses periodically to
get an updated copy of Division 1's employee directory, building
this pseudo-trig file.
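
To make the aggregation step concrete, here is a rough Python sketch of
how HQ might assemble that pseudo-trig text.  It is only an illustration
under assumptions: the function name build_phone_book, the feeds mapping
and the fetch callable are all hypothetical, and fetch stands in for
whatever HTTP client actually retrieves each division's triples.

```python
from typing import Callable, Dict


def build_phone_book(feeds: Dict[str, str], fetch: Callable[[str], str]) -> str:
    """Assemble a pseudo-trig document from per-division feeds.

    feeds maps a division's local name (e.g. "div1") to its feed URL.
    fetch is a hypothetical callable returning the triples found at a
    URL as text; it is passed in so no particular HTTP library is assumed.
    """
    lines = []

    # Default graph: which divisions exist and where each one's feed lives.
    divisions = ", ".join(f":{name}" for name in feeds)
    lines.append(f":corp :hasDivision {divisions} .")
    for name, url in feeds.items():
        lines.append(f":{name} :hasFeed <{url}> .")

    # Named graphs: the feed URL is the graph name, and the triples
    # fetched on this run are the graph content.
    for url in feeds.values():
        body = fetch(url).strip()
        lines.append(f"<{url}> {{ {body} }}")

    return "\n".join(lines)
```

Each run takes a fresh copy of what the feed URL returns and files it
under that same URL as the graph name.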

The definition of :hasFeed is where things all come together.  I use it
with a meaning like this:

                ?subj :hasFeed ?obj 
                
        means
        
                ?subj is a social entity, such as a person or department

                *if* a successful dereference of any IRI which
                denotes ?obj returns an RDF Graph serialization, then
                the serialized graph is considered by ?subj to be valid
                data.  
                
                *if* any IRI which denotes ?obj occurs as the name in a
                (name, graph) pair in a valid dataset, then the graph
                part of that pair is considered by ?subj to be valid
                data.
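
As a rough illustration of the second clause only, here is a minimal
Python sketch.  It makes several simplifying assumptions that are not
part of the definition above: triples are plain tuples of strings, a
dataset is a default graph plus a dict of named graphs, and "any IRI
which denotes ?obj" is reduced to the one IRI written in the triple
(no aliasing).  The name trusted_graphs is hypothetical, and the first
clause (dereferencing) is left out entirely.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def trusted_graphs(
    default_graph: Set[Triple],
    named_graphs: Dict[str, Set[Triple]],
    has_feed: str = ":hasFeed",
) -> Dict[str, Set[Triple]]:
    """Map each ?subj to the triples it considers valid data.

    For every (?subj, :hasFeed, ?obj) triple in the default graph, if
    ?obj also appears as the name of a (name, graph) pair in the
    dataset, that graph's triples are added to ?subj's trusted set.
    """
    trusted: Dict[str, Set[Triple]] = {}
    for subj, pred, obj in default_graph:
        if pred == has_feed and obj in named_graphs:
            trusted.setdefault(subj, set()).update(named_graphs[obj])
    return trusted
```

A named graph whose name is never the object of :hasFeed contributes
nothing here, which is the sense in which the predicate, rather than
the dataset semantics, decides how the graph is used.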
                
I think this covers both the current sometimes-odd SPARQL deployments
and the linked data/web-centric deployments.  It's the kind of
minimally-restrictive solution that I think Richard has been arguing
for.  It needs *very* little in the semantics of datasets.

In a sense it doesn't need anything, but I'd like to factor out 90% of
that definition of :hasFeed, since every predicate that I use with
dataset graph names has that same text.

I've worked my way through the use cases I put in rdf-spaces like this,
and it seems to be working fine.    I have come up with a few more use
cases in doing so, though.   And I definitely want some syntactic sugar
that trig doesn't offer.    (Should we still call it trig if we get rid
of the braces around the default graph to make it an extension of
Turtle, or should we give it another name?)

   -- Sandro
Received on Tuesday, 29 May 2012 23:27:49 UTC