TriG document examples

"""
Create a short example for a TriG document and a clear notion of what is 
entailed by it
"""

Possibly failing on the "short", because we haven't had much discussion 
in the last few weeks ...

 Andy


== Case 1: This is not a TriG document
Case 1 is here because I want to talk about graph literals further on.

:doc1 :asserts { :y1 :name "foo" } .
:doc2 :asserts { :y1 :name "foo" } .

Pro:
Purest form of graphs-as-values

Con:
Referring to the same value from two places requires the literal to be 
written twice.  For integers, that's no big deal; for a graph literal 
it's a bit of a nuisance, even for a graph of say 10 triples.

Danger of requiring widespread re-engineering (parsers, storages, other 
specs).

Does not mention the web (it does not talk about g-boxes or accessing 
them).  A consequence of "no web" is that additional layers of modelling 
are needed for talking about the content of a web page etc, yet this is 
the most significant use of multi-graph datasets.
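
A rough sketch of the graphs-as-values reading, in Python.  The frozenset 
model and the names graph, doc1_asserts and doc2_asserts are assumptions 
made for illustration only; nothing here is proposed syntax or API.

```python
# Hypothetical model: a graph value is an immutable set of (s, p, o) triples.
def graph(*triples):
    return frozenset(triples)

# Case 1 writes the literal out in full at each place it is used.
doc1_asserts = graph((":y1", ":name", "foo"))
doc2_asserts = graph((":y1", ":name", "foo"))

# Two separately written literals are the same value ...
same_value = doc1_asserts == doc2_asserts
# ... but nothing here names the value or says where it was found.
```

The point of the sketch is only that equality is by content, as for 
integers; the cost is that the content has to be repeated.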


== Case 2: Named values

This case uses names for graph literals so that the same literal can be 
used in two places (by reference).
-----------------------------------
{ # Default graph
   :doc1 :asserts :graphLiteral1 .
   :doc2 :asserts :graphLiteral1 .
}

:graphLiteral1 { :y1 :name "foo" } .
-----------------------------------

entails

:graphLiteral1 :denotesLiteral [] .
:graphLiteral1 a rdf:Literal .
:graphLiteral1 a rdf:LiteralGraph .
-----------------------------------

The relationship between :graphLiteral1 and { :y1 :name "foo" } is much 
like owl:sameAs except between an individual and a value.  It need not 
be symmetric in IRI/value.

Pros:
We can now refer to large graph literals and not need to repeat them.

Cons:
Still requires additional machinery to be used for common cases. Such 
machinery is additional complexity for the data publisher and data user.
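
As a sketch of "by reference", assuming a toy model where named graphs are 
a Python dict from name to a frozenset of triples (named_graphs, 
default_graph and asserted_value are hypothetical names, not part of any 
spec):

```python
# Hypothetical model: each name maps to one immutable graph value.
named_graphs = {
    ":graphLiteral1": frozenset({(":y1", ":name", "foo")}),
}

# The default graph refers to the value by name, not by content.
default_graph = {
    (":doc1", ":asserts", ":graphLiteral1"),
    (":doc2", ":asserts", ":graphLiteral1"),
}

def asserted_value(doc):
    """Return the graph value a document asserts, resolved via its name."""
    for s, p, o in default_graph:
        if s == doc and p == ":asserts":
            return named_graphs[o]
    raise KeyError(doc)
```

Both documents resolve to the one stored value, so the literal is written 
once; the extra lookup step is the "additional machinery" noted above.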


== Case 3a: Explicit event modelling
See also "Time-varying g-boxes : a dataset pattern"
http://lists.w3.org/Archives/Public/public-rdf-wg/2011Oct/0148.html
and also the idea of "event modelling"

One way to "add the web" is to identify the fundamental concepts of the 
web.  We already have "naming" in case 2; the other is dereference.

REST does not give a name to the act of dereference but we can do that.  
The relationship of name and graph value is one of "snapshot" or "seen".

-----------------------------------
{ # Default graph
   # Adds details to the observation.
   # Does not need to be in the default graph, or even in the dataset
   # but using the default graph as the manifest is natural.

   :doc1 :asserts :graph1 .
   :doc2 :asserts :graph1 .
   :graph1 rdfs:value uuid:187 .
   :doc1 :accessedAt "2011-12-11T20:30:07+00:00"^^xsd:dateTime .
   :doc2 :accessedAt "2011-12-12T13:45:57+00:00"^^xsd:dateTime .
}

uuid:187 { :y1 :name "foo" } .
-----------------------------------

so it entails

# If RDF had syntax for graph literals.
uuid:187 rdf:graphSnapshot  { :y1 :name "foo" } .

uuid:187 rdf:graphSnapshot
      """{ :y1 :name "foo" }"""^^rdf:graphSyntaxTurtle .

uuid:187  a  rdf:observation .

schema:

rdf:graphSnapshot rdfs:domain rdfs:Resource .
rdf:graphSnapshot rdfs:range rdf:GraphLiteral .

-----------------------------------

Pros:
We can now talk about what's in a document (at a point in time).

Cons:
While we have now captured all the information, this does not directly 
support the common use case where only one value per g-box is relevant, 
so the machinery is only a partial solution.  I think that it would 
require the world to change how it is using named graphs currently.

In other words, the "value in the g-box" case is so common it deserves 
making easy to use.

== Case 3b: Named versions
The "Rolling Snapshots" Pattern and Vocabulary
http://lists.w3.org/Archives/Public/public-rdf-wg/2011Oct/0152.html

Case 3a puts the modelling onto the dataset.
Case 3b is supported by the way the publisher publishes the data.
rdf:seenToHaveValue rdfs:domain web:location .

Every version of the graph in a g-box is given a distinct URI by the 
publisher.

No TriG document.

But you might have a graph that describes the observations:

{
   :doc1 rdf:versionedValue :doc1-v1 .
   :doc1 rdf:versionedValue :doc1-v2 .

   :doc1-v1 rdf:validAt "2011-12-11T20:30:07+00:00"^^xsd:dateTime .
   :doc1-v2 rdf:validAt "2011-12-12T13:45:57+00:00"^^xsd:dateTime .
}
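
A sketch of how a consumer might use such a description graph to pick a 
version.  It assumes, and this is my reading rather than anything the 
pattern states, that a version stays current from its rdf:validAt time 
until the next version appears.  The names versions and version_at are 
hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical data mirroring the description graph above:
# (version name, time at which that version was valid).
versions = {
    ":doc1": [
        (":doc1-v1", datetime(2011, 12, 11, 20, 30, 7, tzinfo=timezone.utc)),
        (":doc1-v2", datetime(2011, 12, 12, 13, 45, 57, tzinfo=timezone.utc)),
    ],
}

def version_at(doc, when):
    """Return the latest version recorded at or before `when`, or None."""
    seen = [(t, name) for name, t in versions[doc] if t <= when]
    return max(seen)[1] if seen else None
```

Asking for a time between the two observations gives :doc1-v1; asking for 
a time before the first gives nothing, since no version is known.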


== Case 4: "value in the g-box"
cf. Jeremy's web cache


A common case is where the g-box is only accessed once - a collection of 
graph values is assembled for the purposes of a query.  Each graph 
value has an associated URI (which does not denote it).

As a query is a one time action, the idea that there is one value for 
each location ("current value") is natural.

(This is the restricted form of RDF dataset that is captured by SPARQL 
FROM NAMED, but it is not the only style of RDF dataset that the 
machinery of SPARQL allows.  FROM NAMED makes the restriction that the 
location accessed and the graph name are the same.)

-----------------------------------

{ # Possible default graph
   # Include info of when :doc1 and :doc2 were accessed
   # but that puts back the event of dereferencing.
}

:doc1 { :y1 :name "foo" } .
:doc2 { :y1 :name "foo" } .

-----------------------------------
entails:

:doc1 rdf:seenToHaveValue { :y1 :name "foo" }  .
:doc1 a rdf:g-box .

and schema:

rdf:seenToHaveValue rdfs:domain web:location .
rdf:seenToHaveValue rdfs:range rdf:GraphLiteral .

-----------------------------------
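
A sketch of the Case 4 reading, assuming a toy model in which the dataset 
is a dict from location to the single graph value seen there.  The 
function name entailed and the string forms of the property names are 
illustrative assumptions only.

```python
# Hypothetical model of the dataset above: location -> the one value seen there.
dataset = {
    ":doc1": frozenset({(":y1", ":name", "foo")}),
    ":doc2": frozenset({(":y1", ":name", "foo")}),
}

def entailed(dataset):
    """List the Case 4 entailments, with the graph value itself as object."""
    triples = []
    for location in sorted(dataset):
        triples.append((location, "rdf:seenToHaveValue", dataset[location]))
        triples.append((location, "rdf:type", "rdf:g-box"))
    return triples
```

Because a dict holds one value per key, "one value per location" is built 
into the model; recording when each location was accessed would need 
something extra, which is the event modelling of Case 3a again.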

== Case 5: Use of foaf:primaryTopic

Some systems use the primary topic of a graph as the graph name.

I'm not sure if any of these are contenders for our named graph - it 
would seem to require a lot of new machinery to get the punning right.

Received on Monday, 12 December 2011 14:06:41 UTC