- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Fri, 30 Sep 2011 11:16:49 +0100
- To: RDF-WG <public-rdf-wg@w3.org>
The area of RDF datasets in the last SPARQL Working Group (DAWG) was controversial and took a long time. In writing about what happened, I am only reporting what I remember of the debates.

The compromise position reached is probably one where everyone actively involved had to give up something they considered important. There was already a significant amount of prior implementation, so people had a vested interest in the outcome. This is not a bad thing.

Earlier in this (RDF) WG, we began talking about the URI being "associated" with the graph, and leaving that "association" open.

DanC came up with this text:

[[
The FROM NAMED syntax suggests that the IRI identifies the corresponding graph, but the relationship between an IRI and a graph in an RDF dataset is indirect. The IRI identifies a resource, and the resource is represented by a graph (or, more precisely: by a document that serializes a graph). For further details see [WEBARCH].
]]

The Named Graph paper (Carroll, Bizer, Hayes, Stickler), to my reading, says that a name refers to the g-snap. But it does not then provide concrete examples of the naming: all the graphs are ":G1" etc., and it does not give prefix definitions.

So what makes a good name?

Let <http://www.server.net/resource> be an IR whose representation is a serialization of an RDF graph. It's a g-box.

That's not the g-snap, so name that <http://example.org/a_graph> ... but that is a location on the web and, being HTTP, you should be able to GET from it. The g-snap isn't "on the web"; maybe <http://example.org/ns#graph1> is better.

I find that using a non-resolvable URI is better here: I want to name the g-snap, not put the g-snap on the web:

<uuid:2dc1a4c6-eb46-11e0-869e-485b397edc67>
<tag:example.org,2011-10:graph1>

RDF datasets can be used for this detailed tracking of the state of part of the web by careful choice of the graph URIs.
In an RDF dataset the app can record the state of <http://www.server.net/resource> at different times, using a different URI each time the app does a GET. This works nicely with the default graph including the manifest of the named graphs - when they were read, from where, etc. While this is my description, it's my understanding of what 3Store was doing, except it used bNodes for the graph identification.

A common UC is wanting a copy of a remote graph, not worrying about the fact it might change (e.g. only the latest matters, or it is to be considered unchanging). Making the URI associated with the graph the place it comes from is easily comprehensible. Application writers understand this viewpoint.

Making the default graph the union of the named graphs in the dataset is a common setup - some systems only offer this mode of operation for RDF datasets.

Yes - this use of the g-box URI for the graph URI is a shortcut. But arguing for the proper machinery, when it provides no perceived value to the app writer and adds to the cost/complexity, is not an argument that is going to be won very often.

The NGs of the "Named graph" paper didn't quite make it to the general "named graphs" for many people.

A different point of view when DAWG was debating multi-graph was the idea that the 4th field was a "context" for a triple. All the triples were part of the same graph, but the triples were labelled to group them. The app could ask "where did this triple come from?" or "which triples came from X?".

While this is a different way of thinking, coming from a particular and important class of applications, it is covered by the default=union usage of RDF datasets if the context is a URI. In N-Quads that's not required - it can be a literal or bnode:

context ::= uriref | nodeID | literal

But the most important use case for SPARQL is querying one graph, no named graphs.
It's a progression from there - start with no GRAPH, then add in other graphs with URIs. It's not a single complex concept that every app writer has to understand before using SPARQL at all.

I don't think collections of graphs are the fundamental building block for the semantic web. Graphs (and triples and URIs) are the building blocks.

We can either rework the semweb stack to make collections of named graphs fundamental, or introduce an intentionally secondary concept.

Andy

Aside: even for graph literals, naming can be important. Graph literals aren't small, so having to send the serialization around just to talk about it has practical implications. And equality is akin to XMLLiterals.
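
A minimal sketch of the snapshot-tracking setup described in the message, assuming a toy in-memory dataset in plain Python rather than any real triple store. The Dataset class, the two manifest property URIs and the "status" triples are hypothetical names made up for illustration. Each GET result is stored as a g-snap under a fresh non-resolvable name (here the urn:uuid: form produced by Python's uuid module, rather than the uuid: form written above), the default graph holds the manifest, and the union and context lookups show the "default=union" and "where did this triple come from?" views.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical manifest vocabulary, invented for this sketch.
SOURCE = "http://example.org/ns#retrievedFrom"
WHEN = "http://example.org/ns#retrievedAt"


class Dataset:
    """Toy RDF dataset: a default graph plus named graphs of string triples."""

    def __init__(self):
        self.default = set()  # manifest triples about the named graphs
        self.named = {}       # graph URI -> frozenset of triples (a g-snap)

    def record_snapshot(self, source_uri, triples, when):
        """Store what was read from source_uri under a fresh, non-resolvable name."""
        name = uuid.uuid4().urn  # "urn:uuid:..."
        self.named[name] = frozenset(triples)
        self.default.add((name, SOURCE, source_uri))
        self.default.add((name, WHEN, when.isoformat()))
        return name

    def snapshots_of(self, source_uri):
        """Names of all recorded snapshots of the given g-box."""
        return {s for (s, p, o) in self.default if p == SOURCE and o == source_uri}

    def union(self):
        """The 'default = union of named graphs' view."""
        result = set()
        for graph in self.named.values():
            result |= graph
        return result

    def contexts_of(self, triple):
        """'Where did this triple come from?': names of graphs containing it."""
        return {name for name, graph in self.named.items() if triple in graph}


# Two GETs of the same g-box at different times, each yielding one triple.
ds = Dataset()
src = "http://www.server.net/resource"
t1 = (src, "http://example.org/ns#status", "draft")
t2 = (src, "http://example.org/ns#status", "final")
g1 = ds.record_snapshot(src, [t1], datetime(2011, 9, 29, tzinfo=timezone.utc))
g2 = ds.record_snapshot(src, [t2], datetime(2011, 9, 30, tzinfo=timezone.utc))
```

With this layout, ds.snapshots_of(src) answers "which states of the resource have been recorded?", while ds.contexts_of(t1) treats the graph name as the context of a triple. Using the g-box URI itself as the key of ds.named would give the simpler "copy of a remote graph" shortcut, at the cost of keeping only one state per source.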
Received on Friday, 30 September 2011 10:17:20 UTC