why I don't like named graph IRIs in the DATASET proposal from Pierre-Antoine Champin on 2011-09-29 (public-rdf-wg@w3.org from September 2011)

From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Date: Thu, 29 Sep 2011 17:31:57 +0200
To: "public-rdf-wg@w3.org" <public-rdf-wg@w3.org>
Message-ID: <4E848F6D.9030608@liris.cnrs.fr>

Hi all,

as Richard asked me during the telecon of 09-29, I'll try to pinpoint
what bothers me in the SPARQL DATASET proposal.
(this is part 2: part 1 was about the default graph, see other mail)

One such point is the relation between named graphs and their URI (or
IRI, for that matter).

SPARQL states that:
> An RDF Dataset comprises one graph, the default graph, which does
> not have a name, and zero or more named graphs, where each named
> graph is identified by an IRI.

On the other hand, RDF Concepts states that:
> An IRI (...) used as a node identifies what that node represents.

You can rephrase those sentences, respectively as:
* an IRI identifies a named graph
* an IRI identifies a resource

My problem here is that the word "identifies" has a completely different
meaning in those sentences. Indeed, the following is, I think, a common
pattern in SPARQL (using TriG to represent the named graph):

  <http://example.org/alice>
    {
       <http://example.org/alice>
         a foaf:Person ;
         foaf:name "Alice" ;
         foaf:mbox <mailto:alice@work.example.org> .
    }

Obviously, <http://example.org/alice> does not "identify" a graph and a
person in the same way.
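
To make the double role concrete, here is a minimal sketch in Python. It
is only a toy model and not part of any spec: the dataset is assumed to
be a plain dict from graph-label strings to sets of triples, and the
helper name roles_of is hypothetical.

```python
# Toy model (assumption): a dataset maps graph-label IRIs to sets of triples.
ALICE = "http://example.org/alice"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

dataset = {
    ALICE: {
        (ALICE, RDF_TYPE, FOAF + "Person"),
        (ALICE, FOAF + "name", '"Alice"'),
        (ALICE, FOAF + "mbox", "mailto:alice@work.example.org"),
    }
}


def roles_of(iri, dataset):
    """Return the roles the IRI plays: 'graph label' and/or 'node'."""
    roles = set()
    if iri in dataset:
        roles.add("graph label")
    for triples in dataset.values():
        for subject, _predicate, obj in triples:
            # Only subjects and objects are nodes of an RDF graph.
            if iri in (subject, obj):
                roles.add("node")
    return roles
```

Calling roles_of(ALICE, dataset) reports both roles for the same IRI,
which is exactly the situation where "identifies" is being used in two
different senses.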

I see two ways out of this problem:

1) either we force the IRI of a named graph to actually *name* that
graph (in the model theoretic way), but we then depart from SPARQL
DATASETs and widespread use;

2) or we rephrase the DATASET definition and make it very clear that the
named graph IRI is a mere label, and not a *name* in the model
theoretic sense.

What still bothers me with option 2) is that, in SPARQL or TriG,
those graph labels are syntactically indistinguishable from an IRI *node*.

To illustrate why it bothers me, let me just propose the following two
statements:

  _:a_graph ns:label <http://example.org/alice> .

vs.

  _:a_graph ns:label "http://example.org/alice" .
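
The difference between these two statements can be sketched in Python as
follows. The classes IRI and Literal are hypothetical stand-ins for RDF
terms, not any library's API: the first object is a node that denotes
whatever resource the IRI denotes, while the second is a plain string
that denotes only itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    """Hypothetical RDF IRI term: denotes a resource."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical plain literal term: denotes its own character string."""
    lexical: str


# Object of the first statement: an IRI node.
label_as_node = IRI("http://example.org/alice")
# Object of the second statement: a string literal.
label_as_string = Literal("http://example.org/alice")

# Same characters, but two distinct RDF terms.
same_characters = label_as_node.value == label_as_string.lexical
same_term = label_as_node == label_as_string
```

Here same_characters is true while same_term is false: the first form
makes the label a resource, the second makes it a mere string.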

So I would argue that, at the end of the day, neither of the following
sentences is accurate:

  a named graph is identified by an IRI
  a named graph is labelled by an IRI

but in fact:

  a named graph is labelled by a resource

I'm not saying this is bad, I'm just saying this is where we are heading
with option 2), and we should carefully weigh the consequences.

(imagine for example an owl:sameAs statement between two graph IRIs in a
SPARQL engine supporting OWL inference; what would that mean?)
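
The owl:sameAs question can be sketched as two possible lookup
behaviours. This is purely illustrative: neither reading is prescribed
by SPARQL or OWL as far as I know, and the graph IRIs, the ex: terms and
both function names are made up for the example.

```python
# Hypothetical dataset with two labelled graphs.
G1 = "http://example.org/g1"
G2 = "http://example.org/g2"
dataset = {
    G1: {("ex:a", "ex:p", "ex:b")},
    G2: {("ex:c", "ex:p", "ex:d")},
}
# One asserted statement: <G1> owl:sameAs <G2>.
same_as = {frozenset((G1, G2))}


def lookup_as_label(iri):
    """Reading A: the IRI is a mere label, so owl:sameAs is ignored."""
    return set(dataset.get(iri, set()))


def lookup_as_resource(iri):
    """Reading B: the label denotes a resource, so graphs labelled by
    co-denoting IRIs are merged (one possible interpretation only)."""
    result = set(dataset.get(iri, set()))
    for pair in same_as:
        if iri in pair:
            for other in pair:
                result |= dataset.get(other, set())
    return result
```

Under reading A, asking for G1 returns one triple; under reading B it
returns two. Which of these (if either) is intended is precisely what
would need to be decided.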

    pa

Received on Thursday, 29 September 2011 15:32:42 UTC