Re: why I don't like named graph IRIs in the DATASET proposal

Richard,

On 09/30/2011 12:02 PM, Richard Cyganiak wrote:
> On 29 Sep 2011, at 17:31, Pierre-Antoine Champin wrote:
>> SPARQL states that:
>>> An RDF Dataset comprises one graph, the default graph, which does
>>> not have a name, and zero or more named graphs, where each named
>>> graph is identified by an IRI.
> 
> Well that's SPARQL. We are talking about RDF Concepts. It says [1]:
> 
> [[
> Each named graph is a pair consisting of an IRI (the graph name), and an RDF graph. Graph names are unique within an RDF dataset.
> ]]

Sorry, I didn't notice that you had rephrased it.

> It avoids words like “identify” and “denote”.

And with very good reasons.
However, IRIs in RDF have traditionally been used to denote resources,
so even if you refrain from using those words, it is very easy for the
reader to see them anyway.

So I suggest that omitting those words is not sufficient. The definition
should be followed by a warning, e.g.:

  Note that graph names in a dataset are not used to denote the graph
  in the way an IRI node denotes a resource.

>> So I would argue that, in the end of the day, neither of the following
>> sentence is accurate:
>>
>>  a named graph is identified by an IRI
>>  a named graph is labeled by an IRI
>>
>> but in fact:
>>
>>  a named graph is labelled by a resource
> 
> That's not accurate at all.

Well, take example 1 from
http://www.w3.org/TR/rdf-sparql-query/#exampleDatasets
which is supposed "to have information in the default graph that
includes provenance information about the named graphs"

The default graph contains:

  <http://example.org/bob>    dc:publisher  "Bob" .

which means that the *resource denoted* by <http://example.org/bob> is
related to the string "Bob" by the relation denoted by the predicate
dc:publisher. It is *not* the IRI "http://example.org/bob" which is
related to "Bob", but a *resource*. If I knew another IRI for that
resource, I could rewrite that triple as

  <http://example.other.com/bob>  dc:publisher  "Bob" .

without changing the meaning of that triple in any way.


So the only way for this triple to provide information about a graph in
the dataset is that the graph be in fact associated with the *resource*
and not the IRI.
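
To make the distinction concrete, here is a rough Python sketch of the two
readings. Everything in it is an assumption made for illustration: the
`denotes` table stands in for an interpretation in which both IRIs denote
the same resource, the content of the named graph is a placeholder, and the
function names are hypothetical, not taken from SPARQL or the ED.

```python
# Illustrative sketch only; none of these names come from a spec or library.

# Assumed interpretation: both IRIs denote the same resource.
denotes = {
    "http://example.org/bob": "resource-1",
    "http://example.other.com/bob": "resource-1",
}

# A dataset in the ED's sense: each graph name is a plain IRI paired with a graph.
# The graph content is a placeholder triple.
named_graphs = {
    "http://example.org/bob": {("_:a", "foaf:name", "Bob")},
}

# The default-graph triple from the example, and its rewritten form.
default_graph = {("http://example.org/bob", "dc:publisher", "Bob")}
rewritten = {("http://example.other.com/bob", "dc:publisher", "Bob")}


def meaning(graph):
    """Replace each subject IRI by the resource it denotes (if known)."""
    return {(denotes.get(s, s), p, o) for (s, p, o) in graph}


def graph_by_name(iri):
    """Reading 1: the graph is paired with the IRI itself."""
    return named_graphs.get(iri)


def graph_by_resource(iri):
    """Reading 2: the graph is associated with the resource the IRI denotes."""
    target = denotes.get(iri, iri)
    for name, graph in named_graphs.items():
        if denotes.get(name, name) == target:
            return graph
    return None


# Both triples say the same thing about the same resource.
same_meaning = meaning(default_graph) == meaning(rewritten)

# Under reading 1 the rewritten triple no longer points at any graph;
# under reading 2 it still does.
found_by_name = graph_by_name("http://example.other.com/bob")
found_by_resource = graph_by_resource("http://example.other.com/bob")
```

Under these assumptions, `same_meaning` is true while `found_by_name` is
None, so the provenance triple can only be "about" the graph under the
second reading, where the graph is tied to the resource rather than the IRI.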


Of course, all this derives from examples in the SPARQL document, not
the Dataset definition in your ED. However, your argument in favor of
the dataset proposal was to reuse something known to work, rather than
reinventing it. My point is:

* the SPARQL definition has some theoretical caveats
* rephrasing the definition as you did may in principle solve this, but
does not remove the risk of confusion, because
  * IRIs are used differently for resources and for graphs
  * and SPARQL fuels the confusion by using the same syntax (<>
    brackets) for both IRI nodes and graph names

 pa


> A named graph is an <IRI,graph> pair. The IRI is called the graph name.
> 
> As written in the ED, the relationship between the IRI and the graph is neither “identifies” nor “labels”; it is “is graph name of”. No relationship between the resource denoted by the IRI and the graph is implied by the wording in the ED.
> 
>> (imagine for example a owl:sameAs statement between two graphs IRI in a
>> SPARQL engine supporting OWL inference; what would that mean?)
> 
> owl:sameAs means that two terms denote the same resource. As written in the ED, use of those terms as graph names is entirely orthogonal to that.
> 
> I think that's a good thing. Named graphs are key to trust and provenance. Trust and provenance must happen at a lower level in the stack, before reasoning and inference kick in. W3C's version of the layer cake, where trust sits above reasoning, cannot work. The moment you reason with OWL over untrusted data, you're fucked.
> 
> Best,
> Richard
> 
> [1] http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-multigraph

Received on Friday, 30 September 2011 12:49:58 UTC