Re: why I don't like named graph IRIs in the DATASET proposal

From: Pat Hayes <phayes@ihmc.us>
Date: Sun, 2 Oct 2011 20:40:27 -0500
Cc: "public-rdf-wg@w3.org Group WG" <public-rdf-wg@w3.org>, Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Message-Id: <A749CF5D-1393-4107-A24E-BF4B51F40BB8@ihmc.us>
To: Richard Cyganiak <richard@cyganiak.de>

On Sep 30, 2011, at 7:49 AM, Pierre-Antoine Champin wrote:

> Richard,
> 
> On 09/30/2011 12:02 PM, Richard Cyganiak wrote:
>> On 29 Sep 2011, at 17:31, Pierre-Antoine Champin wrote:
>>> SPARQL states that:
>>>> An RDF Dataset comprises one graph, the default graph, which does
>>>> not have a name, and zero or more named graphs, where each named
>>>> graph is identified by an IRI.
>> 
>> Well that's SPARQL. We are talking about RDF Concepts. It says [1]:
>> 
>> [[
>> Each named graph is a pair consisting of an IRI (the graph name), and an RDF graph. Graph names are unique within an RDF dataset.
>> ]]
> 
> sorry, I didn't notice that you rephrased it.

I (strongly) suggest that we avoid the use of the word "name" in this at all, as naming/reference/denotation are widely treated as synonyms, and I believe this usage is intended in the SPARQL documents also. How about "graph label"? Replace name//label and named//labelled in the above, and then clarify as follows:

Note: the use of an IRI as a graph label in an RDF dataset does not imply that the IRI names, identifies or denotes the graph in question. A graph label IRI may be used in an RDF triple to refer to something other than the graph. This can be true even for triples in graphs in the RDF dataset.
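
To make the distinction concrete, here is a minimal Python sketch (illustrative only, not taken from any spec). It assumes a dataset modelled as a dictionary from graph labels to graphs, plus a separate, hypothetical `denotation` table standing in for an interpretation. The helper names `graph_labelled` and `referent`, the contents of the labelled graph, and the denotation entry are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object) as plain strings
Graph = FrozenSet[Triple]

BOB = "http://example.org/bob"
DC_PUBLISHER = "http://purl.org/dc/elements/1.1/publisher"

# The dataset: one default graph plus <label, graph> pairs.
# Labels are unique within the dataset because dict keys are unique.
default_graph: Graph = frozenset({(BOB, DC_PUBLISHER, '"Bob"')})
labelled_graphs: Dict[str, Graph] = {
    # Hypothetical graph contents, invented for this sketch.
    BOB: frozenset({("_:a", "http://xmlns.com/foaf/0.1/name", '"Bob"')}),
}

# A hypothetical interpretation, kept separate from the dataset:
# what each IRI refers to when it is used inside a triple.
denotation: Dict[str, str] = {BOB: "a person named Bob"}


def graph_labelled(label: str) -> Graph:
    """Structural lookup: the graph paired with this label in the dataset."""
    return labelled_graphs[label]


def referent(iri: str) -> str:
    """Semantic lookup: what the IRI denotes under the interpretation."""
    return denotation[iri]
```

In this sketch the same IRI is a key in both tables, yet `graph_labelled(BOB)` and `referent(BOB)` return unrelated things: nothing in the dataset structure forces the referent of the label to be the graph.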

Pat


> 
>> It avoids words like “identify” and “denote”.
> 
> And with very good reasons.
> However, IRIs in RDF have been traditionally used to denote resources, so
> even if you refrain from using those words, it is very easy for the
> reader to see them anyway.
> 
> So I suggest that omitting those words is not sufficient. The definition
> should be followed by a warning, e.g.:
> 
>  Note that graph names in a dataset are not used to denote the graph
>  in the way an IRI node denotes a resource.
> 
>>> So I would argue that, at the end of the day, neither of the following
>>> sentences is accurate:
>>> 
>>> a named graph is identified by an IRI
>>> a named graph is labeled by an IRI
>>> 
>>> but in fact:
>>> 
>>> a named graph is labelled by a resource
>> 
>> That's not accurate at all.
> 
> Well, take example 1 from
> http://www.w3.org/TR/rdf-sparql-query/#exampleDatasets
> which is supposed "to have information in the default graph that
> includes provenance information about the named graphs"
> 
> The default graph contains:
> 
>  <http://example.org/bob>    dc:publisher  "Bob" .
> 
> which means that the *resource denoted* by <http://example.org/bob> is
> related to the string "Bob" by the relation denoted by predicate
> dc:publisher. It is *not* the IRI "http://example.org/bob" which is
> related to "Bob", but a *resource*. If I knew another IRI for that
> resource, I could rewrite that triple
> 
>  <http://example.other.com/bob>  dc:publisher  "Bob" .
> 
> without changing the meaning of that triple in any way.
> 
> 
> So the only way for this triple to provide information about a graph in
> the dataset is that the graph be in fact associated with the *resource*
> and not the IRI.
> 
> 
> Of course, all this derives from examples in the SPARQL document, not
> the Dataset definition in your ED. However, your argument in favor of
> the dataset proposal was to reuse something known to work, rather than
> reinventing it. My point is:
> 
> * the SPARQL definition has some theoretical caveats
> * rephrasing the definition as you did may in principle solve this, but
> does not remove the risk of confusion, because
>  * IRIs are used differently for resources and for graphs
>  * and SPARQL fuels the confusion by using the same syntax (<>
>    brackets) for both IRI nodes and graph names
> 
> pa
> 
> 
>> A named graph is an <IRI,graph> pair. The IRI is called the graph name.
>> 
>> As written in the ED, the relationship between the IRI and the graph is neither “identifies” nor “labels”; it is “is graph name of”. No relationship between the resource denoted by the IRI and the graph is implied by the wording in the ED.
>> 
>>> (imagine for example an owl:sameAs statement between two graph IRIs in a
>>> SPARQL engine supporting OWL inference; what would that mean?)
>> 
>> owl:sameAs means that two terms denote the same resource. As written in the ED, use of those terms as graph names is entirely orthogonal to that.
>> 
>> I think that's a good thing. Named graphs are key to trust and provenance. Trust and provenance must happen at a lower level in the stack, before reasoning and inference kick in. W3C's version of the layer cake, where trust sits above reasoning, cannot work. The moment you reason with OWL over untrusted data, you're fucked.
>> 
>> Best,
>> Richard
>> 
>> [1] http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-multigraph
> 
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 3 October 2011 01:40:55 GMT