Re: Text for clarification of re-use of IRIs in dataset clauses from Seaborne, Andy on 2007-10-15 (public-rdf-dawg@w3.org from October to December 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 15 Oct 2007 20:54:36 +0100
To: ogbujic@ccf.org
Cc: public-rdf-dawg@w3.org
Message-ID: <4713C57C.6010105@hp.com>

Chimezie Ogbuji wrote:
> Andy, my comments are inline below.
> 
> On Mon, 2007-10-15 at 15:32 +0100, Seaborne, Andy wrote:
>> Ogbuji, Chimezie wrote:
>>> [[
>>> The FROM NAMED syntax suggests that the IRI identifies the corresponding 
>>> graph, but the relationship between an IRI and a graph in an RDF dataset 
>>> is indirect. The IRI identifies a resource, and the resource is 
>>> represented by a graph (or, more precisely: by a document that 
>>> serializes a graph). The relationship between the IRI and the 
>>> representation is subject to time, an intermediate caching policy, the 
>>> query service, and the mechanics of the underlying transport protocol.  
>>> For further details see [WEBARCH]. 
>>>
>>> The distinction between a surface RDF notation and the abstract RDF 
>>> graph which results from parsing an instance of the surface notation is 
>>> an additional indirection.  As a consequence of these things, the 
>>> repeated use of an IRI in either the same dataset clause, across dataset 
>>> clauses, or across whole SPARQL queries can feasibly result in either 
>>> the formulation of a single canonical graph, separate but isomorphic 
>>> graphs, or completely disjoint [2] RDF graphs for each use of the same IRI.
>>> ]]
>> Chimezie,
>>
>> This text introduces some new terminology - I only found one reference on the 
>> web to "surface notation" in the context of RDF (a note by Pat).
> 
> Yes, that was the only source for that term.  How about instead:
> 
> s/surface notation/RDF graph serialization
> 
>> Isn't the indirection due to the mention of the IRI twice and the use of a 
>> graph in the dataset?
> 
> I'm not sure what you mean by the "use" of a graph in the dataset. Could
> you clarify the second part of that sentence?

"use" as in use/mention.

> In any case, the
> indirection is three-fold (it spans web architecture and the
> concrete/abstract RDF syntax divide).  i.e.:
> 
> IRI -> RDF "information resource" -> RDF graph representation -> RDF
> abstract graph

OK - got it.

> 
> I was trying to be explicit about the nature of this indirection so as
> to cover all cases where this is relevant, not just the situation that
> motivated the clarification (i.e., the dataset tests and the assumption
> about distinct BNodes across graphs formed from the same IRI).
> 
>> How about: for 8.2.3:
>> [[
>> The actions required to construct the dataset are not determined by the 
>> dataset description.  If an IRI is given twice in a dataset description, 
>> either by using two FROM clauses, or a FROM clause and a FROM NAMED clause, 
>> then it does not assume that exactly one or exactly two attempts are made to 
>> obtain an RDF graph associated with the IRI.  Therefore, no assumptions can be 
>> made about blank node identity in triples obtained from the two occurrences in 
>> the dataset description.
>> ]]
>> 	Andy
> 
> I was hoping that we could be a little more specific than that - without
> risking the introduction of concepts that are not already covered by our
> normative dependencies.  The interplay between web architecture and the
> formulation of the dataset (in this case) is the crucial bit.  In
> addition, I didn't call out the blank node identity scenario because I
> got the impression that you were concerned about covering the general
> case.  I'm not sure how to reconcile the larger picture with your
> suggested text above, but below is an attempt:
> 
> [[
> The actions required to construct the dataset are not determined by the
> dataset description alone.  If an IRI is given twice in a dataset
> description, either by using two FROM clauses, or a FROM clause and a
> FROM NAMED clause, then it does not assume that exactly one or exactly
> two attempts are made to obtain an RDF graph associated with the IRI.
> Therefore, no assumptions can be made about blank node identity in
> triples obtained from the two occurrences in the dataset description.
> In general, no assumptions can be made about the isomorphism of the
> formulated graph.
> ]]

We're agreed on the text up to the last sentence.

If you read the same IRI and get different "information resources", then all 
bets are off anyway - they could say different things (updates may have 
happened, or different aspects of the concept might be revealed), so it's not 
just isomorphism.  How about:

"In general, no assumptions can be made about the equivalence of the graphs."

(Interestingly, I found from Google hits on old WDs that the section in RDF 
Concepts changed from "Graph Equality" to "Graph Equivalence".)
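
To make the blank node point concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular SPARQL processor works: the one-triple-per-line toy serialization, the in-memory WEB dictionary standing in for retrieval, and the names BNode, parse, build_dataset and blank_nodes are all hypothetical. It shows that when the same IRI appears in both a FROM and a FROM NAMED clause, a processor that reads the document once shares blank nodes between the two graphs, while one that reads it twice produces graphs with disjoint blank nodes.

```python
class BNode:
    """A blank node. Its label is local to one parse; identity is per object."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return "BNode(%r)" % self.label


# Hypothetical stand-in for the web: IRI -> document text (toy serialization).
WEB = {"http://example.org/g": "_:a <http://example.org/p> _:b\n"}


def parse(document):
    """Parse one 'subject predicate object' triple per line.

    Terms starting with '_:' become blank nodes that are fresh for this parse.
    """
    bnodes = {}
    graph = set()
    for line in document.strip().splitlines():
        terms = []
        for term in line.split():
            if term.startswith("_:"):
                if term not in bnodes:
                    bnodes[term] = BNode(term)
                term = bnodes[term]
            terms.append(term)
        graph.add(tuple(terms))
    return graph


def build_dataset(from_iris, from_named_iris, cache=None):
    """Build (default_graph, named_graphs) from FROM and FROM NAMED IRIs.

    If cache is a dict, each IRI is read at most once; if it is None,
    every mention of an IRI triggers a separate read and parse.
    """
    def load(iri):
        if cache is None:
            return parse(WEB[iri])
        if iri not in cache:
            cache[iri] = parse(WEB[iri])
        return cache[iri]

    default_graph = set()
    for iri in from_iris:
        default_graph |= load(iri)
    named_graphs = {iri: load(iri) for iri in from_named_iris}
    return default_graph, named_graphs


def blank_nodes(graph):
    """Collect the blank nodes occurring anywhere in a graph."""
    return {t for triple in graph for t in triple if isinstance(t, BNode)}


if __name__ == "__main__":
    iri = "http://example.org/g"
    for cache in ({}, None):
        default_graph, named = build_dataset([iri], [iri], cache=cache)
        shared = blank_nodes(default_graph) & blank_nodes(named[iri])
        print("one read" if cache is not None else "two reads", len(shared))
```

Both strategies are consistent with the dataset description, which is why a query cannot rely on either outcome. If the document changes between the two reads, the graphs need not even have the same triples, which is the motivation for saying "equivalence" rather than "isomorphism".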

	Andy

-- 
  Hewlett-Packard Limited
  Registered Office: Cain Road, Bracknell, Berks RG12 1HN
  Registered No: 690597 England
Received on Monday, 15 October 2007 19:54:58 UTC