Re: “Default” vs “unnamed” graph from Lee Feigenbaum on 2011-03-16 (public-rdb2rdf-wg@w3.org from March 2011)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Wed, 16 Mar 2011 08:59:41 -0400
To: Robert Scanlon <rscanlon@revelytix.com>
CC: Souripriya Das <souripriya.das@oracle.com>, Richard Cyganiak <richard@cyganiak.de>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-ID: <4D80B43D.7020207@thefigtrees.net>

I agree with everything Bob says here.

Furthermore, I'd suggest that while it makes some logistical sense to 
reuse the concepts of default and named graphs from SPARQL, in the 
long run it probably makes more sense to specify R2RML in terms of the 
work being done by the new RDF working group on quads/named graphs. 
*Hopefully*, by defining the meaning of R2RML in terms of that, the 
ability to (dynamically) compose a (SPARQL) RDF dataset and query 
against the results of an R2RML mapping will then come naturally.

Lee

On 3/15/2011 6:58 PM, Robert Scanlon wrote:
> Souri, et al,
>
> I'm not sure it's necessarily helpful/useful to think of datasets in the
> context of the _query service_ (which is basically what will be
> executing R2RML mappings and exposing the generated triples).  As I
> mentioned in my last email to Richard, datasets are normally within the
> purview of the SPARQL query (aside from the minor incursion into the
> service realm by the upcoming Service Description standard).
>
> Per the SPARQL spec, a query's dataset defines the scope of the graphs
> for matching graph patterns; GRAPH graph patterns get matched against
> named graphs, and other graph patterns against the default graph.  The
> graphs can be defined explicitly in the query (named graphs in FROM
> NAMED, default graph in one or more FROM clauses), or the protocol
> message, per the SPARQL spec; or if undefined then the query service determines what
> default and named graphs are used by queries.  But the last situation is
> more of a fall-back (although, unhelpfully, most examples use this
> mode); it is not in general 'correct' to think of the query service as
> serving up 'datasets' (imo) -- it serves up triples in the context of
> graphs, which may be 'carved up' in the context of an individual query's
> dataset, referencing the exposed graphs, to control graph pattern matching.
>
> I don't think that the R2RML spec should be getting into the whole SPARQL
> graph/dataset morass aside from allowing a modeler (defining the R2RML
> mappings) to specify which triples should go in which named graphs, and
> which should go in the 'default' graph exposed by R2RML.  As I mentioned
> in my last email, the query service itself does not have to honor the
> 'suggestions' of the modeler defining the R2RML mappings (although it
> certainly could, and generally would by default).
>
> Bob Scanlon
> Revelytix
>
>
> On Tue, Mar 15, 2011 at 5:33 PM, Souripriya Das
> <souripriya.das@oracle.com> wrote:
>
>     Richard,
>
>     I had a long chat with Eric after the telecon today. Seema and
>     another colleague of mine, Matt Perry, joined as well. Following the
>     discussion, now we are okay with the use of the term "default graph"
>     to refer to the unnamed graph in an R2RML-based RDF store.
>
>     So, please go ahead and make the minor changes needed in the current
>     draft to replace unnamed graph with default graph.
>
>     If interested, here is how I managed to convince myself in an
>     informal way, by considering triples vs. quads:
>
>         * In general, a default graph (DG, for short) can be thought of
>           as a container of *triples* in a dataset whereas the named
>           graphs contain *quads*.
>         * R2RML mapping causes triples and quads to (virtually) come
>           into existence. Among these, the *triples* (by birth) make up
>           the DG of an R2RML-based RDF store.
>         * A DG in the context of a SPARQL query on the other hand could
>           consist of triples-by-birth (from an unnamed graph) OR
>           triples-generated-via-UNION-of-SPO-projections-from-quads in
>           an RDF store.
>         * So, it is quite possible to have (DG of a SPARQL query against
>           an R2RML-based RDF store) != (DG of the target R2RML-based RDF
>           store). But the two DGs always share the characteristic that
>           both of them consist only of triples -- triples-by-birth only
>           in R2RML and triples-by-birth or triples-by-transformation in
>           SPARQL -- but neither has any quads.
>
>     Thanks,
>     - Souri.
>
>
>     On 3/15/2011 5:18 PM, Richard Cyganiak wrote:
>>     I'd like to re-iterate my position from this call that we should define the output of an R2RML mapping as an RDF Dataset in the SPARQL sense, as it already says in the introduction, and consistently use SPARQL's terminology.
>>
>>     This would imply using the terms “named graph” and “default graph”. The term “unnamed graph” would be removed from the spec.
>>
>>     The objection raised in the call was that the default graph used in a SPARQL query can actually be constructed on the fly, on a query-by-query basis, by using the FROM keyword or SPARQL protocol parameters.
>>
>>     This is a valid observation. But I argue that this doesn't conflict at all with the use of the RDF Dataset concept and the term “default graph”.
>>
>>     To quote from the SPARQL spec [1]:
>>
>>>     A SPARQL query may specify the dataset to be used for matching by using the FROM clause and the FROM NAMED clause to describe the RDF dataset. If a query provides such a dataset description, then it is used in place of any dataset that the query service would use if no dataset description is provided in a query.
>>     This makes clear that if FROM/FROM NAMED are used, then one queries a *different* dataset from the one that the query service offers *by default* if FROM/FROM NAMED were not used.
>>
>>     I'm proposing that we think of the R2RML-generated dataset as the dataset which a query service would use by default in absence of a specific dataset description. This doesn't preclude the possibility of overriding the default graph or any other graph with FROM/FROM NAMED and the SPARQL protocol.
>>
>>     This would be a simple change in terms of spec text (s/unnamed graph/default graph/ and check the early sections for anyplace that should say “RDF dataset” instead of “RDF graph”). So I propose that we do this before the WD release.
>>
>>     If there are no objections (on- or off-list), I'll go ahead and do this.
>>
>>     Best,
>>     Richard
>>
>>     [1] http://www.w3.org/TR/rdf-sparql-query/#unnamedGraph
>
>
Received on Wednesday, 16 March 2011 13:00:20 UTC
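
A minimal sketch of the dataset semantics discussed in this thread, offered as an illustration rather than as anything the R2RML or SPARQL specs define. It models an RDF dataset as one default graph plus a set of named graphs, and shows that a query with no dataset description sees the store's own dataset, while a query with FROM / FROM NAMED sees a different dataset whose default graph is the merge of the FROM graphs. The names `Dataset` and `query_dataset`, the example IRIs, and the assumption that every FROM IRI resolves to a named graph already held by the store are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Dataset:
    """One default graph plus zero or more named graphs (hypothetical model)."""
    default_graph: Set[Triple] = field(default_factory=set)
    named_graphs: Dict[str, Set[Triple]] = field(default_factory=dict)


def query_dataset(
    store: Dataset,
    from_graphs: Optional[Iterable[str]] = None,
    from_named: Optional[Iterable[str]] = None,
) -> Dataset:
    """Return the dataset a query is matched against.

    Assumption: FROM / FROM NAMED IRIs are looked up among the store's
    named graphs; a real query service may resolve them differently.
    """
    # No dataset description: the service's own dataset is used as-is.
    if from_graphs is None and from_named is None:
        return store

    # FROM clauses: the query's default graph is the merge of those graphs.
    merged: Set[Triple] = set()
    for iri in from_graphs or ():
        merged |= store.named_graphs.get(iri, set())

    # FROM NAMED clauses: only the listed graphs are visible as named graphs.
    named = {
        iri: store.named_graphs[iri]
        for iri in (from_named or ())
        if iri in store.named_graphs
    }
    return Dataset(default_graph=merged, named_graphs=named)


# Example store, standing in for the output of an R2RML mapping.
store = Dataset(
    default_graph={("ex:emp1", "ex:name", "Alice")},
    named_graphs={"ex:g1": {("ex:emp2", "ex:name", "Bob")}},
)

plain = query_dataset(store)                           # no FROM / FROM NAMED
overridden = query_dataset(store, from_graphs=["ex:g1"])  # FROM <ex:g1>
```

Here `plain` is the store's dataset itself, whereas `overridden.default_graph` holds only the triples of `ex:g1`, so the default graph of the query differs from the default graph of the store, as Souri's last bullet describes.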