- From: Souripriya Das <souripriya.das@oracle.com>
- Date: Wed, 16 Mar 2011 10:03:32 -0400
- To: Lee Feigenbaum <lee@thefigtrees.net>
- CC: Robert Scanlon <rscanlon@revelytix.com>, Richard Cyganiak <richard@cyganiak.de>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
- Message-ID: <4D80C334.80809@oracle.com>
I agree with all of the viewpoints expressed so far.
Just for my own understanding, I am distinguishing between two types of
triples:
* "native" triples: they exist as triples in the (virtual) store
* "converted" triples: they exist as quads in the (virtual) store,
but were converted to triples by projecting out the graph information
Using these two terms, I distinguish the content of the Default
Graph in a store from that of the Default Graph in the context of a
SPARQL query's execution as follows:
* The default graph of a store consists only of "native" triples.
* The default graph of a SPARQL query execution consists of "native"
triples and/or "converted" triples.
This seems to work for me at the moment, until I get a better
understanding based upon what comes out from the RDF WG.
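
A minimal Python sketch of this distinction, for illustration only. The
names Store, default_graph, query_default_graph and include_native are
hypothetical and do not come from R2RML, SPARQL or any product; in
particular, include_native is an assumed switch standing in for whatever
mechanism a query service might offer to mix "native" triples into a
query's default graph.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Triple(NamedTuple):
    s: str
    p: str
    o: str


class Quad(NamedTuple):
    s: str
    p: str
    o: str
    g: str  # graph name


class Store:
    """Hypothetical (virtual) store holding native triples and quads."""

    def __init__(self, native: Iterable[Triple], quads: Iterable[Quad]):
        self.native: Set[Triple] = set(native)
        self.quads: Set[Quad] = set(quads)

    def default_graph(self) -> Set[Triple]:
        # Default graph of the store: "native" triples only.
        return set(self.native)

    def query_default_graph(
        self,
        from_graphs: Optional[Iterable[str]] = None,
        include_native: bool = False,  # assumed switch, not a SPARQL feature
    ) -> Set[Triple]:
        # No dataset description: fall back to the store's default graph.
        if from_graphs is None:
            return self.default_graph()
        wanted = set(from_graphs)
        # "Converted" triples: quads with the graph component projected out.
        converted = {Triple(q.s, q.p, q.o) for q in self.quads if q.g in wanted}
        return converted | self.native if include_native else converted


store = Store(
    native=[Triple("ex:a", "ex:p", "ex:b")],
    quads=[
        Quad("ex:c", "ex:p", "ex:d", "ex:g1"),
        Quad("ex:e", "ex:p", "ex:f", "ex:g2"),
    ],
)

store_dg = store.default_graph()                # native triples only
query_dg = store.query_default_graph(["ex:g1"])  # converted triples only
mixed_dg = store.query_default_graph(["ex:g1"], include_native=True)
```

Here store_dg and query_dg differ even though both contain only triples,
which is the point of the two terms above.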
Thanks,
- Souri.
Lee Feigenbaum wrote:
> I agree with everything Bob says here.
>
> Furthermore, I'd suggest that while it makes some logistical sense to
> reuse the concepts of default and named graphs from SPARQL, in the
> long-run it probably makes more sense to specify R2RML in terms of the
> work being done by the new RDF working group on quads/named graphs.
> *Hopefully*, by defining the meaning of R2RML in terms of that, the
> ability to (dynamically) compose a (SPARQL) RDF dataset and query
> against the results of an R2RML mapping will then come naturally.
>
> Lee
>
> On 3/15/2011 6:58 PM, Robert Scanlon wrote:
>> Souri, et al,
>>
>> I'm not sure it's necessarily helpful/useful to think of datasets in the
>> context of the _query service_ (which is basically what will be
>> executing R2RML mappings and exposing the generated triples). As I
>> mentioned in my last email to Richard, datasets are normally within the
>> purview of the SPARQL query (aside from the minor incursion into the
>> service realm by the upcoming Service Description standard).
>>
>> Per the SPARQL spec, a query's dataset defines the scope of the graphs
>> for matching graph patterns; GRAPH graph patterns get matched against
>> named graphs, and other graph patterns against the default graph. The
>> graphs can be defined explicitly in the query (named graphs in FROM
>> NAMED, default graph in one or more FROM), or in the protocol message, per
>> the SPARQL spec; or if undefined then the query service determines what
>> default and named graphs are used by queries. But the last situation is
>> more of a fall-back (although, unhelpfully, most examples use this
>> mode); it is not in general 'correct' to think of the query service as
>> serving up 'datasets' (imo) -- it serves up triples in the context of
>> graphs, which may be 'carved up' in the context of an individual query's
>> dataset, referencing the exposed graphs, to control graph pattern
>> matching.
>>
>> I don't think that R2RML spec should be getting into the whole SPARQL
>> graph/dataset morass aside from allowing a modeler (defining the R2RML
>> mappings) to specify which triples should go in which named graphs, and
>> which should go in the 'default' graph exposed by R2RML. As I mentioned
>> in my last email, the query service itself does not have to honor the
>> 'suggestions' of the modeler defining the R2RML mappings (although it
>> certainly could, and generally would by default).
>>
>> Bob Scanlon
>> Revelytix
>>
>>
>> On Tue, Mar 15, 2011 at 5:33 PM, Souripriya Das
>> <souripriya.das@oracle.com> wrote:
>>
>> Richard,
>>
>> I had a long chat with Eric after the telecon today. Seema and
>> another colleague of mine, Matt Perry, joined too. Following the
>> discussion, now we are okay with the use of the term "default graph"
>> to refer to the unnamed graph in an R2RML-based RDF store.
>>
>> So, please go ahead and make the minor changes needed in the current
>> draft to replace unnamed graph with default graph.
>>
>> If interested, here is how I managed to convince myself in an
>> informal way, by considering triples vs. quads:
>>
>> * In general, a default graph (DG, for short) can be thought of
>> as a container of *triples* in a dataset whereas the named
>> graphs contain *quads*.
>> * R2RML mapping causes triples and quads to (virtually) come
>> into existence. Among these, the *triples* (by birth) make up
>> the DG of an R2RML-based RDF store.
>> * A DG in the context of a SPARQL query on the other hand could
>> consist of triples-by-birth (from an unnamed graph) OR
>> triples-generated-via-UNION-of-SPO-projections-from-quads in
>> an RDF store.
>> * So, it is quite possible to have (DG of a SPARQL query against
>> an R2RML-based RDF store) != (DG of the target R2RML-based RDF
>> store). But the two DGs always share the characteristic that
>> both of them consist only of triples -- triples-by-birth only
>> in R2RML and triples-by-birth or triples-by-transformation in
>> SPARQL -- but neither has any quads.
>>
>> Thanks,
>> - Souri.
>>
>>
>> On 3/15/2011 5:18 PM, Richard Cyganiak wrote:
>>> I'd like to re-iterate my position from this call that we should
>>> define the output of an R2RML mapping as an RDF Dataset in the
>>> SPARQL sense, as it already says in the introduction, and
>>> consistently use SPARQL's terminology.
>>>
>>> This would imply using the terms “named graph” and “default
>>> graph”. The term “unnamed graph” would be removed from the spec.
>>>
>>> The objection raised in the call was that the default graph used
>>> in a SPARQL query can actually be constructed on the fly, on a
>>> query-by-query basis, by using the FROM keyword or SPARQL protocol
>>> parameters.
>>>
>>> This is a valid observation. But I argue that this doesn't
>>> conflict at all with the use of the RDF Dataset concept and the term
>>> “default graph”.
>>>
>>> To quote from the SPARQL spec [1]:
>>>
>>>> A SPARQL query may specify the dataset to be used for matching
>>>> by using the FROM clause and the FROM NAMED clause to describe the
>>>> RDF dataset. If a query provides such a dataset description, then
>>>> it is used in place of any dataset that the query service would use
>>>> if no dataset description is provided in a query.
>>> This makes clear that if FROM/FROM NAMED are used, then one
>>> queries a *different* dataset from the one that the query service
>>> offers *by default* if FROM/FROM NAMED were not used.
>>>
>>> I'm proposing that we think of the R2RML-generated dataset as
>>> the dataset which a query service would use by default in absence of
>>> a specific dataset description. This doesn't preclude the
>>> possibility of overriding the default graph or any other graph with
>>> FROM/FROM NAMED and the SPARQL protocol.
>>>
>>> This would be a simple change in terms of spec text (s/unnamed
>>> graph/default graph/ and check the early sections for anyplace that
>>> should say “RDF dataset” instead of “RDF graph”). So I propose that
>>> we do this before the WD release.
>>>
>>> If there are no objections (on- or off-list), I'll go ahead and
>>> do this.
>>>
>>> Best,
>>> Richard
>>>
>>> [1] http://www.w3.org/TR/rdf-sparql-query/#unnamedGraph
>>
>>
Received on Wednesday, 16 March 2011 14:06:01 UTC