- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 16 Mar 2011 10:44:41 -0400
- To: Lee Feigenbaum <lee@thefigtrees.net>
- Cc: Robert Scanlon <rscanlon@revelytix.com>, Souripriya Das <souripriya.das@oracle.com>, Richard Cyganiak <richard@cyganiak.de>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
* Lee Feigenbaum <lee@thefigtrees.net> [2011-03-16 08:59-0400]
> I agree with everything Bob says here.
>
> Furthermore, I'd suggest that while it makes some logistical sense
> to reuse the concepts of default and named graphs from SPARQL, in
> the long-run it probably makes more sense to specify R2RML in terms
> of the work being done by the new RDF working group on quads/named
> graphs. *Hopefully*, by defining the meaning of R2RML in terms of
> that, the ability to (dynamically) compose a (SPARQL) RDF dataset
> and query against the results of an R2RML mapping will then come
> naturally.

Could you describe the current modeling of graphs in the SPARQL WG?
In particular, are there new concepts for binding default graphs to
potentially unnamed graphs? Are there mf:QueryEvaluationTests which
involve both a FROM clause and a qt:data graph specifier in the test
description? (There aren't any in the SPARQL 1.0 tests
http://www.w3.org/2001/sw/DataAccess/tests/data-r2/ .)

In http://www.w3.org/mid/20110315214913.GC4955@w3.org , I listed some
desiderata and asked how my misunderstanding of the "first FROM
replaces default graph" rule would change the outcome.

So far, the SPARQL spec is the only spec to talk about a graph without
a specified name (the TriX abstract syntax only defines named
graphs¹), so I think we have to take our definitions from there.
Apart from SPARQL, no specifications exploit named or default graphs.

¹ http://www.w3.org/2004/03/trix/#asyntasem

Mostly, providers of SPARQL endpoints elect to consider particular
graphs as the default graph, and the querier may or may not get to
change them. In order to justify inventing a new kind of "unnamed
graph", we should clarify, at least in our own minds, how folks can
query it. Are people going to invoke
  rdb2rdfEndpoint -c myDB.r2rml -d unnamed
? Does this give them any more utility than calling the graph
r2rml:defaultGraph ?
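
To make the FROM question concrete, here is a minimal Python sketch of the dataset rule from the SPARQL spec as discussed in this thread: with no FROM/FROM NAMED, the service's own dataset applies; with either, the query's dataset description replaces it. This is an illustration only. The function name, the example IRIs and the `store` lookup table are hypothetical and not part of R2RML or any SPARQL implementation, and RDF merge is approximated as plain set union (blank nodes are ignored).

```python
def active_dataset(service_default, service_named, store,
                   from_iris=(), from_named_iris=()):
    """Return (default_graph, named_graphs) that a query is matched against.

    service_default / service_named: the dataset the service offers when the
    query has no dataset description (e.g. the R2RML-generated dataset).
    store: hypothetical lookup of IRI -> graph for graphs the service can
    resolve by name. Graphs are sets of (s, p, o) tuples.
    """
    if not from_iris and not from_named_iris:
        # No dataset description: fall back to the service's dataset.
        return set(service_default), dict(service_named)
    # A dataset description is used *in place of* the service's dataset.
    default = set()
    for iri in from_iris:
        default |= store[iri]          # merge of all FROM graphs
    named = {iri: store[iri] for iri in from_named_iris}
    return default, named


# Hypothetical R2RML output: one triple in the default graph, one named graph.
G1 = "http://example.org/g1"
R2RML_DEFAULT = {("ex:emp1", "ex:name", "Alice")}
R2RML_NAMED = {G1: {("ex:dept1", "ex:label", "Sales")}}
STORE = dict(R2RML_NAMED)  # the default graph has no IRI, so FROM cannot name it

no_desc = active_dataset(R2RML_DEFAULT, R2RML_NAMED, STORE)
with_from = active_dataset(R2RML_DEFAULT, R2RML_NAMED, STORE, from_iris=[G1])
only_named = active_dataset(R2RML_DEFAULT, R2RML_NAMED, STORE,
                            from_named_iris=[G1])
```

In this sketch the R2RML default graph is reachable only when the query gives no dataset description; once a FROM appears, it drops out unless the service also exposes it under some IRI in `store`.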

I understand the gut feeling that we should avoid committing to a
specified behavior in the face of FROM clauses, but I think we lose
interoperability without providing additional administrator or user
control.

* Richard Cyganiak <richard@cyganiak.de> [2011-03-15 21:18+0000]
> …
> To quote from the SPARQL spec [1]:
>
> A SPARQL query may specify the dataset to be used for matching by
> using the FROM clause and the FROM NAMED clause to describe the RDF
> dataset. If a query provides such a dataset description, then it is
> used in place of any dataset that the query service would use if no
> dataset description is provided in a query.
>
> This makes clear that if FROM/FROM NAMED are used, then one queries
> a *different* dataset from the one that the query service offers *by
> default* if FROM/FROM NAMED were not used.
>
> I'm proposing that we think of the R2RML-generated dataset as the
> dataset which a query service would use by default in absence of a
> specific dataset description. This doesn't preclude the possibility
> of overriding the default graph or any other graph with FROM/FROM
> NAMED and the SPARQL protocol.

+1
This gives a good default behavior and doesn't preclude manipulations
of the default graph.

> Lee
>
> On 3/15/2011 6:58 PM, Robert Scanlon wrote:
> >Souri, et al,
> >
> >I'm not sure it's necessarily helpful/useful to think of datasets in the
> >context of the _query service_ (which is basically what will be
> >executing R2RML mappings and exposing the generated triples). As I
> >mentioned in my last email to Richard, datasets are normally within the
> >purview of the SPARQL query (aside from the minor incursion into the
> >service realm by the upcoming Service Description standard).
> >
> >Per the SPARQL spec, a query's dataset defines the scope of the graphs
> >for matching graph patterns; GRAPH graph patterns get matched against
> >named graphs, and other graph patterns against the default graph.
> >The graphs can be defined explicitly in the query (named graphs in FROM
> >NAMED, default graph in one or more FROM), or the protocol message, per
> >the SPARQL spec; or if undefined then the query service determines what
> >default and named graphs are used by queries. But the last situation is
> >more of a fall-back (although, unhelpfully, most examples use this
> >mode); it is not in general 'correct' to think of the query service as
> >serving up 'datasets' (imo) -- it serves up triples in the context of
> >graphs, which may be 'carved up' in the context of an individual query's
> >dataset, referencing the exposed graphs, to control graph pattern matching.
> >
> >I don't think that R2RML spec should be getting into the whole SPARQL
> >graph/dataset morass aside from allowing a modeler (defining the R2RML
> >mappings) to specify which triples should go in which named graphs, and
> >which should go in the 'default' graph exposed by R2RML. As I mentioned
> >in my last email, the query service itself does not have to honor the
> >'suggestions' of the modeler defining the R2RML mappings (although it
> >certainly could, and generally would by default).
> >
> >Bob Scanlon
> >Revelytix
> >
> >
> >On Tue, Mar 15, 2011 at 5:33 PM, Souripriya Das
> ><souripriya.das@oracle.com <mailto:souripriya.das@oracle.com>> wrote:
> >
> > Richard,
> >
> > I had a long chat with Eric after the telecon today. Seema and
> > another colleague of mine, Matt Perry, joined too. Following the
> > discussion, now we are okay with the use of the term "default graph"
> > to refer to the unnamed graph in an R2RML-based RDF store.
> >
> > So, please go ahead and make the minor changes needed in the current
> > draft to replace unnamed graph with default graph.
> >
> > If interested, here is how I managed to convince myself in an
> > informal way, by considering triples vs. quads:
> >
> > * In general, a default graph (DG, for short) can be thought of
> >   as a container of *triples* in a dataset whereas the named
> >   graphs contain *quads*.
> > * R2RML mapping causes triples and quads to (virtually) come
> >   into existence. Among these, the *triples* (by birth) make up
> >   the DG of an R2RML-based RDF store.
> > * A DG in the context of a SPARQL query on the other hand could
> >   consist of triples-by-birth (from an unnamed graph) OR
> >   triples-generated-via-UNION-of-SPO-projections-from-quads in
> >   an RDF store.
> > * So, it is quite possible to have (DG of a SPARQL query against
> >   an R2RML-based RDF store) != (DG of the target R2RML-based RDF
> >   store). But the two DGs always share the characteristic that
> >   both of them consist only of triples -- triples-by-birth only
> >   in R2RML and triples-by-birth or triples-by-transformation in
> >   SPARQL -- but neither has any quads.
> >
> > Thanks,
> > - Souri.
> >
> >
> > On 3/15/2011 5:18 PM, Richard Cyganiak wrote:
> >> I'd like to re-iterate my position from this call that we should define the output of an R2RML mapping as an RDF Dataset in the SPARQL sense, as it already says in the introduction, and consistently use SPARQL's terminology.
> >>
> >> This would imply using the terms “named graph” and “default graph”. The term “unnamed graph” would be removed from the spec.
> >>
> >> The objection raised in the call was that the default graph used in a SPARQL query can actually be constructed on the fly, on a query-by-query basis, by using the FROM keyword or SPARQL protocol parameters.
> >>
> >> This is a valid observation. But I argue that this doesn't conflict at all with the use of the RDF Dataset concept and the term “default graph”.
> >>
> >> To quote from the SPARQL spec [1]:
> >>
> >>> A SPARQL query may specify the dataset to be used for matching by using the FROM clause and the FROM NAMED clause to describe the RDF dataset.
> >>> If a query provides such a dataset description, then it is used in place of any dataset that the query service would use if no dataset description is provided in a query.
> >> This makes clear that if FROM/FROM NAMED are used, then one queries a *different* dataset from the one that the query service offers *by default* if FROM/FROM NAMED were not used.
> >>
> >> I'm proposing that we think of the R2RML-generated dataset as the dataset which a query service would use by default in absence of a specific dataset description. This doesn't preclude the possibility of overriding the default graph or any other graph with FROM/FROM NAMED and the SPARQL protocol.
> >>
> >> This would be a simple change in terms of spec text (s/unnamed graph/default graph/ and check the early sections for anyplace that should say “RDF dataset” instead of “RDF graph”). So I propose that we do this before the WD release.
> >>
> >> If there are no objections (on- or off-list), I'll go ahead and do this.
> >>
> >> Best,
> >> Richard
> >>
> >> [1] http://www.w3.org/TR/rdf-sparql-query/#unnamedGraph
> >
> >
>

--
-ericP
Received on Wednesday, 16 March 2011 14:45:17 UTC