- From: Dan Connolly <connolly@w3.org>
- Date: Tue, 09 Nov 2004 08:28:44 -0600
- To: Dave Beckett <dave.beckett@bristol.ac.uk>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-Id: <1100010524.4261.225.camel@dirk>
On Mon, 2004-11-08 at 16:28 +0000, Dave Beckett wrote:
> Here is a rough proposal to update to
>   http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
> sections 8 and 9 with respect to the SOURCE issue.

This looks reasonably clear and complete. I don't understand
"Any variable that is not bound must not match..." below.
Other than that, my comments should be seen as advice to
the editor, should he/we choose to incorporate this proposal.

> There are plenty of issues related to this discussed already but
> let's see how this does.
>
> A quick comparison to named graphs/named containers/earlier work
>  - No access to individual named graphs (i.e. no SOURCE <uri>)
>  - It does imply dynamic RDF-merging, but it could be de-emphasised
>    that it's not required at run time, but the result must be as-if
>    that had been done.
>  - No bnode graph names (issue)
>  - Left out DISTINCT for now, that's a result thing
>
> Dave
>
>
> -----
>
>
> 8 Choosing What to Query
>
> A SPARQL query is against a single RDF *Query Graph*. This graph may
> be constructed through logical inference, and never materialized. It
> can be arbitrarily large or infinite. The Query Graph is a virtual
> RDF-merge operation over a set of *RDF Graphs*:
>
>   Definition: Query Graph
>
>   Given a set of RDF Graphs {RG1, ..., RGn}, the Query Graph QG is
>   an RDF graph formed from the RDF-merge of the set {RG1, ..., RGn}.
>
>   All of the graphs RG1...RGn have *Graph Names* GN1...GNn which are
>   URI References (URIrefs)

I'd probably phrase that as a mapping from graph names to RDF
graphs, but your meaning is clear enough...

>   where RDF-merge is defined in RDF Semantics 0.3 Graph Definitions
>   http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#graphdefs
>
>
> The Query Graph can be defined in the following ways:
>
> 1) In the SPARQL query language using the FROM clause
>
>    See below.
>
> 2) By the SPARQL protocol
>
>    ISSUE: Depends on protocol doc. Probably works by giving the set
>    of URIrefs of the graphs?
>    or giving a URIref for the query graph?
>    or query service?
>
> 3) Against a default query graph if neither of 1) or 2) are given.
>    This is application-specific.
>
>
> In the SPARQL query language the FROM clause can specify the set of
> graphs by either giving their names or giving the URIs for a resource
> that can be used to retrieve the graph.
>
> (Q8.1) The query
>
>   SELECT *
>   FROM <http://www.w3.org/2000/08/w3c-synd/home.rss>
>   WHERE ( ?x ?y ?z )
>
> creates a Query Graph by using the resource at URI
>   http://www.w3.org/2000/08/w3c-synd/home.rss
> to provide RDF triples, making an RDF graph RG1. Graph RG1 is named
> by the URI and constructs a query graph from the set {RG1}.
>
>
> (Q8.2) The query
>
>   SELECT *
>   FROM <http://www.w3.org/2000/08/w3c-synd/home.rss> NAMED <http://example.org/>
>   WHERE ( ?x ?y ?z )
>
> Constructs the same query graph but names the graph RG1 <http://example.org/>
>
> (Q8.3) The query
>
>   SELECT *
>   FROM NAMED <http://example.org/>
>   WHERE ( ?x ?y ?z )
>
> Creates a query graph from a set of 1 graph named <http://example.org/>
> The URI here is not for resource retrieval.
>
>
> When multiple graphs are given in FROM, the RDF-merge of the set of
> graphs is performed to create the query graph.
>
> The query
> (Q8.4)
>   SELECT *
>   FROM <uri1>, <uri2>
>   WHERE ( ?x ?y ?z )
>
> creates a query graph from the RDF-merge from the set of graphs {RG1, RG2}
> where
>   RG1 is the RDF graph formed by retrieving the resource at uri1 and
>     named uri1
>   RG2 is the RDF graph formed by retrieving the resource at uri2 and
>     named uri2
>
>
> A SPARQL implementation MAY not support graph names in which case the
> queries that use only the NAMED keyword will fail - Q8.3

To date, the SPARQL spec hasn't defined a term like "SPARQL
implementation", and I don't recall "fail" so far either.

I can't tell what Q8.3 refers to.
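As a rough illustration of the Query Graph definition and Q8.4 above, the RDF-merge of a set of named graphs can be sketched in Python. Everything in it is an assumption made for illustration only: triples as tuples of strings, the "_:" prefix test for blank nodes, the rdf_merge name, and the contents of the two graphs. None of it comes from the proposal.

```python
def rdf_merge(named_graphs):
    """Return the RDF-merge of a mapping {graph name: set of (s, p, o)}.

    Blank node labels are standardized apart by giving each graph its own
    label prefix, so _:a in one graph never collides with _:a in another.
    """
    merged = set()
    for index, name in enumerate(sorted(named_graphs)):
        prefix = "_:g%d_" % index

        def rename(term, prefix=prefix):
            # Only blank node labels are rewritten; URIs pass through.
            return prefix + term[2:] if term.startswith("_:") else term

        for s, p, o in named_graphs[name]:
            merged.add((rename(s), p, rename(o)))
    return merged


# Hypothetical contents for the graphs retrieved from uri1 and uri2 in Q8.4.
graphs = {
    "uri1": {("_:a", "http://example.org/p", "http://example.org/x")},
    "uri2": {("_:a", "http://example.org/p", "http://example.org/x"),
             ("http://example.org/x", "http://example.org/q", "_:a")},
}

query_graph = rdf_merge(graphs)
```

A plain set union of the two graphs would have collapsed the two _:a nodes into one and produced two triples; the merge keeps them apart and produces three.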
>
> Possible extension:
>
> Allow graphs with a local name (blank node label)
>
> (Q8.5)
>   SELECT *
>   FROM NAMED _:a, NAMED _:b
>   WHERE ( ?x ?y ?z )
>
> rather than relying on the application-specific choice 3) above.
>
> However details below would have to be changed to forbid returning
> the blank nodes of the names in results.
>
>
> 9 Querying the Origin of Statements
>
> While the RDF data model is limited to expressing triples with a
> subject, predicate and object, many RDF data stores augment this with
> a notion of the source of each triple. Typically, implementations
> associate RDF triples or graphs with a URI specifying their real or
> virtual origin. The SOURCE keyword allows you to query or constrain
> the source of the following triple pattern or nested graph
> pattern. The general form of the SOURCE query is:
>
>   SOURCE ?var (?s ?p ?o)
>
> When SOURCE ?var is given before a triple, the variable will be bound
> to all of the known *Graph Names* for that triple.

I gather that "known" refers to the Query Graph QG.

> A data store that
> does not support graph names SHOULD provide no binding for the SOURCE
> variables.

Again the normative-looking reference to software. So far
the editors have kept that sort of thing to informative prose
and kept the definitions of things like query result independent
of it.

I think you're suggesting that there are 2 query results for
queries that use SOURCE and that implementations are free
to return either. Is that right?

> D9.1 Data:
>
>   Graph G1 named <aliceFoaf.n3>
>   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>
>   _:1 foaf:mbox <mailto:alice@work.example>.
>   _:1 foaf:knows _:2.
>   _:2 foaf:mbox <mailto:bob@work.example>.
>   _:2 foaf:age 32.
>
>   Graph G2 named <bobFoaf.n3>
>   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>
>   _:1 foaf:mbox <mailto:bob@work.example>.
>   _:1 foaf:PersonalProfileDocument <bobFoaf.n3>.
>   _:1 foaf:age 35.
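To make the general form SOURCE ?var (?s ?p ?o) concrete against the D9.1 data above, here is a small Python sketch of one way a store that tracks graph names might bind the source variable. The tuple representation, the "?" convention for variables, and the match_with_source name are assumptions for illustration, not something the proposal specifies; blank node labels are left scoped to their own graph.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# The D9.1 data, one set of (s, p, o) string triples per graph name.
GRAPHS = {
    "aliceFoaf.n3": {
        ("_:1", FOAF + "mbox", "mailto:alice@work.example"),
        ("_:1", FOAF + "knows", "_:2"),
        ("_:2", FOAF + "mbox", "mailto:bob@work.example"),
        ("_:2", FOAF + "age", "32"),
    },
    "bobFoaf.n3": {
        ("_:1", FOAF + "mbox", "mailto:bob@work.example"),
        ("_:1", FOAF + "PersonalProfileDocument", "bobFoaf.n3"),
        ("_:1", FOAF + "age", "35"),
    },
}


def match_with_source(source_var, pattern, graphs):
    """Yield one binding dict per (graph name, matching triple) pair."""
    for name in sorted(graphs):
        for triple in sorted(graphs[name]):
            binding = {source_var: name}
            for want, got in zip(pattern, triple):
                if want.startswith("?"):
                    # A variable must keep the value it was first bound to.
                    if binding.setdefault(want, got) != got:
                        break
                elif want != got:
                    break
            else:
                yield binding


# SOURCE ?src ( ?whom foaf:age ?age )
solutions = list(match_with_source("?src", ("?whom", FOAF + "age", "?age"), GRAPHS))
ages_by_source = {b["?src"]: b["?age"] for b in solutions}
```

Here ages_by_source maps aliceFoaf.n3 to 32 and bobFoaf.n3 to 35, so constraining the source variable to <bobFoaf.n3> leaves only the age 35 solution.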
>
>
> The Query Graph is the RDF-merge of {G1, G2}
>
>
> Q9.1 Query:
>
>   PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>   SELECT ?mbox ?age ?ppd
>   WHERE ( ?alice foaf:mbox <mailto:alice@work.example> )
>         ( ?alice foaf:knows ?whom )
>         ( ?whom foaf:mbox ?mbox )
>         ( ?whom foaf:PersonalProfileDocument ?ppd )
>         SOURCE ?ppd ( ?whom foaf:age ?age )
>
> R9.1 Result:
>   mbox                       age  ppd
>   <mailto:bob@work.example>  35   <bobFoaf.n3>

There are two possible results, right? one with ppd unbound?

> This query returns the email addresses of people that Alice knows. It
> also returns their age according to their PersonalProfileDocument
> documents, as well as the URI of the graph. Alice's guess of Bob's
> age (32) is not returned.

The example is good.

>
> Any variable that is not bound must not match another variable that
> is not bound.

I don't understand that sentence, even after studying
the example a few times. Hmm.

> Thus,
>
> Query Q9.2:
>   PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>   SELECT ?given ?family
>   WHERE SOURCE ?ppd ( ?whom foaf:given ?family )
>         SOURCE ?ppd ( ?whom foaf:family ?family )
>
> will match only if the source of both triples are known and the same.
>
> A SPARQL implementation MAY not support graph names in which case the
> SOURCE ?var parts are ignored.

If I understand the proposal, that can be phrased without
reference to implementations by saying, as above, that there
are two possible results to any query that uses SOURCE.

>
> -----------------
>
> References
>
> Named Containers
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0581.html
>
> Named Graphs and TriX
> http://www.w3.org/2004/03/trix/
>
> Named Graphs, Provenance and Trust
> Carroll, Jeremy J.; Bizer, Christian; Hayes, Patrick; Stickler, Patrick
> HPL-2004-57, 20040513
> http://hpl.hp.com/techreports/2004/HPL-2004-57.html

see also

  Reaching out onto the Web
  http://www.w3.org/2000/10/swap/doc/Reach

from the SWAP tutorial
http://www.w3.org/2000/10/swap/doc/

> ...
--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
Received on Tuesday, 9 November 2004 14:28:35 UTC