- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Sun, 14 Mar 2010 12:49:43 +0000
- To: Axel Polleres <axel.polleres@deri.org>
- CC: SPARQL Working Group <public-rdf-dawg@w3.org>
> Bottom-line: > + BNODE(String) per-result-set semantics, > + BNODE() per-row semantics (or []) > + subSELECTs with FROM clauses allowed > seem to address all my use case and also Claudio's use case, > (although nested CONSTRUCTs seem easier to write at times...) Agreed, assuming BNODE() generates a fresh bnode per call. i.e. SELECT (BNODE() AS ?B1) (BNODE() AS ?B2) ... and ?B1 != ?B2 A few matters with the examples noted below but I don't think they invalidate the argument. Also note we have a proposal for parameters to aggregators, as well as group arguments, and using "[" and "]" seems sensible because that's in expressions and so obviously aren't related to bnodes. Andy On 12/03/2010 8:13 PM, Axel Polleres wrote: > Hi, > > the mail exchange with Andy brought me back to that I wanted to inquire about the uses for nested CONSTRUCTs > before we close this issue. So, this will a bit lengthy... whoever is afraid to get bored, just look at the > end of the mail where you find a summary ;-) > > There are two basic uses I stumbled over, and which also came out in the discussions which I had on this > issue with Claudio, Renzo (the group from Chile). > > 1) Modularity > 2) Blank nodes per-result-set > > The recent mails with Andy were mainly on 2) . Let me elaborate both use cases again, how they'd be > done with nested CONSTRUCTs and what would be alternatives in our current design. > > > 1) Modularity: > ============== > > Example1: > --------- > merge information from different graphs using different vocabularies > > Given three graphs which have information of documents and their authors > > G1: :Doc ex1:author "John Doe", "Sean Dough". > :Doc2 ex1:author "John Doe", "Joan Doe". > > ... > > G2: :A1 foaf:made :D1; :D2; :D3. > :A1 foaf:name "Joan Doe" > :A2 foaf:made ;D3. > :A2 foaf:name "Sean Dough" > ... > > G3: :doc3 dc:creator [foaf:name "Jane D."] > > > which we want to aggregate using CONSTRUCT to dc:creator - foaf:knows. > > > OPTION1: that could look as follows using nested CONSTRUCTs > > CONSTRUCT WHERE { ?Doc dc:creator ?Auth . ?Auth foaf:name ?Name } > FROM > { CONSTRUCT {?Doc dc:creator [ foaf:name ?Name ] } FROM<G1> > WHERE { ?Doc ex:author ?Name } > FROM > { CONSTRUCT {?Doc dc:creator ?Auth . ?Auth foaf:name ?Name } FROM<G2> > WHERE { ?Auth foaf:made ?Doc; foaf:name ?Name } > FROM > { CONSTRUCT WHERE {?Doc dc:creator ?Auth. ?Auth foaf:name ?Name } FROM<G3> } > > > OPTION2: Using a "normal" CONSTRUCT, this could still be done, but would be potentially harder/more expensive to evaluate, > since one would need UNION queries, the evaluation of which might be more difficult to modularise: > > CONSTRUCT { ?Doc dc:creator ?Auth . ?Auth foaf:name ?Name } > FROM<G1> > FROM<G2> > FROM<G3> > WHERE { > { ?Doc ex:author ?Name } This clause of the UNION does nothing in the CONSTRUCT because ?Auth is not bound. The construct equivalent had [] to introduce the bnode. > UNION > { ?Auth foaf:made ?Doc; foaf:name ?Name } > UNION > { ?Doc dc:creator ?Auth. ?Auth foaf:name ?Name } > } > > OPTION3: A version using named graphs might be better suited: > > CONSTRUCT { ?Doc dc:creator ?Auth . ?Auth foaf:name ?Name } > FROM NAMED<G1> > FROM NAMED<G2> > FROM NAMED<G3> > WHERE { > { GRAPH<G1> {?Doc ex:author ?Name } } > UNION > { GRAPH<G2> {?Auth foaf:made ?Doc; foaf:name ?Name } } > UNION > { GRAPH<G3> { ?Doc dc:creator ?Auth. ?Auth foaf:name ?Name } } > } > > OPTION4: Subselects (need blank node construction and FROM in sub-SELECTs !!!!) seem work nicely here... > > CONSTRUCT { ?Doc dc:creator ?Auth . ?Auth foaf:name ?Name } > WHERE { > { SELECT ?Doc, [] as ?Auth, ?Name FROM<G1> > WHERE { ?Doc ex:author ?Name } } Can I say "assignment" here? :-) > UNION > { SELECT ?Doc ?Auth ?Name FROM<G2> > { ?Auth foaf:made ?Doc; foaf:name ?Name } } > UNION > { SELECT ?Doc ?Auth ?Name FROM<G3> > WHERE { ?Doc dc:creator ?Auth. ?Auth foaf:name ?Name } } > } > > Example 2: > ---------- > (from Claudio): This is a bit theoretical, but should make the point, this is about deeply nested > CONSTRUCTs. > > Claudio Gutierrez wrote: >> The query is: >> "Give me the neighbors at distance N of the resource xo. (That is, the >> nodes for which thre is a path (x0, _, x1) (x1,-,x2)....(xk, -, x_{N+1}) >> >> Here is the version with CONSTRUCT: >> >> Let Q0: >> >> (CONSTRUCT (x_0 ?y ?z) >> FROM G >> WHERE (x0 ?y ?z) >> >> and let Q_{k+1}: >> >> (CONSTRUCT (?x ?y ?z) >> FROM Q_k >> FROM NAMED 1 G >> WHERE (?a ?b ?x) AND (1 GRAPH (?x ?y ? z)) >> >> The desired query is Q_N. >> >> (Note: Without using the nested CONSTRUCT I am pretty sure there is *no* >> way of outputing a table of exponential size. As example consider the >> graph G: >> >> G = { (a0, p1, b0), (a0, p2, b0), ....., (aN, p1, bN), (aN, p2, bN),} >> >> and the neighborhood of size N of a0. >> >> BUT, with CONSTRUCT the query is linear in N and the size of G. >> >> best, >> >> claudio > > > > This query though is probably also writeable by SUBSELECTs? I think this: > > SELECT ?Xn > { ... > { SELECT DISTINCT ?X2 > WHERE { { SELECT DISTINCT ?X1 > WHERE {x0 ?Y1 ?X1} } > ?X1 ?Y1 ?X2 } } > ... } > {?X_{n-1} ?Y_{n-1} Xn} > } > > would do the job. I am not 100% sure whether that works... opinions? Maybe property paths are a better approach. And like Steve I don't see the connection to CONSTRUCT. > > 2) Blank nodes per result set. > ============================== > > This general problem occurs when you want for instance to mint blank nodes for certain > literal values, which you want to preserve. > > Example 3: > ---------- > > Take the graph G1 from above, where you want to output foaf:knows relations between joint authors of documents, > Since you don't have identifiers for the persons, but only names, you want to use blank nodes, but you want to > preserve the knowledge that in G1 author names are unique, that is, you want co-reference between those blank nodes. > > The naive version > > CONSTRUCT { _:a foaf:knows _:b . _:a foaf:name ?n1 . _:b foaf:name ?n2 . } FROM<G1> WHERE { ?D ex1:author ?n1, ?n2 . FILTER ( ?n1 != ?n2 ) } > > does not preserve blank-node correlation, whereas this version does: > > CONSTRUCT { ?a knows ?b . ?a foaf:name ?aname . ?b foaf:name ?bname . } > FROM { CONSTRUCT { _:auth foaf:name ?n . ?p aux:hasAuthor _:auth . } > FROM<G1> WHERE { ?p ex1:author ?n . } } > WHERE { ?p aux:hasAuthor ?a . ?a foaf:name ?aName . > ?p aux:hasAuthor ?b . ?b foaf:name ?bName . > FILTER ( ?a != ?b ) } > > The inner CONSTRUCT mints a unique Bnode per name, which is then used in the outer CONSTRUCT. > > Let's now see how that would look like with SUBSELECT . > > OPTION 1: SELECT [] > > CONSTRUCT { ?Auth1 knows ?Auth2 . ?a foaf:name ?aname . ?b foaf:name ?bname . } > FROM<G1> > WHERE > { { SELECT DISTINCT [] AS ?Auth1 , ?N1 WHERE { ?p dc:creator ?N1 . } } DISTINCT does nothing because a new term is introduced in each row so each row is unique. > { SELECT DISTINCT [] AS ?Auth2 , ?N2 WHERE { ?p dc:creator ?N2 . } } > ?P ex1:author ?N1,? N2 > FILTER( ?N1 = ?N2 ) ?? FILTER( ?N1 != ?N2 ) from just looking at FILTER ( ?a != ?b ) > } > > I wouldn't know how to do this with only one subSELECT. Not very appealing. > > OPTION 2: BNODE() > > CONSTRUCT { ?Auth1 knows ?Auth2 . ?a foaf:name ?N1 . ?b foaf:name ?N2 . } > FROM<G1> > WHERE > { { SELECT DISTINCT BNODE() AS ?Auth1 , ?N1 WHERE { ?p dc:creator ?N1 . } } > { SELECT DISTINCT BNODE() AS ?Auth2 , ?N2 WHERE { ?p dc:creator ?N2 . } } > ?P ex1:author ?N1,? N2 > FILTER( ?N1 = ?N2 ) > } > > Same as Option2, but with a function instead of [], as proposed by Andy. > > OPTION 3: with per-result-set BNODE("string") this would be doable even without a SUBSELECT: > > CONSTRUCT { ?Auth1 knows ?Auth2 . ?Auth1 foaf:name ?N1 . ?Auth2 foaf:name ?N2 . } > WHERE > { SELECT BNODE(?N1) AS ?Auth1 , BNODE(?N2) AS ?Auth2, ?N1, ?N2 > WHERE { ex1:author ?N1,? N2 FILTER( ?N1 = ?N2 ) } > } > > This actually speaks a lot for the per-result-set BNODE() function .... > > > Bottom-line: > + BNODE(String) per-result-set semantics, > + BNODE() per-row semantics (or []) that is unique on each call? SELECT (BNODE() AS ?B1) (BNODE() AS ?B2) ... and ?B1 != ?B2 > + subSELECTs with FROM clauses allowed > seem to address all my use case and also Claudio's use case, > (although nested CONSTRUCTs seem easier to write at times...) > > if someone could confirm my subselect rewriting sketch for Example 2, > could we go with that? > > best, > Axel
Received on Sunday, 14 March 2010 12:50:42 UTC