- From: Fred Zemke <fred.zemke@oracle.com>
- Date: Tue, 28 Nov 2006 09:49:29 -0800
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Since no one else seems concerned about query evolution involving _:a
bnodes, and people have presented user feedback from testbeds that
indicates it is not an issue, I withdraw my suggestion that the scope
of a bnode should be larger than a basic graph pattern.

Fred

Seaborne, Andy wrote:

>
>
> Pat Hayes wrote:
>
>>> Eric Prud'hommeaux wrote:
>>>
>>> <heavy edits>
>>>
>>>>>> Some test cases to characterize the behavior of the language
>>>>>> apparently not captured in the current semantics:
>>>>>>
>>>>>> bnode-type-var [CNT]: can we count duplicate results?
>>>>>
>
> Open.
>
>>>>>>
>>>>>> bNode-constraint [BCN]: are bNode labels allowed in FILTERs?
>>>>>
>
> No (they are not syntactically valid at the moment and it might lead
> to trouble with optimizers moving filters around unless CNT, in which
> case bnodes behave like any other variable).
>
> Note also that bNode labels in FILTERs fail the substitution intuition:
>
>   FILTER(_:a < 3)
>
> and
>
>   ?x = _:a
>   FILTER(?x < 3)
>
> are rather different.
>
>>>>>>
>>>>>> bNode-join [BJN]: do bNode labels bridge basic graph patterns?
>>>>>>
>>>>>
>>> At this point I cannot decipher who wrote the preceding sentence,
>>> but it is an issue that I have raised. I believe that people will
>>> naturally write simple queries and then edit them into more complex
>>> ones. A particularly natural evolution will be to test a join first,
>>> and then break it up with an OPTIONAL. This may cause a bnode token
>>> to appear in both operands of the OPTIONAL. Currently it seems that
>>> the scope of a bnode token is a basic graph pattern, so that means
>>> that introducing the OPTIONAL will break the join. This will seem
>>> counterintuitive to users. They can of course be educated to always
>>> change bnode tokens to variables before introducing OPTIONAL, but it
>>> will frequently trip users up, and may be an ongoing complaint.
>>
>>
>> You might be right, but (1) this is all somewhat hypothetical, as we
>> really don't yet know what users will in fact do, and (2) it's more a
>> matter of initial expectations rather than an ongoing problem, since
>> users will in fact get used to the rules when they use them. On the
>> other hand...
>
>
> I have never seen this occur with support questions on jena-dev and
> there is a steady stream of questions coming these days. This is
> because (my educated guess here, based on what I've seen) the main use
> of bNodes in queries is with [], not with _:a labels. If people are
> going to use _:style labels, they use named variables as they have to
> name things anyway.
>
> Use of [] can produce some concise query expressions but they also
> have the characteristic that it is unnatural to later split them up.
>
>>> Therefore I hope it is possible to make the scope of a bnode token as
>>> large as possible. My thinking is that it would not make sense to try
>>> to join on a bnode token across different graphs. Therefore every
>>> GRAPH pattern must introduce a new lexical scope, similar to the way
>>> block structured languages operate. bnode tokens are local to the
>>> nearest containing GRAPH pattern, or the outermost pattern if none,
>>> whereas variables are global to the whole query.
>>
>
> In checking the algebra work this week, I produced a quad-based query
> compiler, showing that GRAPH is just a way of specifying the fourth
> slot in a quad, with the query starting off with the 4th slot being
> the default graph. This matches the expectations of multi-graph stores
> so it was good to check that it worked out nicely (care with custom
> functions in FILTERs needed).
>
>>> I have worked on formulating this precisely, but it looks very
>>> difficult and my work is not complete.
>>> I originally thought that we could analyze query trees into 'paths'
>>> (subtrees on which a join is formed); however, this technique
>>> foundered on the case of an OPTIONAL with a UNION in its second
>>> operand. I believe it is possible but it has eluded me so far.
>>> My vision is that every pattern P implies a predicate Pred(P) on
>>> mappings, such that the result of a query on pattern P is
>>> {mapping S | Pred(P)(S)}, where the bnode tokens have been pulled to
>>> the front of Pred(P) as existentially quantified variables. Pred
>>> would be defined recursively, but the case of UNION inside the
>>> second operand of OPTIONAL has eluded me.
>>
>>
>>
>> ... this all strongly suggests to me that we should not try to be
>> this clever at this stage. The chances of our being able to get
>> something this complicated exactly right are low, and if the result
>> has to be robust enough to survive more general entailment schemes
>> then they are even lower. I suggest that we strive to keep things as
>> simple as we possibly can.
>>
>> Pat
>
>
> We have bnodes in BGPs as an extension point, for example being able
> to dispatch a BGP to a DL-reasoner. The systems that provide RDF
> access to existing SQL data also use this point so it seems to hit
> some kind of sweet spot.
>
> Extending the scope of bnode labels across OPTIONAL/UNION has not come
> up as an application need.
>
> Fred - Your case of writing queries, with bnodes in the _:a form, does
> not match my experience. Use of _:a, as opposed to [], by application
> writers appears quite rare in practice.
>
> As BGPs have proven a natural extension point, it suggests to me that
> we have it right to scope bNodes to BGPs and not handle them in the
> algebra.
>
> Andy
>
>>
>>
>>
>>> Fred
>>
>>
>>
>
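
The scoping rule discussed in this thread (a bnode label is local to one
basic graph pattern) can be sketched in a few lines of Python. This is a
toy illustration under stated assumptions, not the working group's
algebra: the helper names (match_triple, eval_bgp, left_join), the
convention that "?" marks variables and "_:" marks bnode labels, and the
sample data are all hypothetical. Each basic graph pattern is matched on
its own, bnode labels behave as variables that are projected out of that
pattern's solutions, and OPTIONAL is modeled as a left join on shared
variables.

```python
# Toy data: plain strings stand in for IRIs and literals (an assumption).
DATA = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "dave"),
    ("bob", "name", "Bob"),
    ("dave", "name", "Dave"),
]

def match_triple(pattern, triple, binding):
    """Return `binding` extended so `pattern` matches `triple`, else None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def eval_bgp(bgp, data):
    """Match one basic graph pattern; bnode labels are local to it."""
    # Treat "_:a" as a hidden variable "?_:a" while matching this BGP.
    patterns = [tuple("?" + t if t.startswith("_:") else t for t in tp)
                for tp in bgp]
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for sol in solutions:
            for triple in data:
                ext = match_triple(pattern, triple, sol)
                if ext is not None:
                    extended.append(ext)
        solutions = extended
    # Project the hidden variables out: they never leave the BGP.
    return [{v: x for v, x in sol.items() if not v.startswith("?_:")}
            for sol in solutions]

def left_join(left, right):
    """Model OPTIONAL: join on shared variables, keep unmatched left rows."""
    out = []
    for l in left:
        merged = [{**l, **r} for r in right
                  if all(r[k] == v for k, v in l.items() if k in r)]
        out.extend(merged or [l])
    return out

# One BGP: the label _:a joins the two triple patterns.
single_bgp = eval_bgp([("?x", "knows", "_:a"), ("_:a", "name", "?n")], DATA)

# Split with OPTIONAL, using a variable: the join survives.
with_var = left_join(eval_bgp([("?x", "knows", "?a")], DATA),
                     eval_bgp([("?a", "name", "?n")], DATA))

# Split with OPTIONAL, keeping the label: each _:a is a separate bnode.
with_label = left_join(eval_bgp([("?x", "knows", "_:a")], DATA),
                       eval_bgp([("_:a", "name", "?n")], DATA))
```

In this sketch, single_bgp and with_var each pair alice with Bob and
carol with Dave (2 rows). with_label has 4 rows, because the two
operands share no variable and the left join degenerates to a cross
product. That is the "broken join" Fred describes, and rewriting _:a as
a named variable before introducing OPTIONAL avoids it.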
Received on Tuesday, 28 November 2006 17:50:22 UTC