- From: Fred Zemke <fred.zemke@oracle.com>
- Date: Tue, 28 Nov 2006 09:49:29 -0800
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Since no one else seems concerned about query evolution involving
_:a bnodes, and people have presented user feedback from testbeds
that indicates it is not an issue, I withdraw my suggestion that the
scope of a bnode should be larger than a basic graph pattern.
Fred
Seaborne, Andy wrote:
>
>
> Pat Hayes wrote:
>
>>> Eric Prud'hommeaux wrote:
>>>
>>> <heavy edits>
>>>
>>>>>> Some test cases to characterize the behavior of the language
>>>>>> apparently not captured in the current semantics:
>>>>>>
>>>>>> bnode-type-var [CNT]: can we count duplicate results?
>>>>>
>
> Open.
>
>>>>>>
>>>>>> bNode-constraint [BCN]: are bNode labels allowed in FILTERs?
>>>>>
>
> No (they are not syntactically valid at the moment, and allowing them
> might lead to trouble with optimizers moving filters around, unless CNT,
> in which case bnodes behave like any other variable).
>
> Note also that bNode labels in FILTERs fail the substitution intuition:
>
> FILTER(_:a < 3)
>
> and
>
> ?x = _:a
> FILTER(?x < 3)
>
> are rather different.
>
>>>>>>
>>>>>> bNode-join [BJN]: do bNode labels bridge basic graph patterns?
>>>>>>
>>>>>
>>> At this point I cannot decipher who wrote the preceding sentence,
>>> but it
>>> is an issue that I have raised. I believe that people will
>>> naturally write simple
>>> queries and then edit them into more complex ones. A particularly
>>> natural
>>> evolution will be to test a join first, and then break it up with an
>>> OPTIONAL.
>>> This may cause a bnode token to appear in both operands of the
>>> OPTIONAL.
>>> Currently it seems that the scope of a bnode token is a basic graph
>>> pattern,
>>> so that means that introducing the OPTIONAL will break the join.
>>> This will
>>> seem counterintuitive to users. They can of course be educated to
>>> always change
>>> bnode tokens to variables before introducing OPTIONAL, but it will
>>> frequently
>>> trip users up, and may be an ongoing complaint.
>>
>>
>> You might be right, but (1) this is all somewhat hypothetical, as we
>> really don't yet know what users will in fact do, and (2) it's more a
>> matter of initial expectations rather than an on-going problem, since
>> users will in fact get used to the rules when they use them. On the
>> other hand...
>
>
> I have never seen this occur with support questions on jena-dev and
> there is a steady stream of questions coming these days. This is
> because (my educated guess here, based on what I've seen) the main use
> of bNodes in queries is with [], not with _:a labels. If people are
> going to use _:a-style labels, they use named variables as they have to
> name things anyway.
>
> Use of [] can produce some concise query expressions but they also
> have the characteristic that it is unnatural to later split them up.
>
>>> Therefore I hope it is possible to make the scope of a bnode token as
>>> large as possible. My thinking is that it would not make sense to try
>>> to join on a bnode token across different graphs. Therefore every
>>> GRAPH pattern must introduce a new lexical scope, similar to the way
>>> block structured languages operate. bnode tokens are local to the
>>> nearest containing GRAPH pattern, or the outermost pattern if none,
>>> whereas variables are global to the whole query.
>>
>
> In checking the algebra work this week, I produced a quad-based query
> compiler, showing that GRAPH is just a way of specifying the fourth
> slot in a quad, with the query starting off with the 4th slot being
> the default graph. This matches the expectations of multi-graph stores
> so it was good to check that it worked out nicely (care with custom
> functions in FILTERs needed).
>
>>> I have worked on formulating this precisely, but it looks very
>>> difficult
>>> and my work is not complete. I originally thought that we could analyze
>>> query trees into 'paths' (subtrees on which a join is formed); however,
>>> this technique foundered on the case of an OPTIONAL with a UNION
>>> in its second operand. I believe it is possible but it has eluded
>>> me so far.
>>> My vision is that every pattern P implies a predicate Pred(P) on
>>> mappings,
>>> such that the result of a query on pattern P is {mapping S |
>>> Pred(P)(S) }
>>> where the bnode tokens have been pulled to the front of the Pred(P)
>>> as existentially quantified variables. Pred would be defined
>>> recursively,
>>> but the case of UNION inside the second operand of OPTIONAL has
>>> eluded me.
>>
>>
>>
>> ... this all strongly suggests to me that we should not try to be
>> this clever at this stage. The chances of our being able to get
>> something this complicated exactly right are low, and if the result
>> has to be robust enough to survive more general entailment schemes
>> then they are even lower. I suggest that we strive to keep things as
>> simple as we possibly can.
>>
>> Pat
>
>
> We have bnodes in BGPs as an extension point, for example being able
> to dispatch a BGP to a DL-reasoner. The systems that provide RDF
> access to existing SQL data also use this point so it seems to hit
> some kind of sweet spot.
>
> Extending the scope of bnode labels across OPTIONAL/UNION has not come
> up as an application need.
>
> Fred - Your case of writing queries, with bnodes in the _:a form, does
> not meet my experience. Use of _:a, as opposed to [], by application
> writers appears quite rare in practice.
>
> As BGPs have proven a natural extension point, it suggests to me that
> we have it right to scope bNodes to BGPs and not handle them in the
> algebra.
>
> Andy
>
>>
>>
>>
>>> Fred
>>
>>
>>
>
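
To make the scoping question in this thread concrete, here is a minimal
Python sketch of the "join first, then break it up with an OPTIONAL"
evolution. Everything in it is an assumption made for illustration: the toy
data, the tuple encoding of triple patterns, and the helper names unify,
match_bgp and optional are hypothetical and do not come from any
implementation mentioned on the list. It only models the rule that a _:a
label is scoped to a single basic graph pattern.

```python
from itertools import count

# Hypothetical toy graph: two people, only alice has an mbox.
GRAPH = {
    ("alice", "name", "Alice"),
    ("alice", "mbox", "alice@example.org"),
    ("bob", "name", "Bob"),
}

_scope_ids = count()  # one fresh id per basic graph pattern (BGP)


def unify(pattern, triple, solution):
    """Extend solution so that pattern matches triple, or return None."""
    extended = dict(solution)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against the existing binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def match_bgp(patterns, binding=None):
    """Match one BGP; bnode labels such as _:a are local to this call."""
    scope = next(_scope_ids)

    def rename(term):
        # A bnode label becomes a hidden variable unique to this BGP.
        return f"?_b{scope}_{term[2:]}" if term.startswith("_:") else term

    solutions = [dict(binding or {})]
    for pattern in patterns:
        renamed = tuple(rename(term) for term in pattern)
        solutions = [
            extended
            for solution in solutions
            for triple in GRAPH
            if (extended := unify(renamed, triple, solution)) is not None
        ]
    # Hidden bnode variables never escape the BGP.
    return [
        {k: v for k, v in s.items() if not k.startswith("?_b")}
        for s in solutions
    ]


def optional(left_solutions, right_patterns):
    """Left-join: keep each left solution, extended by the right BGP if possible."""
    result = []
    for solution in left_solutions:
        result.extend(match_bgp(right_patterns, solution) or [solution])
    return result


# 1. One BGP: both uses of _:a denote the same node, so this is a join.
join = match_bgp([("_:a", "name", "?name"), ("_:a", "mbox", "?mbox")])

# 2. Split with OPTIONAL: two BGPs, so the two _:a labels are unrelated.
split = optional(match_bgp([("_:a", "name", "?name")]),
                 [("_:a", "mbox", "?mbox")])

# 3. Replace the label with a variable, which is global to the query.
fixed = optional(match_bgp([("?p", "name", "?name")]),
                 [("?p", "mbox", "?mbox")])
```

Under these assumptions, join returns only Alice; split pairs Bob with
Alice's mbox, because the join on _:a is lost; and fixed returns Alice with
her mbox and Bob with none. That is the behaviour the quoted concern
describes, and changing the bnode label to a variable before introducing
OPTIONAL is the workaround already mentioned above.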
Received on Tuesday, 28 November 2006 17:50:22 UTC