Re: ACTION: counting test case [Was: Re: Agenda request: characterize the diffs between subgraph-matching and E-entailment] from Fred Zemke on 2006-11-27 (public-rdf-dawg@w3.org from October to December 2006)

From: Fred Zemke <fred.zemke@oracle.com>
Date: Mon, 27 Nov 2006 10:30:51 -0800
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <456B2EDB.9030209@oracle.com>

Eric Prud'hommeaux wrote:

<heavy edits>

>>>
>>>Some test cases to characterize the behavoir of the language
>>>apparently not captured in the current semantics:
>>>
>>>  bnode-type-var [CNT]: can we count duplicate results?
>>>
>>>  bNode-constraint [BCN]: are bNode labels allowed in FILTERs?
>>>
>>>  bNode-join [BJN]: do bNode lables bridge basic graph patterns?
>>>      
>>>
At this point I cannot decipher who wrote the preceding sentence, but it
is an issue that I have raised.  I believe that people will naturally 
write simple
queries and then edit them into more complex ones.  A partciularly natural
evolution will be to test a join first, and then break it up with an 
OPTIONAL.
This may cause a bnode token to appear in both operands of the OPTIONAL.
Currently it seems that the scope of a bnode token is a basic graph pattern,
so that means that introducing the OPTIONAL will break the join.  This will
seem counterintuitive to users.  They can of course be educated to 
always change
bnode tokens to variables before introducing OPTIONAL, but it will 
frequently
trip users up, and may be an ongoing complaint.

Therefore I hope it is possible to make the scope of a bnode token as
large as possible.  My thinking is that it would not make sense to try
to join on a bnode token across different graphs.  Therefore every
GRAPH pattern must introduce a new lexical scope, similar to the way
block structured languages operate.  bnode tokens are local to the
nearest containing GRAPH pattern, or the outermost pattern if none,
whereas variables are global to the whole query.

I have worked on formulating this precisely, but it looks very difficult
and my work is not complete.  I originally thought that we could analyze
query trees into 'paths' (subtrees on which a join is formed); however,
this technique foundered on the case of an OPTIONAL with a UNION
in its second operand.  I believe it is possible but it has eluded me so 
far.
My vision is that every pattern P implies a predicate Pred(P) on mappings,
such that the results of a query on pattern P is {mapping S | Pred(P)(S) }
where the bnode tokens have been pulled to the front of the Pred(P)
as existentially quantified variables.  Pred would be defined recursively,
but the case of UNION inside the second operand of OPTIONAL has
eluded me.

Fred

Received on Monday, 27 November 2006 20:39:59 UTC