RE: Refining Optionals from Geoff Chappell on 2005-07-06 (public-rdf-dawg@w3.org from July to September 2005)

From: Geoff Chappell <geoff@sover.net>
Date: Wed, 6 Jul 2005 08:35:40 -0400
To: <andy.seaborne@hp.com>
Cc: "'RDF Data Access Working Group'" <public-rdf-dawg@w3.org>
Message-ID: <008a01c58227$3866e510$6501a8c0@gsclaptop>

> -----Original Message-----
> From: Seaborne, Andy [mailto:andy.seaborne@hp.com]
> Sent: Tuesday, July 05, 2005 5:00 AM
> To: Geoff Chappell
> Cc: 'RDF Data Access Working Group'
> Subject: Re: Refining Optionals
> 
[...] 
> 
> 
> FILTERs could be made to attach to a graph pattern but that doesn't really
> chnage anything because that pattern could be group.  FILTERs could be
> attached
> to basic patterns but that would be a chnage in the possible answers (e.g.
> FILTER on a UNION where all branches define the variable tested).
> 

I think something like this might make work.... come up with a set of
rewrite rules to transform the query into something like disjunctive normal
form, but that treats basic patterns with constraints as atomic (call it
disjunctive graph normal form or ...) -  i.e. something like this:

P1 P2 = P2 P1
P1 UNION P2 = P2 UNION P1
OPTIONAL(A) = A UNION NOT(A)
(P1 UNION P2)P3 = P1 P3 UNION P2 P3
NOT(P1 UNION P2) = NOT(P1) NOT(P2)
NOT(P1 NOT(P2)) = NOT(P1) UNION P2

The last rule is really the difference since it separates the positive
factors from the negative without breaking apart the positives (i.e. it
keeps graphs together rather than treating them as a conjunction of triple
patterns and filters). 

The result hopefully is that you don't end up with negated filters in
isolation, but they're instead kept together with the graphs they're
intended to constrain. E.g:

	A OPTIONAL (B FILTER C)

->
	A (B FILTER C UNION NOT(B FILTER C))
->
	A B FILTER C UNION A NOT(B FILTER C)

As opposed to:

	A B FILTER C UNION A NOT(B) UNION A NOT(FILTER C)

which is a case that introduces the unwanted solutions since NOT (FILTER C)
will always succeed if B is needed to bind the vars used in FILTER C (as in
the OPTIONAL-FILTER test case).

So the rules then for determining valid solutions would be (as opposed to a
prescription for how to come up with them):

1. transform into DGNF using rewrite rules
2. each disjunct has a solution set; for each disjunct, any var that is not
mentioned in a positive pattern is unbound for solutions of that disjunct

I haven't tried it in code, but is seems to work on paper :-) 

See any obvious holes in it?

-Geoff

Received on Wednesday, 6 July 2005 12:37:54 UTC