RE: Disjunction vs. Optional ... and UNION

-----Original Message-----
From: Bob MacGregor [mailto:bmacgregor@siderean.com] 
Sent: Friday, March 25, 2005 12:30 PM
To: Geoff Chappell; andy.seaborne@hp.com
Cc: public-rdf-dawg-comments@w3.org
Subject: Re: Disjunction vs. Optional ... and UNION

> Geoff and Andy,
> 
> I have to recant somewhat. 

Been there :-)


Here's my current understanding:

- sparql UNION could just as well be called OR (I imagine choice of what to
call it is somewhat a question of whether the language is formally described
with e.g. relational algebra or relational/predicate calculus or..., and
somewhat a matter of taste). 

- sparql OPTIONAL (x) means: x or not x, where not is negation as failure
(or I guess set difference if you prefer). I don't believe there really is
an issue with the semantics when we stop thinking about it as implementers.
If we consider that the vars in a query range over the domain of all
RDF_Terms in the graph (plus I suppose those in the query), and that the
conditions in the query serve to restrict those values, then the semantics
of the query are fixed. From this perspective, NULL values represent
variables that aren't range-restricted by the query (whether or not a var is
range restricted can be determined by converting the query to SOP form - any
variable mentioned in a non-negated pattern element is range-restricted --
at least for that conjunctive leg). 

- sparql is a mixed model language - by that I mean that constraints are
functional, everything else logical (lp-style) (that's why there are two
ANDs, and two ORs). To the lp side of things, a FILTER clause looks like a
single logical relation - e.g.:
	FILTER ?a > 10 && ?b < 12
Can be thought of as:
	filter("?a > 10 && ?b < 12")
where filter will be true if the functional evaluation of the constraints
have an ebv of xs:true, false otherwise. 

Given that:

> (1) Is either the "dumb" version or the "smart" one semantically-valid,
> or is the whole idea untenable?

Both are semantically-valid (x or true vs. x or not x), but the smart one
gives the results we want. 

> (2) If the answer to (1) is "yes", can we produce a semantics
> that conforms to the smart version?

Yes.

> (3) Do there need to be any restrictions on scoping of
> variables to make things work (i.e., as a by-product of
> finding a solution to issue (1))?

I imagine there's a syntactic solution (as opposed to a semantic one), but I
don't believe it's necessary.

> (4) Does the semantics for OPTIONAL (assuming there is one)
> require that we redefine AND and OR (as SPARQL does now),
> or are those issues decouplable? 

I don't think sparql really redefines either (redefines them from what
btw?). It does have two slightly different notions of each. 

> (5) Do we need a special value representing 'unbound'?
> If so, does one value suffice, or do we need many 
> (possibly, the bnode-as-unbound question derives from this one)?

Yes, we need a null value and I believe that NULL != NULL.


> A good starting point would be to investigate whether a
> simple version of OPTIONAL, say one with no nesting,
> no constraints, and only conjunction, has a semantic basis. 

Nesting, disjunctions, and constraints aren't really an issue once the
semantics of OPTIONAL are understood. An algorithm for evaluating a query
according to those semantic might be:

Convert query to SOP form.
Order conjuncts in each product so that: positive triple pattern < filter <
negated triple pattern or filter.
Evaluate each product left to right and union the results.

For example, consider the evil query we've discussed in the past:

select ?x where {?x ?p ?a.  OPTIONAL {?x ?y ?z}. OPTIONAL {?a ?b ?y}}

becomes pretty straight forward when looked at like this:

SELECT ?x ?y ?z WHERE 
(
 	({?a ?b ?y} and {?x ?p ?a} and {?x ?y ?z}) 
		or 
	({?a ?b ?y} and {?x ?p ?a} and not {?x ?y ?z})
		or 
	({?x ?p ?a} and {?x ?y ?z} and not {?a ?b ?y}) 
		or 
	({?x ?p ?a} and not {?a ?b ?y} and not {?x ?y ?z})
)



-Geoff

Received on Saturday, 26 March 2005 15:05:39 UTC