Re: "Unbound" in SPARQL (was Re: [TF-LIB] COALESCE is an unhelpful choice of name) from Steve Harris on 2009-11-18 (public-rdf-dawg@w3.org from October to December 2009)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 18 Nov 2009 10:13:36 +0000
To: Andy Seaborne <andy.seaborne@talis.com>
Cc: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
Message-Id: <08BBA5DD-0BD3-4190-90DC-988C20B77F42@garlik.com>

On 17 Nov 2009, at 13:47, Andy Seaborne wrote:
> On 17/11/2009 12:28, Steve Harris wrote:
>> On 17 Nov 2009, at 12:03, Andy Seaborne wrote:
>>> On 17/11/2009 11:44, Steve Harris wrote:
>>>>
>>>> - Steve, hoping sparql:noValue wasn't supposed to be a URI :)
>>>
>>> It can be a URI :) The other alternative is a literal that is in a
>>> value space disjoint from all others.
>>
>> What about
>>
>> SELECT ?x
>> WHERE {
>>   ?x :p :o .
>>   OPTIONAL {
>>     ?x :doesnotexist ?y .
>>   }
>>   FILTER(?y = sparql:noValue)
>> }
>>
>> vs.
>>
>> SELECT * WHERE {
>>   { SELECT (MAX(?y) AS ?max)
>>     WHERE {
>>       ?x :p :o .
>>       OPTIONAL {
>>         ?x :doesnotexist ?y .
>>       }
>>     } GROUP BY ?x }
>>   FILTER(?max = sparql:noValue)
>> }
>
>
> They are different. (And independent of URI vs literal)
>
> I'm also not against MAX(empty) being an error. This seems the
> simpler design and the most easily communicated.

Yeah, I'm not quite sure that an error is correct, but I can see how  
you could convince yourself that it was right.

> You can use TRY/COALESCE to turn errors into some other value (also  
> needed if you have an unwritable value).

Yep.

> Short forms could be provided, if we think it's convenient, for
> application-specific defaults: MAX(?x, emptyValue)

Well, that's what COALESCE would do by SQL semantics.
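
A minimal sketch of the SQL behaviour referred to here, using Python's built-in sqlite3 module as a stand-in SQL engine. The table `t`, the column `y`, the helper `max_or_default` and the default of -1 are made up for illustration. In SQL, MAX over zero rows yields NULL rather than an error, and COALESCE substitutes the application's default:

```python
import sqlite3

def max_or_default(rows, default):
    """Return (MAX(y), COALESCE(MAX(y), default)) for a throwaway table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (y INTEGER)")  # hypothetical table
    con.executemany("INSERT INTO t (y) VALUES (?)", [(r,) for r in rows])
    raw, defaulted = con.execute(
        "SELECT MAX(y), COALESCE(MAX(y), ?) FROM t", (default,)
    ).fetchone()
    con.close()
    return raw, defaulted

# MAX over no rows is NULL (None in Python); COALESCE supplies the default.
empty_result = max_or_default([], -1)
# With rows present, COALESCE leaves the aggregate untouched.
nonempty_result = max_or_default([3, 7], -1)
print(empty_result)
print(nonempty_result)
```

This only shows the SQL side; whether SPARQL's MAX over an empty group should be an error or an unbound-like value is the open question in this thread.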

> (This use of TRY/COALESCE is what you were proposing for handling  
> xsd casts)

Yes.

> I don't have a problem with the fact these two are different. In SQL
> NULL does not join with NULL unless you provide an explicit join
> condition. Once NULLs appear in SQL, the SQL to cope with natural
> usages of two OPTIONALs, one after the other, is not natural and it's
> right we did not impose the same burden on application writers.

I don't follow that sentence.
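
For reference, the SQL behaviour in question (NULL not joining with NULL, and the WHERE x = NULL confusion) can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the tables `a` and `b` and the column `k` are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (k INTEGER);
    CREATE TABLE b (k INTEGER);
    INSERT INTO a VALUES (1), (NULL);
    INSERT INTO b VALUES (1), (NULL);
""")

# Plain equality join: the two NULL rows do not match each other.
joined = con.execute(
    "SELECT a.k, b.k FROM a JOIN b ON a.k = b.k"
).fetchall()

# "WHERE k = NULL": the comparison is never true, so no rows come back.
eq_null = con.execute("SELECT k FROM a WHERE k = NULL").fetchall()

# IS NULL is the explicit test that does find the row.
is_null = con.execute("SELECT k FROM a WHERE k IS NULL").fetchall()

# Pairing NULL with NULL needs an explicit extra join condition.
explicit = con.execute(
    "SELECT a.k, b.k FROM a JOIN b "
    "ON a.k = b.k OR (a.k IS NULL AND b.k IS NULL)"
).fetchall()
con.close()

print(joined, eq_null, is_null, explicit)
```

The last query is the kind of extra condition an application writer has to add by hand in SQL once NULLs are involved.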

>> It potentially has the opposite confusion problem to SQL's infamous
>> WHERE x = NULL.
>
> The constraints here are:
> 1/ XSD expression evaluation
> 2/ The simpler use cases of two OPTIONALs that drove the design of  
> SPARQL 1.0.
>
> We could do a superset of XSD, mod the restriction of SPARQL 1.0
> compatibility, but I think we'd get in trouble from the different
> sources of NULL (the old "there are X types of SQL NULL and they are
> different" problem).

Really? I think there's only one, logically speaking. Isn't that one of
Date's criticisms? IIRC he thinks there should have been two distinct
"NULL"s.

> Having complexity in the expressions, for a pattern matching  
> language, is the right balance to consider.
>
>> It might be easier to make it some value that can't be written down;
>> then users can't distinguish it from the "unbound" value.
>>
>> - Steve
>
> How is it serialized into XML results?

Same as unbound values currently.

> (Keeping it internal breaks federated query)
>
> If you want an ideal solution, it probably means that every instance  
> is distinct and unique.  Then undef != undef.

Yes, which is the source of my concern around giving it a URI.

Anyway, this has got a bit philosophical. Verbiage, algebra and  
testcases should distil all this into some requirements.

- Steve

-- 
Steve Harris, CTO, Garlik Limited
2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Wednesday, 18 November 2009 10:14:05 UTC