Re: A neat, but impractical, solution from Bijan Parsia on 2006-08-07 (public-rdf-dawg@w3.org from July to September 2006)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Mon, 7 Aug 2006 21:22:42 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <2A27A3F7-D4E1-465A-8698-896E5BEAB1EC@cs.man.ac.uk>
On Aug 7, 2006, at 7:51 PM, Pat Hayes wrote:

> (I may not be following your thinking, and I think it's because I'm  
> not sure of the exact meaning you are assuming for this  
> 'distinguished' terminology. Definitions??)
>
>> On Aug 7, 2006, at 5:40 AM, Pat Hayes wrote:
>>
>>>> It occurs to me that one way to manage the distinguished/ 
>>>> semidistinguished/nondistinguished mechanism a bit more neatly  
>>>> would be to dispense with the distinction between BNodes and  
>>>> query variables (except for spelling and syntactic restrictions  
>>>> on placements) in queries, and force the range of the variables  
>>>> to depend on the query form. That is, SELECT forces  
>>>> distinguished variables, and ASK the rest.
>>>
>>> So there are no bnodes in SELECT patterns, if I follow you (?)  
>>> That seems like a big handicap.
>>
>> You could have them or not. You could let them be nondistinguished  
>> or not.

I'm proposing to slightly change the meaning of ASK. So to avoid  
confusion, let me just propose two query forms:

SELECT:  variables in the head are distinguished (thus, SELECT is  
not a projection).
SELECT*: variables in the head are semi-distinguished.

Both SELECT and SELECT* can have *NO* variables in the head, which  
makes them like an ASK query, except for the difference in how the  
variables are interpreted.

(By the head of a SELECT, I mean the list of variables that determine  
what variables get into the answer set.)

Orthogonal issue: BNodes in the query.

Option 1: Dispense with them. They aren't needed and they are confusing.
Option 2: Keep them, but always interpret them as non-distinguished.
Option 3: Keep them, but treat them as distinguished or not as the  
form indicates, and let them appear in the head.


> ?? If we dispense with the distinction between bnodes and  
> variables, then there is only one category, let's call them  
> variables. Now, if SELECT forces these variables to be  
> distinguished, then they aren't bnodes, right? Because bnodes can't  
> possibly be distinguished (can they?)

In option three, basically BNodes become strangely spelt query  
variables.

> So taking this together, there aren't any bnodes in a SELECT.

By dispense with the distinction, I meant option 3.

> What am I not getting?
>
>> This is the not neat part. Either you let them be distinguished,  
>> which is a bit weird, or you no longer have the nice divide between  
>> ASK and SELECT.
>
> My understanding of this difference was basically that ASK doesn't  
> (need to) return any answer bindings, which I thought was motivated  
> largely to keep down internet traffic when asking simple questions  
> against large KBs.

Sorry, I was introducing a new divide. The normal ASK behavior is  
recovered by having an empty head.

>>>> You could list BNodes in the head just like other query  
>>>> variables, or dispense with them altogether, or allow them to  
>>>> have their present form, to wit, being dedicatedly  
>>>> non-distinguished.
>>>
>>> Actually I don't think that is their current role in SPARQL, if I  
>>> understand what you mean by non-distinguished.
>>
>> Hmm, I thought that they are non-distinguished. They can never  
>> appear in the head of the query
>
> Maybe I'm not following you (again), but what in SPARQL are you  
> assuming is analogous to the head? I've been assuming it was the  
> query pattern. But this can contain bnodes.

No, it's the list of variables reported back. The query pattern is  
the body.

>> (the pure datalog sense) and they can, in RDF terms, be bound to  
>> arbitrary entities.
>
> To terms in the KB. Why does that make them non-distinguished?

They can be bound to individuals not in the active domain. I know  
that in private discussion we talk about how BNodes in the graph can  
be skolemized, i.e., thought of as in the active domain for some  
purposes, but in general I give them an existential reading. This is  
why I want the tri-partite distinction. This means that we are  
talking about the variables in the same way whether in RDF or OWL.

>> Hmm. The reason we treat them as non-distinguished in Pellet is  
>> because I remember Enrico telling me that they were :) Can I  
>> derive this from the framework?
>>
>> Ah yes, I can. The bit is that only query variables are restricted  
>> by the scoping set.
>
> Are we talking RDF or OWL? The general definitions let the scoping  
> set be arbitrary, so this isn't in itself a restriction of any  
> kind. (It becomes one when you specify the scoping set for a  
> particular entailment regime, but I'm not sure which one you have in  
> mind here.)

With SELECT, the scoping set will be just the URIs and Literals. With  
SELECT* you add BNodes.
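
To make the scoping-set difference concrete, here is a rough sketch in
Python. Everything in it is an assumption made up for illustration: the
(kind, value) encoding of terms, the function names, and the
allow_bnodes flag standing in for the SELECT / SELECT* choice. It only
models the restriction on what head variables may be bound to; it says
nothing about the existential reading of non-distinguished variables.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a term is (kind, value), kind in {"uri", "literal", "bnode"}.
Term = Tuple[str, str]
# A solution maps variable names to the terms they are bound to.
Solution = Dict[str, Term]


def in_scoping_set(term: Term, allow_bnodes: bool) -> bool:
    """URIs and literals are always in scope; BNodes only if allowed."""
    kind, _ = term
    if kind in ("uri", "literal"):
        return True
    return allow_bnodes and kind == "bnode"


def answers(solutions: List[Solution], head: List[str],
            allow_bnodes: bool) -> List[Solution]:
    """Keep solutions whose head variables are all bound inside the
    scoping set, and report only the head variables.

    allow_bnodes=False models the proposed SELECT (distinguished),
    allow_bnodes=True models the proposed SELECT* (semi-distinguished).
    """
    result = []
    for solution in solutions:
        if all(in_scoping_set(solution[v], allow_bnodes) for v in head):
            result.append({v: solution[v] for v in head})
    return result
```

Given one solution binding ?x to a URI and another binding ?x to a
BNode, answers(..., ["x"], False) keeps only the first, while
answers(..., ["x"], True) keeps both.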

>> So, I do think no matter how you see nondistinguished variables,  
>> that they are always nondistinguished.
>>
>>>> Unfortunately, while rather neat, it's not very practical, as  
>>>> people are used to using SELECT as their query form (a la SQL)  
>>>> and, especially in the RDF case, likely to want semi- 
>>>> distinguished variables by default.
>>>
>>> Right. I would strongly oppose restricting variable bindings in  
>>> RDF SPARQL: there is no computational need to do so,
>>
>> Well, there are issues for when you want to supply non-redundant  
>> answers, as I've pointed out before.
>
> Of course, but I would prefer to relax the requirement of non- 
> redundancy

I'm not requiring non-redundancy (as I've said before), I just require  
that it be possible to formulate a query where you get non-redundant  
answers. With DISTINCT.
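
As a small illustration of why the scoping set matters here, the
following Python sketch does purely syntactic duplicate removal, which
is roughly what one would expect DISTINCT to do over rows of bindings.
The term encoding and the function name are hypothetical, and whether
two rows differing only in BNode label really count as redundant
depends on the semantics in play; the sketch only shows that such rows
survive syntactic deduplication.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a term is (kind, value), kind in {"uri", "literal", "bnode"}.
Term = Tuple[str, str]
Solution = Dict[str, Term]


def distinct(rows: List[Solution]) -> List[Solution]:
    """Drop rows that are syntactically identical to an earlier row,
    preserving the order of first occurrences."""
    seen = set()
    out = []
    for row in rows:
        # A hashable, order-independent key for the row's bindings.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out
```

With only URIs and literals in the rows, identical answers collapse to
one. Rows bound to differently labelled BNodes are kept as separate
answers, even if they carry no extra information.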

All this is moot since I conceded the impracticality from the start.

Cheers,
Bijan.
Received on Monday, 7 August 2006 20:21:58 UTC