Re: comments on SPARQL Query Language for RDF

Hello Lee,

On Jun 4, 2007, at 1300, Lee Feigenbaum wrote:

>
> Bob MacGregor wrote on 05/31/2007 07:52:26 PM:
>
>> With respect to quads, I agree with you.  But it took this discourse
>> with Richard, Pat, and Jeen to convince me
>> that SPARQL has come about as far as triples will allow it.
>
>> I still think that SPARQL ought to have a declarative semantics, and
>> I still think that UNBOUND should not
>> be integral to the language, for the reasons described earlier.
>
> Hi Bob,
>
> I wanted to address these remaining two issues on behalf of the Working
> Group.
>
> To make the trail easier to follow, your original comments message is:
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2007May/0026.html
>
> I'll quote a bit from that message but please refer back to it for more
> context.
>
> 1. Model-theoretic declarative semantics
>
> """
> The current SPARQL semantics apparently derives from an algebraic
> specification that says what you get when you run a SPARQL query, rather
> than what the answer actually "means".  The SPARQL spec is procedural
> rather than declarative...
> """
>
> In the DAWG's most recently published Last Call working draft, the group
> adopted (as you say) an algebra to specify the semantics of SPARQL query
> patterns and query modifiers. This part of the specification is based on
> the work within The Semantics and Complexity of SPARQL by Perez, Arenas,
> and Gutierrez ( http://arxiv.org/abs/cs/0605124 ).
>
> It's unclear to me if you are unhappy with the presentation of SPARQL's
> semantics or with the semantics itself. If you are not satisfied with the
> presentation of the semantics, the editors would welcome references to
> specific parts of the specification where the text is unclear. We will
> continue to make editorial improvements to the documents during CR.
>
> If you have concerns about the semantics, it would greatly help the
> Working Group if you could share an example of a query, data, and results
> that illustrate your preferred semantics.

My argument is against the choice of an algebraic semantics instead of a
declarative semantics.  Unless I am mistaken, OWL has a declarative
semantics, and I would assume that SWRL and RuleML each have, or will have,
a declarative semantics.  Suppose X would like to implement rules from one
of these languages using SPARQL to evaluate the rule bodies.  If the
semantics of SPARQL aligns with the rule language, or perhaps with a subset
of it, then X can comfortably use SPARQL for this task.  However, comparing
a declarative (rule) semantics with an algebraic (SPARQL) semantics is an
apples-and-oranges comparison.  To be sure that SPARQL properly implements
the rules, X would have to produce the declarative semantics on her own.

A declarative semantics forms a bedrock on which to build a logic
pyramid.  An algebraic semantics is essentially a dead-end.

>
> 2. Bound (actually the third point in your original message)
>
> """
> UNBOUND is strictly less expressive than UNSAID (or whatever you may call
> the negation-as-failure) operator.  In Seamark, we implement a
> closed-world version of universal quantification using a double negation
> (e.g., there does not exist value that does not have type X).  This
> construct cannot be emulated using UNBOUND...
> """
>
> Perhaps I don't understand your scenario. This query:
>
> PREFIX : <http://example.org/>
> SELECT DISTINCT ?s
> FROM <http://thefigtrees.net/lee/sw/data/bound.n3> {
>  ?s a ?type .
>  OPTIONAL {
>    ?s2 a :x .
>    FILTER (?s = ?s2)
>  }
>  FILTER (!bound(?s2))
> }
>
> will find all the resources in a dataset that do not have type X. (For
> example, you can run this query at http://sparql.org/sparql.html against
> the dataset at http://thefigtrees.net/lee/sw/data/bound.n3 .)

In our language, this would be

select distinct ?s
from ...
where (?s rdf:type owl:Thing) and
             UNSAID (?s rdf:type X)

The byzantine syntax that you have employed looks like a SPARQL
educator's full-employment act.  Perhaps there is a simpler SPARQL
equivalent?  Either way, it appears that the army of future
SPARQL users is going to find it relatively difficult to say things
that are, for example, easy to state in SQL.
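
To make the comparison concrete, here is a minimal Python sketch (not
Seamark or SPARQL code) of the two readings on a toy triple set.  The
data, the plain-string names, and both helper functions are assumptions
made up for illustration.  The sketch treats any typed subject as a
candidate, as the quoted SPARQL query does, rather than relying on
owl:Thing.

```python
# Toy data: (subject, predicate, object) triples. All names are hypothetical.
TRIPLES = {
    ("a", "type", "X"),
    ("b", "type", "Y"),
    ("c", "type", "X"),
    ("c", "type", "Y"),
}


def typed_subjects(triples):
    """Every ?s that has at least one type, i.e. matches (?s type ?any)."""
    return {s for (s, p, _o) in triples if p == "type"}


def via_unsaid(triples):
    """Negation as failure: keep ?s when no (?s type X) triple is present."""
    return {s for s in typed_subjects(triples)
            if (s, "type", "X") not in triples}


def via_optional_unbound(triples):
    """OPTIONAL + !bound: left-join ?s with ?s2, then keep unmatched rows."""
    rows = []
    for s in typed_subjects(triples):
        matches = [s2 for (s2, p, o) in triples
                   if p == "type" and o == "X" and s2 == s]
        if matches:
            rows.extend((s, s2) for s2 in matches)
        else:
            # The left join keeps the row, with ?s2 left unbound (None).
            rows.append((s, None))
    # FILTER (!bound(?s2)) keeps only the rows where the join found nothing.
    return {s for (s, s2) in rows if s2 is None}


print(sorted(via_unsaid(TRIPLES)))
print(sorted(via_optional_unbound(TRIPLES)))
```

On this data both functions return the same set; the difference under
discussion is how directly each form states the intent, not the answer
for this single-negation case.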

> If you're interested in finding out that there are no such resources,
> you can use an ASK query and negate the results in your application.
> With the ASK query form, this query is asking "are there any values that
> do not (in the RDF dataset, under a CWA) have type X?".
>
> Could you outline a use case (or give an example query in your language)
> in which this technique is not sufficient?

An example is "retrieve all persons all of whose children are married".
We can accomplish this in our language as

select ?p
where (?p rdf:type Person) and
             UNSAID ( (?p hasChild ?child) and
                                UNSAID (?child rdf:type MarriedPerson))

The double negation isn't exactly pretty, but given that all variables are
existentially quantified, and the query is simulating a closed world
universal quantifier, this isn't too bad.
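
As a rough illustration of how the double negation behaves as a
closed-world "for all", the following Python sketch evaluates the same
condition over a toy triple set.  It is not Seamark code; the triples,
the plain-string names, and the function name are all made up for the
example.

```python
# Toy data: (subject, predicate, object) triples. All names are hypothetical.
TRIPLES = {
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("carol", "type", "Person"),
    ("alice", "hasChild", "dan"),
    ("alice", "hasChild", "eve"),
    ("bob", "hasChild", "fay"),
    ("dan", "type", "MarriedPerson"),
    ("eve", "type", "MarriedPerson"),
}


def persons_all_children_married(triples):
    """Persons for whom no child lacks a (child type MarriedPerson) triple."""
    result = set()
    for (p, pred, obj) in triples:
        if pred != "type" or obj != "Person":
            continue
        # Inner UNSAID: a child counts as a witness only when the
        # MarriedPerson triple is absent (closed world, absence means false).
        unmarried_children = [
            child
            for (s, q, child) in triples
            if s == p and q == "hasChild"
            and (child, "type", "MarriedPerson") not in triples
        ]
        # Outer UNSAID: keep p only when no such witness exists.
        if not unmarried_children:
            result.add(p)
    return result


print(sorted(persons_all_children_married(TRIPLES)))
```

Note that a person with no children at all (carol here) satisfies the
condition vacuously, which is the usual behavior of a universal
quantifier written as "there does not exist a counterexample".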

So the challenge is to express the same thing in SPARQL, in a single
query.  If it can't be done, then this is a demonstration that UNBOUND
is not a substitute for negation as failure.

>
> As a related note, you could assert that ?s was not (universally, CWA)
> bound within the framework of a larger query by using a COUNT aggregate
> function. Aggregate functions are an issue that the DAWG has postponed
> (see http://www.w3.org/2001/sw/DataAccess/issues#countAggregate ) for
> future work.
>
> In any case, the BOUND operator has been part of the SPARQL specification
> for quite some time, and is included in many implementations and regularly
> used by SPARQL users. I would need to see significant new information at
> this point in our schedule to warrant bringing this issue back to the WG.
>

I wasn't recommending eliminating UNBOUND from the language; I was
recommending relegating it to secondary status within the language, i.e.,
making it a computed predicate and not according it a reserved word.  It's
easily the most egregious hack in the language.


>
> thanks,
> Lee
>
>

Cheers, Bob

Bob MacGregor
Chief Scientist
Siderean Software, Inc.
310.647.5690
bmacgregor@siderean.com

Received on Tuesday, 5 June 2007 01:49:57 UTC