Does the expressive power of SPARQL include all forms of default negation?

Hello.

I have a particular class of problems that I'm unable to formulate as a
SPARQL query (using the current standard).  I was hoping somebody might have
a solution and if not, perhaps this is something I can bring up for the WG.
If people think it is worth forwarding there, I can do so - I just wasn't
sure if that was the right forum initially.

Basically, the problem is one where I want to find "all resources that are
not a member of a specific class".  This seems to be especially problematic
for RDF graphs where each resource has *multiple* rdf:type statements.
Consider the graph:

_:b rdf:type D
_:a rdf:type C
_:a rdf:type D
_:a rdf:type B

And we want to find "all resources that are not a member of the C class,"
where the *not* in this case is meant in the closed world sense (i.e.,
basically there is no statement of the form:  ?RESOURCE a C in the graph).
Otherwise, disjointness axioms can be used (i.e., disjointWith,
differentFrom, etc.) to entail membership with the complement of C.

If you were to try using the MINUS ( Difference of graph patterns ) template
introduced in [1] (in order to account for the fact that the simple form
doesn't allow you to use a vanilla OPT/FILTER/!BOUND query):

{ ?RESOURCE a ?KIND } MINUS { ?RESOURCE a C }

You would get the following SPARQL query

SELECT ?RESOURCE ?KIND
{
  ?RESOURCE a ?KIND
  OPTIONAL { 
    ?RESOURCE a C
    ?RESOURCE2 a C 
    FILTER(?RESOURCE2 = ?RESOURCE)
  }
  FILTER(!BOUND(?RESOURCE2))
} 

However you would get the following solution set:

?RESOURCE ?KIND
_:b       D
_:a       D
_:a       B

Which (incorrectly) includes _:a (it is a member of C).  It seems, perhaps
the source of the problem is that most of the relevant algebraic operations
(needed to express the constraint) only apply to individual solution sets at
a time (rather than a multiset of solutions as a whole).

This makes me wonder if indeed SPARQL is as expressive as Datalog with
default negation (minimal, stable, stratified, or otherwise), since my
understanding is that the following query against the equivalent Datalog
program should give the correct answer.

QUERY: not rdf_type(?RESOURCE C)

Since it will simply check for the absence of any literal that 'matches' in
the model. 

Am I missing a simple solution to this class of problems? And if not, will
this problem carry over to the next generation of the query language?

Thanks 

[1] Angles, R. and Gutierrez, C., "The Expressive Power of SPARQL." Springer

----------------------
Chimezie (chee-meh) Thomas-Ogbuji (oh-bu-gee)
Heart and Vascular Institute (Clinical Investigations)
Cleveland Clinic (ogbujic@ccf.org)
Ph.D. Student Case Western Reserve University
(chimezie.thomas-ogbuji@case.edu)


===================================

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use
only by the individual or entity to which it is addressed
and may contain information that is privileged,
confidential, and exempt from disclosure under applicable
law.  If the reader of this message is not the intended
recipient or the employee or agent responsible for
delivering the message to the intended recipient, you are
hereby notified that any dissemination, distribution or
copying of this communication is strictly prohibited.  If
you have received this communication in error,  please
contact the sender immediately and destroy the material in
its entirety, whether electronic or hard copy.  Thank you.

Received on Tuesday, 1 September 2009 15:47:24 UTC