- From: Chimezie Ogbuji <ogbujic@ccf.org>
- Date: Tue, 01 Sep 2009 11:42:20 -0400
- To: "public-sparql-dev@w3.org" <public-sparql-dev@w3.org>
Hello.

I have a particular class of problems that I'm unable to formulate as a SPARQL query (using the current standard). I was hoping somebody might have a solution and, if not, perhaps this is something I can bring up for the WG. If people think it is worth forwarding there, I can do so. I just wasn't sure whether that was the right forum initially.

Basically, the problem is one where I want to find "all resources that are not a member of a specific class". This seems to be especially problematic for RDF graphs where each resource has *multiple* rdf:type statements. Consider the graph:

    _:b rdf:type D .
    _:a rdf:type C .
    _:a rdf:type D .
    _:a rdf:type B .

We want to find "all resources that are not a member of the C class," where the *not* is meant in the closed-world sense (i.e., there is no statement of the form ?RESOURCE a C in the graph). Otherwise, disjointness axioms (i.e., disjointWith, differentFrom, etc.) can be used to entail membership in the complement of C.

Suppose you try the MINUS (difference of graph patterns) template introduced in [1], which is needed because the simple form doesn't allow you to use a vanilla OPT/FILTER/!BOUND query:

    { ?RESOURCE a ?KIND } MINUS { ?RESOURCE a C }

You would get the following SPARQL query:

    SELECT ?RESOURCE ?KIND
    {
      ?RESOURCE a ?KIND
      OPTIONAL {
        ?RESOURCE a C .
        ?RESOURCE2 a C
        FILTER(?RESOURCE2 = ?RESOURCE)
      }
      FILTER(!BOUND(?RESOURCE2))
    }

However, you would get the following solution set:

    ?RESOURCE  ?KIND
    _:b        D
    _:a        D
    _:a        B

This (incorrectly) includes _:a, which is a member of C. Perhaps the source of the problem is that most of the relevant algebraic operations (needed to express the constraint) apply only to individual solutions one at a time, rather than to the multiset of solutions as a whole.
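To pin down the intended answer independently of any SPARQL engine, the closed-world reading of the example can be sketched as a plain set difference over the triples. This is only an illustration of the expected result, not a SPARQL evaluation: the tuple encoding of the graph and the helper name `not_members_of` are assumptions made up for this sketch.

```python
# The example graph as a set of (subject, predicate, object) tuples.
# This encoding and the names below are hypothetical, for illustration only.
RDF_TYPE = "rdf:type"

graph = {
    ("_:b", RDF_TYPE, "D"),
    ("_:a", RDF_TYPE, "C"),
    ("_:a", RDF_TYPE, "D"),
    ("_:a", RDF_TYPE, "B"),
}


def not_members_of(graph, cls):
    """Return sorted (resource, kind) pairs for every typed resource that
    has no `resource rdf:type cls` statement in the graph (closed world)."""
    # First pass: every resource explicitly stated to be a member of cls.
    members = {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}
    # Second pass: keep type statements only for resources outside that set.
    return sorted(
        (s, o) for (s, p, o) in graph if p == RDF_TYPE and s not in members
    )


print(not_members_of(graph, "C"))  # prints [('_:b', 'D')]
```

The point of the two passes is that the membership check looks at the whole graph before any individual row is kept or dropped, which is the behaviour the query is meant to express.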
This makes me wonder whether SPARQL is indeed as expressive as Datalog with default negation (minimal, stable, stratified, or otherwise), since my understanding is that the following query against the equivalent Datalog program should give the correct answer:

    QUERY: not rdf_type(?RESOURCE, C)

It will simply check for the absence of any literal that 'matches' in the model.

Am I missing a simple solution to this class of problems? And if not, will this problem carry over to the next generation of the query language?

Thanks

[1] Angles, R. and Gutierrez, C., "The Expressive Power of SPARQL." Springer

----------------------
Chimezie (chee-meh) Thomas-Ogbuji (oh-bu-gee)
Heart and Vascular Institute (Clinical Investigations)
Cleveland Clinic (ogbujic@ccf.org)
Ph.D. Student
Case Western Reserve University (chimezie.thomas-ogbuji@case.edu)
Received on Tuesday, 1 September 2009 15:47:24 UTC