- From: Andrew Newman <andrewfnewman@gmail.com>
- Date: Fri, 2 Nov 2007 07:11:02 +1000
- To: public-rdf-dawg-comments@w3.org
Here is a summary of the objections that I've had with the SPARQL specification over the years. Some of my objections are no longer relevant and some are more important given the direction the specification went. I have also previously struggled with OPTIONAL but am unable to provide a better solution, even though I tried to develop one using full outer join rather than left outer join.

I've attempted to get my previous views added to the formal list of objections but haven't had much success.

The basis of my objection is that SPARQL is an RDF query language and should use an RDF data model throughout. This is one property of what is considered good design for query languages (for RDF query languages see [1], but it has been covered elsewhere in criticism of other query languages such as SQL).

One feature that SPARQL lacks is closure. Having closure on all operations means that intermediate results and answers are always tied to an RDF graph. It means that in each step of the query evaluation you are dealing with valid subsets of RDF graphs. The current specification, however, reverts to an SQL-style model of multisets of variable bindings that is not compatible with the RDF model.

To summarize, my objections include [2][3][4][5]:

* Lack of closure.
* Inconsistencies between SPARQL triples and the currently defined RDF standard (requiring special handling of, say, CONSTRUCT when there are literals as subjects). If SPARQL were defined in terms of RDF, then SPARQL would naturally change whenever RDF changed. The way the specification was created seems to allow a difference between the language and the data it's querying.
* The use of multiset semantics instead of semantics consistent with RDF (set based semantics).
* The multiple uses of unbound: you cannot distinguish between a result from an OPTIONAL operation and a variable that is not used in the query. Without retaining the original query, it is impossible to tell what unbound means.
* Existence of DISTINCT and REDUCED (set based semantics don't have duplicates).
* Existence of CONSTRUCT (should just be a projection of all columns).
* Existence of ASK (should just be a projection of no columns giving DEE or DUM, T/F).
* Lack of n-adic operators (JOIN, UNION and possibly OPTIONAL).
* Lack of SUMMARIZE (set based aggregate function).

[1] J. Bailey et al., "Web and Semantic Web Query Languages: A Survey," LNCS 3564, 2005, Norbert Eisinger, Jan Maluszynski (editors).
[2] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Nov/0006.html
[3] http://jrdf.sourceforge.net/thesis/2006/RelationalBasedSPARQL.html
[4] http://www.xml.com/lpt/a/1695
[5] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2004Nov/0001.html
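
As a rough illustration of the set-based points in the list of objections (duplicates under multiset semantics, and ASK as a projection onto no columns), here is a minimal Python sketch. It is not taken from the SPARQL specification or from any implementation; the names `project`, `project_bag`, `ask`, and the sample `solutions` data are all hypothetical, and rows are modelled simply as dicts from variable name to value.

```python
# Hypothetical sketch: set-based vs multiset projection over query solutions.

def project(rows, columns):
    """Set-based projection: keep only `columns`; duplicate rows collapse."""
    return {tuple((c, row[c]) for c in columns) for row in rows}

def project_bag(rows, columns):
    """Multiset projection: one output row per input row, duplicates kept."""
    return [tuple((c, row[c]) for c in columns) for row in rows]

# The two relations with no columns, as named in the message above.
DEE = {()}    # no columns, one empty row: reads as "true"
DUM = set()   # no columns, no rows: reads as "false"

def ask(rows):
    """ASK modelled as a projection onto no columns: DEE or DUM."""
    return project(rows, [])

# Assumed sample data: two solutions that agree on ?s and ?p.
solutions = [
    {"s": "ex:a", "p": "ex:knows", "o": "ex:b"},
    {"s": "ex:a", "p": "ex:knows", "o": "ex:c"},
]

as_set = project(solutions, ["s", "p"])      # one row, no DISTINCT needed
as_bag = project_bag(solutions, ["s", "p"])  # two identical rows
```

With set semantics the projection onto `s` and `p` yields a single row, so nothing like DISTINCT is required; with multiset semantics the same projection yields two identical rows. `ask(solutions)` equals `DEE`, and `ask([])` equals `DUM`.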
Received on Thursday, 1 November 2007 21:11:12 UTC