Re: Objections to current SPARQL specification

* Andrew Newman <andrewfnewman@gmail.com> [2007-11-02 07:11+1000]
> 
> Here is a summary of the objections that I've had with the SPARQL
> specification over the years.  Some of my objections are no longer
> relevant and some are more important given the direction the
> specification went.  I also have previously struggled with OPTIONAL
> but am unable to provide a better solution even though I tried to
> develop one using full outer join rather than left outer join.
> 
> I've attempted to get my previous views added to the formal list of
> objections but haven't had much success.

We have certainly been trying to characterize exactly where a language
that satisfies you diverges from what the DAWG has produced. Here is the
text I'm proposing as a summary to the members:

[[
In objection 8, Andrew Newman objects to the early direction of
SPARQL, preferring an RDF query language based on manipulations of
graphs. Such a language would not produce sets of variables, but
instead produce only RDF graphs.

The first document published by the Working Group was the RDF Data
Access Use Cases and Requirements [UCR]. Requirement 3.2, Variable
Binding Results [VBR] requires the extraction of terms from the RDF
graph into a set of solution tuples. No yet-conceived form of a
language meeting the commenter's design goals meets this requirement.

[UCR] http://www.w3.org/TR/2005/WD-rdf-dawg-uc-20050325/
[VBR] http://www.w3.org/TR/2005/WD-rdf-dawg-uc-20050325/#r3.2
]]
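
To make the distinction in that summary concrete, here is a minimal
sketch of the two result shapes. It is not taken from either
specification: the toy graph, the prefixed names, and the helpers
match(), select() and subgraph() are all hypothetical, and triples are
modelled as plain string tuples. select() extracts terms into solution
tuples in the spirit of Requirement 3.2, while subgraph() only ever
returns triples, as a graph-only language would.

```python
# Hypothetical toy data: an RDF graph modelled as a set of (s, p, o) strings.
GRAPH = {
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:name", '"Bob"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
}


def match(pattern, graph):
    """Yield one {variable: term} binding per triple matching the pattern."""
    for triple in graph:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A repeated variable must bind to the same term each time.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            yield binding


def select(pattern, graph):
    """Variable-binding result: a list of solutions, sorted for stable output."""
    return sorted(match(pattern, graph), key=lambda b: sorted(b.items()))


def subgraph(pattern, graph):
    """Graph-only result: the matching triples, which are again a graph."""
    return {t for t in graph if next(match(pattern, {t}), None) is not None}
```

With the pattern ("?s", "foaf:name", "?n"), select() returns two
bindings of ?s and ?n, whereas subgraph() returns the two foaf:name
triples themselves.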

> The basis of my objection is founded on SPARQL being an RDF query
> language and that it should use an RDF data model throughout.  This is
> one property that represents what is considered good design for query
> languages (for RDF query languages see [1] but it has been covered
> elsewhere in criticism of other query languages such as SQL).
> 
> One feature that SPARQL lacks is closure.  Having closure on all
> operations means that intermediate results and answers are always tied
> to an RDF graph.  It means that in each step of the query evaluation
> you are dealing with valid subsets of RDF graphs.  The current
> specification, however, reverts to SQL-style multisets of variable
> bindings, which are not compatible with the RDF model.
> 
> To summarize, my objections include [2][3][4][5]:
> * Lack of closure.
> * Inconsistencies between SPARQL triples and the currently defined RDF
> standard (requiring special handling of, say, CONSTRUCT when there are
> literals as subjects).  If SPARQL were defined in terms of RDF, then
> SPARQL would naturally change whenever RDF changed.  The way the
> specification was created seems to allow a difference between the
> language and the data it's querying.
> * The use of multiset semantics instead of semantics consistent with
> RDF (set based semantics).
> * The multiple uses of unbound - you cannot distinguish between an
> unbound result from an OPTIONAL operation and one from a variable that
> is not used in the query.  Without retaining the original query, it is
> impossible to tell what unbound means.
> * Existence of DISTINCT and REDUCED (set based semantics don't have duplicates).
> * Existence of CONSTRUCT (should just be a projection of all columns).
> * Existence of ASK (should just be a projection of no columns giving
> DEE or DUM, T/F).
> * Lack of n-adic operators (JOIN, UNION and possibly OPTIONAL).
> * Lack of SUMMARIZE (set based aggregate function).
> 
> [1] J. Bailey et al., "Web and Semantic Web Query Languages: A Survey,"
> LNCS 3564, 2005, Norbert Eisinger and Jan Maluszynski (eds.).
> [2] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Nov/0006.html
> [3] http://jrdf.sourceforge.net/thesis/2006/RelationalBasedSPARQL.html
> [4] http://www.xml.com/lpt/a/1695
> [5] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2004Nov/0001.html
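
For readers unfamiliar with the relational terms in the list above,
here is a minimal sketch of projection under set and multiset
semantics, and of ASK read as a projection onto no columns. It is only
an illustration under stated assumptions: the rows, the helper names
project_set() and project_bag(), and the encoding of a solution as a
tuple of (variable, value) pairs are all hypothetical and appear in
neither the commenter's proposal nor the SPARQL specification. DEE and
DUM are the usual relational names for the zero-column relation with
one tuple and with no tuples respectively.

```python
from collections import Counter

# Hypothetical solutions, each a tuple of (variable, value) pairs.
ROWS = [
    (("?s", "ex:alice"), ("?n", '"Alice"')),
    (("?s", "ex:bob"), ("?n", '"Alice"')),
]


def project_set(rows, variables):
    """Set semantics: duplicates collapse, so DISTINCT is never needed."""
    return {tuple(pair for pair in row if pair[0] in variables) for row in rows}


def project_bag(rows, variables):
    """Multiset semantics: each solution is counted, so duplicates survive."""
    return Counter(tuple(pair for pair in row if pair[0] in variables) for row in rows)


DEE = {()}   # no columns, one (empty) tuple: reads as "true"
DUM = set()  # no columns, no tuples: reads as "false"
```

Projecting ROWS onto ?n gives one tuple under set semantics and a count
of two under multiset semantics. Projecting onto no variables at all
gives DEE when there is at least one solution and DUM when there is
none, which is the sense in which ASK could be a projection of no
columns.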

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Friday, 2 November 2007 02:06:37 UTC