Re: Objections to current SPARQL specification from Andrew Newman on 2007-11-02 (public-rdf-dawg-comments@w3.org from November 2007)

From: Andrew Newman <andrewfnewman@gmail.com>
Date: Fri, 2 Nov 2007 10:39:06 +1000
To: "Pat Hayes" <phayes@ihmc.us>
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <2db5a5c40711011739l6a910268i3d861ecec990268a@mail.gmail.com>
On 11/2/07, Pat Hayes <phayes@ihmc.us> wrote:
> >One feature that SPARQL lacks is closure.  Having closure on all
> >operations means that intermediate results and answers are always tied
> >to an RDF graph.  It means that in each step of the query evaluation
> >you are dealing with valid subsets of RDF graphs.
>
> Can you expand on this notion? As I don't quite understand it as
> stated here. Do you mean that the result of a query should itself be
> an RDF graph? If so, how does one keep track of the bindings to query
> variables? For example if my query is (artificial example)
> ?x rdf:type ?y
> then given a graph, that is, a set of triples, how does one know
> which URIs in it are bound to ?x and which to ?y ?
>

I'm saying that not only should the result of a SPARQL query be a
valid RDF graph (a subgraph of the graph(s) you queried) but that all
operations should be defined in terms of RDF.  So the input of a
SPARQL operation is RDF and it produces RDF.

I think there are probably many ways you can track these things.  The
one I developed was by creating a relational model of RDF where you
have attributes, types, tuples and relations, and re-using relational
operations (which are closely aligned with SPARQL operations).  But
I'm sure you could create a graph model and graph operations.  See:
http://jrdf.sourceforge.net/thesis/2006/RelationalBasedSPARQL.html#Mapping
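
To make the relational idea a bit more concrete, here is a minimal sketch
in Python.  It is not JRDF's API or the exact model from the thesis; the
function names (relation_for, natural_join), the attribute names and the
sample triples are all made up for illustration.  It assumes an attribute
is a (name, type) pair such as ("?x", "subject"), a tuple is a set of
(attribute, value) pairs, and a relation is a set of tuples.

```python
# Hypothetical sketch: attributes are (name, type) pairs, tuples are
# frozensets of (attribute, value) pairs, relations are sets of tuples.

def relation_for(graph, s_name, p_name, o_name, predicate):
    """Build a relation from the triples in `graph` that use `predicate`."""
    return {
        frozenset({
            ((s_name, "subject"), s),
            ((p_name, "predicate"), p),
            ((o_name, "object"), o),
        })
        for (s, p, o) in graph
        if p == predicate
    }

def natural_join(left, right):
    """Combine tuples that agree on every attribute the two relations share."""
    joined = set()
    for l in left:
        for r in right:
            l_map, r_map = dict(l), dict(r)
            shared = l_map.keys() & r_map.keys()
            if all(l_map[a] == r_map[a] for a in shared):
                joined.add(frozenset({**l_map, **r_map}.items()))
    return joined

# Made-up sample graph: a set of (subject, predicate, object) triples.
graph = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:bob", "rdf:type", "ex:Person"),
}

# Two triple patterns sharing the attribute ("?x", "subject").
types = relation_for(graph, "?x", "p1", "?y", "rdf:type")
names = relation_for(graph, "?x", "p2", "?n", "ex:name")
people_with_names = natural_join(types, names)
```

Every tuple in the joined relation still carries its subject, predicate
and object values, so each step can be traced back to triples in the
graph that was queried.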

> >  The current
> >specification, however, reverts to an SQL/multiset/bindings to
> >variables that is not compatible with the RDF model.
>
> In what sense is it not compatible? The connection seems clear: the
> binding is a mapping from a query to an RDF graph. What objection do
> you have to this formulation?
>

Bags aren't sets.  In RDF's data model there isn't this problem of
duplicated data or normalization as there is in SPARQL's or SQL's.
SPARQL has the idea of matching statements in the graph.  From my
understanding, RDF's data model doesn't support the idea of multiple
subjects, predicates and/or objects with the same values.
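
As a rough illustration of the difference (a Python sketch, not anything
taken from the RDF or SPARQL specifications; the triples and variable
names are made up), a graph modelled as a set cannot hold the same triple
twice, whereas a bag of values pulled out of it can contain duplicates:

```python
from collections import Counter

# An RDF graph modelled as a set of (subject, predicate, object) triples.
graph = {
    ("ex:a", "ex:p", "ex:B"),
    ("ex:a", "ex:p", "ex:C"),
}

# Adding a triple that is already present leaves the set unchanged.
graph_again = graph | {("ex:a", "ex:p", "ex:B")}

# Bag (multiset) semantics: one subject value is counted per matching triple.
subjects_bag = Counter(s for (s, p, o) in graph)

# Set semantics: duplicate subject values collapse into one.
subjects_set = {s for (s, p, o) in graph}
```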

> >To summarize, my objections include [2][3][4][5]:
> >* Lack of closure.
> >* Inconsistencies between SPARQL triples and the currently defined RDF
> >standard (requiring special handling of say CONSTRUCT when there are
> >literals as subjects).  If SPARQL was defined in terms of RDF, if RDF
> >changed then SPARQL would naturally change.  The current way the
> >specification was created seems to allow a difference between the
> >language and the data it's querying.
>
> Allowing literals as subjects is a modification to RDF which has
> itself almost become a de facto standard, and is widely used. But the
> SPARQL specification can be applied to the subcase in which literal
> subjects are illegal without modification, if one prefers.
>

I'm not sure I understand the position of a standards group pushing de
facto standards.

And like I said - if you stuck with SPARQL being compatible with RDF,
then you could change RDF and show what you can now do with SPARQL.
The whole mapping of SPARQL to RDF that occurs in CONSTRUCT operations
wouldn't be needed.

There is now a SPARQL data model and an RDF data model.  Is there
forever going to be a separation or are the two going to align
someday?

> >* The use of multiset semantics instead of semantics consistent with
> >RDF (set based semantics).
>
> "Set-based" is too vague. RDF graphs are defined as sets of
> *triples*, but query results are bindings. In general, one cannot
> guarantee that a set of triples will produce a set of bindings. For
> example, the query
>
> ex:a ex:p ?y
>
> against the graph
>
> ex:a ex:p ex:B .
> ex:a ex:p ex:C .
>

You're right - that's why I'm saying don't use bindings!

I assume you mean that the query is:
select ?y
where { ex:a ex:p ?y }

Before projecting, the result of the operations is something like:
s1:subject p1:predicate ?y:object
ex:a ex:p ex:B
ex:a ex:p ex:C

Therefore, if you want the graph back you just project on the empty
set - getting you back your graph.
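
A minimal Python sketch of that idea follows.  The helper names (match,
project) and the extra ex:z triple are hypothetical, and it rests on one
assumption about the wording above: "projecting on the empty set" is
taken to mean projecting nothing away, i.e. keeping all three attributes.
Under that reading the result is a subset of the triples in the original
graph.

```python
# Made-up sample graph: a set of (subject, predicate, object) triples.
graph = {
    ("ex:a", "ex:p", "ex:B"),
    ("ex:a", "ex:p", "ex:C"),
    ("ex:z", "ex:q", "ex:B"),
}

# Heading of the intermediate relation: s1:subject p1:predicate ?y:object
HEADING = ("s1", "p1", "?y")

def match(graph, subject, predicate):
    """Keep the triples with the given subject and predicate (under HEADING)."""
    return {t for t in graph if t[0] == subject and t[1] == predicate}

def project(relation, heading, keep):
    """Keep only the attributes named in `keep`; the result is again a set."""
    positions = [heading.index(name) for name in keep]
    return {tuple(t[i] for i in positions) for t in relation}

matched = match(graph, "ex:a", "ex:p")

# select ?y : keep just the ?y attribute.
only_y = project(matched, HEADING, ["?y"])

# Assumed reading of "project on the empty set": nothing is projected
# away, so the full subject/predicate/object tuples come back.
as_graph = project(matched, HEADING, list(HEADING))
```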
Received on Friday, 2 November 2007 00:39:17 UTC