
[Fwd: major technical: no subqueries]

From: Dan Connolly <connolly@w3.org>
Date: Thu, 12 Jan 2006 17:21:00 -0600
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <1137108060.19546.438.camel@dirk.w3.org>

This seems like a reasonably coherent argument for a new requirement,
complete with rationale and use case.

If you support this requirement and would like to see us add it
to the critical path, please say so.

If not, please help me come up with justification that might
satisfy the commenter.

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

attached mail follows:



Section 10.3.2 "Accessing graphs in the RDF dataset"
observes that it is possible to extract subgraphs of the
input graphs using elementary CONSTRUCT queries.  Once a user does
this, he may presumably direct the output to some storage medium,
assign an IRI to it and then run a query against that extract.
Or with the right operating system interface, he might be able to
"pipe" the output of a CONSTRUCT into the FROM clause of another
SPARQL query.

It would be useful to avoid the need for explicitly storing or
piping the result before performing further queries on it.  One way to
do this would be to extend the FROM clause to permit a CONSTRUCT
query as either the default graph or a named graph, for example:

SELECT * FROM ( CONSTRUCT ... ) ...
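
The composition idea can be sketched outside SPARQL. The following is a
minimal Python illustration, not a SPARQL engine: graphs are modelled as
sets of string triples, strings starting with "?" act as variables, and
the helper names (match, construct, select), the sample data, and the
predicate names are all assumptions made up for this sketch. It shows the
output of a CONSTRUCT-like step being passed straight to a SELECT-like
step, with no intermediate storage or IRI.

```python
def match(graph, patterns, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(graph, rest, candidate)


def construct(graph, template, where):
    """CONSTRUCT analogue: instantiate the template once per solution."""
    return {
        tuple(b.get(term, term) for term in pattern)
        for b in match(graph, where)
        for pattern in template
    }


def select(graph, variable, where):
    """SELECT analogue: sorted distinct values bound to one variable."""
    return sorted({b[variable] for b in match(graph, where)})


# Hypothetical input graph containing data the user does not care about.
big_graph = {
    ("alice", "type", "Person"),
    ("alice", "name", "Alice"),
    ("bob", "type", "Person"),
    ("bob", "name", "Bob"),
    ("sensor7", "type", "Device"),
    ("sensor7", "name", "Sensor 7"),
}

# The "debugged" extraction: keep only the names of persons.
people_view = (
    [("?x", "name", "?n")],                            # template
    [("?x", "type", "Person"), ("?x", "name", "?n")],  # where
)

# Nested use: the constructed graph feeds the outer query directly.
names = select(construct(big_graph, *people_view), "?n", [("?s", "name", "?n")])
print(names)
```

Because construct returns an ordinary graph value, any query that accepts
a graph accepts its result, which is the property a CONSTRUCT in the FROM
clause would give SPARQL.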

This is of course analogous to subqueries and in-line views in SQL. 
The originators of SQL mistakenly believed that they did not need
subqueries, so subqueries were not part of the original design.

In the case of SPARQL, perhaps it is true that any query that could be
written with a CONSTRUCT in the FROM clause could be rewritten to avoid it.
However, experience in SQL and other languages shows that it is still a good
idea to permit composability wherever it makes sense semantically,
and leave it to the implementation to find the optimization.

One scenario in which users will want a CONSTRUCT nested in a FROM clause
is as follows: a user has access to a vast and time-varying input
graph, containing a lot of data that is not of interest to the user.
The user has learned from experience how to extract the portion relevant
to his interests using a CONSTRUCT.  Then the user wishes to refine
his view of the graph further.  For this purpose, he wants to just
cut-and-paste a CONSTRUCT query that he has debugged into his ad hoc
queries.

I also advocate another kind of subquery: allow an ASK as a boolean
expression.  This will provide an alternative way to formulate
non-existence queries.  For example, the query to find people with no
dc:date in section 11.2.3.1, currently written as:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?name
WHERE { ?x foaf:givenName ?name .
        OPTIONAL { ?x dc:date ?date } .
        FILTER (!bound(?date)) }

could be expressed:

SELECT ?name
WHERE { ?x foaf:givenName ?name .
        FILTER ( ! ( ASK ?date WHERE { ?x dc:date ?date } ) ) }

I think that some might find the formulation using ASK more intuitive.
(I know, some might disagree.)
Another argument in favor of nested ASK is that it lends itself to
building queries incrementally, from separately debugged pieces.
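
To make the comparison concrete, here is a minimal Python sketch of the
two formulations over a toy graph. It is an illustration under stated
assumptions, not SPARQL semantics in full: the graph is a set of string
triples, the sample data is invented, and the helper names (objects, ask,
names_via_optional, names_via_ask) are hypothetical. The first function
mimics OPTIONAL followed by FILTER (!bound(?date)); the second mimics a
nested ASK used as a boolean test.

```python
# Invented sample data: Alice has a dc:date, Bob does not.
graph = {
    ("a", "foaf:givenName", "Alice"),
    ("a", "dc:date", "2005-04-04"),
    ("b", "foaf:givenName", "Bob"),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def ask(graph, subject, predicate):
    """ASK analogue: does at least one matching triple exist?"""
    return any(s == subject and p == predicate for s, p, _ in graph)


def names_via_optional(graph):
    """OPTIONAL + FILTER(!bound(?date)): left-join dates, keep unbound rows."""
    rows = []
    for x, p, name in graph:
        if p != "foaf:givenName":
            continue
        # None models an unbound ?date when the optional part does not match.
        dates = objects(graph, x, "dc:date") or [None]
        rows.extend((name, date) for date in dates)
    return sorted(name for name, date in rows if date is None)


def names_via_ask(graph):
    """FILTER(!ASK ...): test existence per solution; no ?date is ever bound."""
    return sorted(
        name
        for x, p, name in graph
        if p == "foaf:givenName" and not ask(graph, x, "dc:date")
    )


print(names_via_optional(graph))
print(names_via_ask(graph))
```

On this data both functions select the same people. The ASK-style version
states the existence test directly, and ask can be exercised on its own
before being embedded in the larger query.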

Fred Zemke
Received on Thursday, 12 January 2006 23:21:09 GMT
