W3C home > Mailing lists > Public > public-rdf-dawg-comments@w3.org > February 2007

Re: Applying the relational model to SPARQL

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Fri, 02 Feb 2007 19:45:00 +0000
Message-ID: <45C394BC.1020407@hp.com>
To: Bob MacGregor <bmacgregor@siderean.com>
Cc: Eric Prud'hommeaux <eric@w3.org>, Andrew Newman <andrewfnewman@gmail.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>

Bob,

Thank you for the response.

There are some changes coming up that the working group is considering that
will affect some of your examples although not necessarily the main line of 
your argument.

The change is to make the combination of graph patterns based on an algebra
[*] with bottom-up construction of result set, rather than a declarative
restriction on overall results.

An expression like
{ ?book dc:title ?title .
   FILTER (?title = "Harry Potter and the Half-Blood Prince")
}

is evaluated as a restriction over the graph patterns in the group between {} 
(here, a simple basic graph pattern ?book dc:title ?title).

Then the results is these expression are combined with other query.

http://www.sparql.org/query.html does not yet provide these (draft,
unapproved) semantics.  There is a small scale implementation designed to
directly implement the SPARQL algebra in ARQ, but it is not the default query
engine.

Current details:
http://www.w3.org/2001/sw/DataAccess/rq23/rq25-algebra.html
although this text is due to be merged into the overall document
http://www.w3.org/2001/sw/DataAccess/rq23/rq25.html

Looking at the translation from a SPARQL query string to an algebra expression:

----------
PREFIX books:   <http://example.org/book/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title
WHERE
   { ?book dc:title ?title .
      FILTER (?title = "Harry Potter and the Half-Blood Prince")
   }
----------
becomes:

[OpProject ?book ?title
   [OpFilter ( ?title = "Harry Potter and the Half-Blood Prince" )
     [OpBGP ?book dc:title ?title]]]

where the filter restricts the basic graph pattern.

but writing:
{ FILTER (?title = "Harry Potter and the Half-Blood Prince") }
is a restriction of the empty pattern, i.e. it evaluates to no solutions.  the
{} make a difference in the algebra.

----------
PREFIX books:   <http://example.org/book/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title
WHERE
   { ?book dc:title ?title .
     {
      FILTER (?title = "Harry Potter and the Half-Blood Prince")
     }
   }
-------
becomes:

[OpProject ?book ?title
   [OpJoin
     [OpBGP ?book dc:title ?title]
     [OpFilter ( ?title = "Harry Potter and the Half-Blood Prince" )
       (unit table)]
   ]]

the filter, enclosed in {} is evaluated in the context of the {} (the unit
table - no bindings - the empty pattern) and so evaluates to false (?title not 
bound))

Your combining FILTERs to get disjunction would need to be done in the "||" form:
    FILTER
        (?title = "Harry Potter and the Half-Blood Prince" ||
         ?title = "Harry Potter and the Order of the Phoenix")


	Andy


----------------

[*]
A few references for work in this area, as well as the work done within the
working group:

A relational algebra for SPARQL
http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html
Richarh Cyganiak

Semantics and Complexity of SPARQL
   http://arxiv.org/abs/cs.DB/0605124
Jorge Perez, Marcelo Arenas, Claudio Gutierrez

Semantics of SPARQL
   http://ing.utalca.cl/%7Ejperez/papers/sparql_semantics.pdf
Jorge Perez, Marcelo Arenas, Claudio Gutierrez

Preserving SPARQL-to-SQL Query Translation for Optional Graph Patterns
   http://www.cs.wayne.edu/%7Eartem/main/research/TR-DB-052006-CLJF.pdf
Artem Chebotko, Shiyong Lu, Hasan M. Jamil and Farshad Fotouhi


Bob MacGregor wrote:
 >
 >
 > Quite a while ago, I argued against the notion of splitting a SPARQ
 > query into graph and filter components, but lost that battle.  One
 > of the major objections against that split was that the UNION
 > connective was unduly circumscribed in what could be expressed.
 > Since then, I have periodically scanned the "SPARQL Query Language for RDF"
 > document at  http://www.w3.org/TR/rdf-sparql-query/ to see if the
 > situation had improved.  When the change occurred (which it did), I
 > missed it.  Hence, I recently made an erroneous claim on
 > public-rdf-dawg-comments@w3.org that SPARQL was (still) not
 > completely expressive with respect to disjuction.
 >
 > I'm still wondering if the change in SPARQL grammar that increased
 > its expressive power has been accompanied by the (I claim) necessary
 > philosophical shift to account for the difference.  The language
 > syntax has not been revised appropriately, which partially accounts
 > for my failure to detect the change.  Below, I comment on the syntax
 > as it relates to this shift:
 >
 > RDF has been conceived as different from the predicate calculus in
 > that it deals with *graphs* rather than with arbitrary logical
 > expressions.  In the predicate calculus, one can make, for example,
 > non-graph-like disjunctive assertions; in RDF we can't.
 >
 > SPARQL was initially conceived, as nearly as I can tell, as a
 > "graph query language".  The UNION operator was an "algebraic"
 > operation that took in two graphs and produced a third.  This is no
 > longer the case.  Here is a counter example that I executed on the
 > SPARQLer website at  http://www.sparql.org/query.html :
 >
 > PREFIX books:   <http://example.org/book/>
 > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
 > SELECT ?book ?title
 > WHERE
 >   { ?book dc:title ?title .
 >     {
 >      FILTER (?title = "Harry Potter and the Half-Blood Prince")
 >     }
 >   UNION
 >     {
 >      FILTER (?title = "Harry Potter and the Order of the Phoenix")
 >     }
 >   }
 >
 > This query unions two filter clauses, so clearly UNION is no longer
 > (just) a graph operator.
 >
 > I was told (quite) a while back that UNION was not a disjunction
 > connective, i.e., it was not equivalent to the traditional "OR"
 > in the predicate calculus.  I didn't really understand the distinction
 > at the time, but later theorized that perhaps it was this the
 > difference between a graph operator and an expression operator.
 > If so, that distinction is no longer valid.
 >
 > The above query can also be expressed as follows:
 >
 > PREFIX books:   <http://example.org/book/>
 > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
 > SELECT ?book ?title
 > WHERE
 >   { ?book dc:title ?title
 >     FILTER
 >             (?title = "Harry Potter and the Half-Blood Prince" ||
 >              ?title = "Harry Potter and the Order of the Phoenix")
 >   }
 >
 > Once upon a time, the "||" operator was the only way to achieve
 > a disjunction within a FILTER expression.  Now, the "||" is
 > redundant; the language is equally expressive if we eliminate it
 > from the grammar.  The same is true for the "&&" operator.  The
 > query
 >
 > PREFIX books:   <http://example.org/book/>
 > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
 > SELECT ?book ?title
 > WHERE
 >   { ?book dc:title ?title
 >     FILTER (
 >             regex(?title, "Harry Potter") &&
 >             regex(?title, "Phoenix")
 >            )
 >   }
 >
 > can also be expressed as
 >
 > PREFIX books:   <http://example.org/book/>
 > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
 > SELECT ?book ?title
 > WHERE
 >   { ?book dc:title ?title .
 >     {
 >     FILTER regex(?title, "Harry Potter")
 >     }
 >    .
 >     {
 >      FILTER regex(?title, "Phoenix")
 >     }
 >   }
 >
 > I would claim that a well-designed language should not include
 > redundant operators, and that the primary justification for
 > including "||" and "&&" is now historical.  I would also claim
 > that the original justification for choosing the term "UNION" in
 > preference to the term "OR" is now gone (but perhaps there is
 > a wrinkle that I'm not aware of?).  And if I were king, I would
 > use the term "AND" in preference to ".".
 >
 > What we have in SPARQL is a language that has evolved, but the
 > syntax has not uniformly evolved with it.
 >
 > The term "FILTER" is also quite unfortunate.  I have been told
 > that there has been a deliberate decision to forbid a syntax
 > that allows query variables to be bound to literal values that do not
 > appear within the underlying RDF store.  Our own query processor does
 > not have that restriction, and we have any number of use cases in our
 > own applications that depend on the ability to generate synthetic
 > literals.  The SPARQL 'funcall' invokes a predicate that returns
 > a boolean value.  Our equivalent of funcall can return arbitrary
 > values.  In our library of ~40 system-defined funcall IRIs, the functions
 > outnumber the predicates by more than 3-to-1.
 >
 > The notion of "filter" is in fact too limiting; the additional
 > power that accompanies the ability to synthesize new literal values
 > (and allowing them to be bound to SELECT variables) is huge, and
 > languages that go beyond the filter notion are going to dominate
 > those that don't.
 >
 > Summarizing.  The notion of SPARQL as a graph language, rather than
 > as a calculus language, is already obsolete.  The syntax of the language
 > has not been upgraded to accomodate that conceptual shift.  Instead,
 > there is an artificial syntactic barrier between the
 > "graph" portion of the language and the non-graph portion.  SQL and
 > Common Logic provide examples of logic languages that did not make
 > that distinction, and as a result are much cleaner, and much more
 > readable.
 >
 > Cheers, Bob
Received on Friday, 2 February 2007 19:45:26 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 8 January 2008 14:14:51 GMT