RE: Applying the relational model to SPARQL

Bob MacGregor wrote on 02/06/2007 10:06:36 PM:

> It would be helpful (to me at least) if the SPARQL documents include
> explicit mention of the *assumptions* that underlie the design of 
> the language.  If I knew what they were, then for example, I could 
> guess as to whether SPARQL might someday allow variables to bind to 
> values not found in the database (a feature we rely on quite heavily
> with our own RDF query language) or if the committee is deliberately
> designing itself into a corner. 

Hi Bob,

The assumptions that underlie the design of the SPARQL query language are 
captured within the Working Group's Use Cases and Requirements document, 
which you can find at:
  http://www.w3.org/TR/rdf-dawg-uc/

Considering your specific example: The Use Cases and Requirements do not 
require variables to be bound to expressions in outside of the RDF 
Dataset. In the past, there has never been a critical mass of Working 
Group members to consider adopting this requirement. Consensus in the 
working group surrounding this issue, captured recently in this mailing 
list thread: 
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/0077.html, 
seemed to be that the current design does not preclude future work in this 
area, and that in the absence of a requirement to include expressions in 
the SELECT list, the risk to our schedule would be too large to make this 
change at this time. 

You are welcome to suggest new use cases which might drive new 
requirements for the Working Group. If the Working Group finds new 
information that it has not previously considered, the group will then 
decide whether or not to adopt new requirements. Please be aware that the 
Working Group is hoping to advance to Last Call within a few weeks, and at 
this point in the process the requirements of our schedule do carry a good 
deal of significance in our decisions. 

thanks,
Lee

> 
> Again, I'm hoping that I'm totally mistaken here in my 
> interpretation of SPARQL semantics.
> 
> Regards, Bob
> 
> -----Original Message-----
> From: Seaborne, Andy [mailto:andy.seaborne@hp.com] 
> Sent: Friday, February 02, 2007 11:45
> To: Bob MacGregor
> Cc: Eric Prud'hommeaux; Andrew Newman; public-rdf-dawg-comments@w3.org
> Subject: Re: Applying the relational model to SPARQL
> 
> Bob,
> 
> Thank you for the response.
> 
> There are some changes coming up that the working group is 
> considering that will affect some of your examples although not 
> necessarily the main line of your argument.
> 
> The change is to make the combination of graph patterns based on an 
> algebra [*] with bottom-up construction of result set, rather than a
> declarative restriction on overall results.
> 
> An expression like
> { ?book dc:title ?title .
>    FILTER (?title = "Harry Potter and the Half-Blood Prince") }
> 
> is evaluated as a restriction over the graph patterns in the group 
> between {} (here, a simple basic graph pattern ?book dc:title ?title).
> 
> Then the results is these expression are combined with other query.
> 
> http://www.sparql.org/query.html does not yet provide these (draft,
> unapproved) semantics.  There is a small scale implementation 
> designed to directly implement the SPARQL algebra in ARQ, but it is 
> not the default query engine.
> 
> Current details:
> http://www.w3.org/2001/sw/DataAccess/rq23/rq25-algebra.html
> although this text is due to be merged into the overall document 
> http://www.w3.org/2001/sw/DataAccess/rq23/rq25.html
> 
> Looking at the translation from a SPARQL query string to an algebra 
> expression:
> 
> ----------
> PREFIX books:   <http://example.org/book/>
> PREFIX dc:      <http://purl.org/dc/elements/1.1/>
> SELECT ?book ?title
> WHERE
>    { ?book dc:title ?title .
>       FILTER (?title = "Harry Potter and the Half-Blood Prince")
>    }
> ----------
> becomes:
> 
> [OpProject ?book ?title
>    [OpFilter ( ?title = "Harry Potter and the Half-Blood Prince" )
>      [OpBGP ?book dc:title ?title]]]
> 
> where the filter restricts the basic graph pattern.
> 
> but writing:
> { FILTER (?title = "Harry Potter and the Half-Blood Prince") } is a 
> restriction of the empty pattern, i.e. it evaluates to no solutions.
> the {} make a difference in the algebra.
> 
> ----------
> PREFIX books:   <http://example.org/book/>
> PREFIX dc:      <http://purl.org/dc/elements/1.1/>
> SELECT ?book ?title
> WHERE
>    { ?book dc:title ?title .
>      {
>       FILTER (?title = "Harry Potter and the Half-Blood Prince")
>      }
>    }
> -------
> becomes:
> 
> [OpProject ?book ?title
>    [OpJoin
>      [OpBGP ?book dc:title ?title]
>      [OpFilter ( ?title = "Harry Potter and the Half-Blood Prince" )
>        (unit table)]
>    ]]
> 
> the filter, enclosed in {} is evaluated in the context of the {} 
> (the unit table - no bindings - the empty pattern) and so evaluates 
> to false (?title not
> bound))
> 
> Your combining FILTERs to get disjunction would need to be done in 
> the "||" form:
>     FILTER
>         (?title = "Harry Potter and the Half-Blood Prince" ||
>          ?title = "Harry Potter and the Order of the Phoenix")
> 
> 
>    Andy
> 
> 
> ----------------
> 
> [*]
> A few references for work in this area, as well as the work done 
> within the working group:
> 
> A relational algebra for SPARQL
> http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html
> Richarh Cyganiak
> 
> Semantics and Complexity of SPARQL
>    http://arxiv.org/abs/cs.DB/0605124
> Jorge Perez, Marcelo Arenas, Claudio Gutierrez
> 
> Semantics of SPARQL
>    http://ing.utalca.cl/%7Ejperez/papers/sparql_semantics.pdf
> Jorge Perez, Marcelo Arenas, Claudio Gutierrez
> 
> Preserving SPARQL-to-SQL Query Translation for Optional Graph Patterns
>    http://www.cs.wayne.edu/%7Eartem/main/research/TR-DB-052006-CLJF.pdf
> Artem Chebotko, Shiyong Lu, Hasan M. Jamil and Farshad Fotouhi
> 
> 
> Bob MacGregor wrote:
>  >
>  >
>  > Quite a while ago, I argued against the notion of splitting a 
> SPARQ  > query into graph and filter components, but lost that 
> battle.  One  > of the major objections against that split was that 
> the UNION  > connective was unduly circumscribed in what could be 
expressed.
>  > Since then, I have periodically scanned the "SPARQL Query Language 
for RDF"
>  > document at  http://www.w3.org/TR/rdf-sparql-query/ to see if the
> > situation had improved.  When the change occurred (which it did), 
> I  > missed it.  Hence, I recently made an erroneous claim on  > 
> public-rdf-dawg-comments@w3.org that SPARQL was (still) not  > 
> completely expressive with respect to disjuction.
>  >
>  > I'm still wondering if the change in SPARQL grammar that 
> increased  > its expressive power has been accompanied by the (I 
> claim) necessary  > philosophical shift to account for the 
> difference.  The language  > syntax has not been revised 
> appropriately, which partially accounts  > for my failure to detect 
> the change.  Below, I comment on the syntax  > as it relates to this 
shift:
>  >
>  > RDF has been conceived as different from the predicate calculus 
> in  > that it deals with *graphs* rather than with arbitrary logical
> > expressions.  In the predicate calculus, one can make, for 
> example,  > non-graph-like disjunctive assertions; in RDF we can't.
>  >
>  > SPARQL was initially conceived, as nearly as I can tell, as a  > 
> "graph query language".  The UNION operator was an "algebraic"
>  > operation that took in two graphs and produced a third.  This is 
> no  > longer the case.  Here is a counter example that I executed on
> the  > SPARQLer website at  http://www.sparql.org/query.html :
>  >
>  > PREFIX books:   <http://example.org/book/>
>  > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
>  > SELECT ?book ?title
>  > WHERE
>  >   { ?book dc:title ?title .
>  >     {
>  >      FILTER (?title = "Harry Potter and the Half-Blood Prince")
>  >     }
>  >   UNION
>  >     {
>  >      FILTER (?title = "Harry Potter and the Order of the Phoenix")
>  >     }
>  >   }
>  >
>  > This query unions two filter clauses, so clearly UNION is no 
> longer  > (just) a graph operator.
>  >
>  > I was told (quite) a while back that UNION was not a disjunction 
> > connective, i.e., it was not equivalent to the traditional "OR"
>  > in the predicate calculus.  I didn't really understand the 
> distinction  > at the time, but later theorized that perhaps it was 
> this the  > difference between a graph operator and an expression 
operator.
>  > If so, that distinction is no longer valid.
>  >
>  > The above query can also be expressed as follows:
>  >
>  > PREFIX books:   <http://example.org/book/>
>  > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
>  > SELECT ?book ?title
>  > WHERE
>  >   { ?book dc:title ?title
>  >     FILTER
>  >             (?title = "Harry Potter and the Half-Blood Prince" ||
>  >              ?title = "Harry Potter and the Order of the Phoenix")
>  >   }
>  >
>  > Once upon a time, the "||" operator was the only way to achieve 
> > a disjunction within a FILTER expression.  Now, the "||" is  > 
> redundant; the language is equally expressive if we eliminate it  > 
> from the grammar.  The same is true for the "&&" operator.  The  > query 
 >
>  > PREFIX books:   <http://example.org/book/>
>  > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
>  > SELECT ?book ?title
>  > WHERE
>  >   { ?book dc:title ?title
>  >     FILTER (
>  >             regex(?title, "Harry Potter") &&
>  >             regex(?title, "Phoenix")
>  >            )
>  >   }
>  >
>  > can also be expressed as
>  >
>  > PREFIX books:   <http://example.org/book/>
>  > PREFIX dc:      <http://purl.org/dc/elements/1.1/>
>  > SELECT ?book ?title
>  > WHERE
>  >   { ?book dc:title ?title .
>  >     {
>  >     FILTER regex(?title, "Harry Potter")
>  >     }
>  >    .
>  >     {
>  >      FILTER regex(?title, "Phoenix")
>  >     }
>  >   }
>  >
>  > I would claim that a well-designed language should not include  >
> redundant operators, and that the primary justification for  > 
> including "||" and "&&" is now historical.  I would also claim  > 
> that the original justification for choosing the term "UNION" in  > 
> preference to the term "OR" is now gone (but perhaps there is  > a 
> wrinkle that I'm not aware of?).  And if I were king, I would  > use
> the term "AND" in preference to ".".
>  >
>  > What we have in SPARQL is a language that has evolved, but the  >
> syntax has not uniformly evolved with it.
>  >
>  > The term "FILTER" is also quite unfortunate.  I have been told  >
> that there has been a deliberate decision to forbid a syntax  > that
> allows query variables to be bound to literal values that do not  > 
> appear within the underlying RDF store.  Our own query processor 
> does  > not have that restriction, and we have any number of use 
> cases in our  > own applications that depend on the ability to 
> generate synthetic  > literals.  The SPARQL 'funcall' invokes a 
> predicate that returns  > a boolean value.  Our equivalent of 
> funcall can return arbitrary  > values.  In our library of ~40 
> system-defined funcall IRIs, the functions  > outnumber the 
> predicates by more than 3-to-1.
>  >
>  > The notion of "filter" is in fact too limiting; the additional  >
> power that accompanies the ability to synthesize new literal values 
> > (and allowing them to be bound to SELECT variables) is huge, and 
> > languages that go beyond the filter notion are going to dominate 
> > those that don't.
>  >
>  > Summarizing.  The notion of SPARQL as a graph language, rather 
> than  > as a calculus language, is already obsolete.  The syntax of 
> the language  > has not been upgraded to accomodate that conceptual 
> shift.  Instead,  > there is an artificial syntactic barrier between
> the  > "graph" portion of the language and the non-graph portion. 
> SQL and  > Common Logic provide examples of logic languages that did
> not make  > that distinction, and as a result are much cleaner, and 
> much more  > readable.
>  >
>  > Cheers, Bob
> 
> 
> 
> 

Received on Tuesday, 13 February 2007 20:07:22 UTC