- From: Bob MacGregor <bmacgregor@siderean.com>
- Date: Fri, 19 Jan 2007 13:18:42 -0800
- To: "Eric Prud'hommeaux" <eric@w3.org>
- CC: "Andrew Newman" <andrewfnewman@gmail.com> , "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
Quite a while ago, I argued against the notion of splitting a SPARQL query into graph and filter components, but lost that battle. One of the major objections to that split was that the UNION connective was unduly circumscribed in what it could express. Since then, I have periodically scanned the "SPARQL Query Language for RDF" document at http://www.w3.org/TR/rdf-sparql-query/ to see if the situation had improved. When the change occurred (which it did), I missed it. Hence, I recently made an erroneous claim on public-rdf-dawg-comments@w3.org that SPARQL was (still) not completely expressive with respect to disjunction.

I'm still wondering if the change in SPARQL grammar that increased its expressive power has been accompanied by the (I claim) necessary philosophical shift to account for the difference. The language syntax has not been revised appropriately, which partially accounts for my failure to detect the change. Below, I comment on the syntax as it relates to this shift:

RDF has been conceived as different from the predicate calculus in that it deals with *graphs* rather than with arbitrary logical expressions. In the predicate calculus, one can make, for example, non-graph-like disjunctive assertions; in RDF we can't.

SPARQL was initially conceived, as nearly as I can tell, as a "graph query language". The UNION operator was an "algebraic" operation that took in two graphs and produced a third. This is no longer the case. Here is a counterexample that I executed on the SPARQLer website at http://www.sparql.org/query.html :

    PREFIX books: <http://example.org/book/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?book ?title
    WHERE { ?book dc:title ?title .
            { FILTER (?title = "Harry Potter and the Half-Blood Prince") }
            UNION
            { FILTER (?title = "Harry Potter and the Order of the Phoenix") }
          }

This query unions two filter clauses, so clearly UNION is no longer (just) a graph operator.

I was told (quite) a while back that UNION was not a disjunction connective, i.e., it was not equivalent to the traditional "OR" in the predicate calculus. I didn't really understand the distinction at the time, but later theorized that perhaps it was the difference between a graph operator and an expression operator. If so, that distinction is no longer valid. The above query can also be expressed as follows:

    PREFIX books: <http://example.org/book/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?book ?title
    WHERE { ?book dc:title ?title
            FILTER (?title = "Harry Potter and the Half-Blood Prince" ||
                    ?title = "Harry Potter and the Order of the Phoenix")
          }

Once upon a time, the "||" operator was the only way to achieve a disjunction within a FILTER expression. Now, the "||" is redundant; the language is equally expressive if we eliminate it from the grammar. The same is true for the "&&" operator. The query

    PREFIX books: <http://example.org/book/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?book ?title
    WHERE { ?book dc:title ?title
            FILTER ( regex(?title, "Harry Potter") && regex(?title, "Phoenix") )
          }

can also be expressed as

    PREFIX books: <http://example.org/book/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?book ?title
    WHERE { ?book dc:title ?title .
            { FILTER regex(?title, "Harry Potter") } .
            { FILTER regex(?title, "Phoenix") }
          }

I would claim that a well-designed language should not include redundant operators, and that the primary justification for including "||" and "&&" is now historical. I would also claim that the original justification for choosing the term "UNION" in preference to the term "OR" is now gone (but perhaps there is a wrinkle that I'm not aware of?). And if I were king, I would use the term "AND" in preference to ".".

What we have in SPARQL is a language that has evolved, but the syntax has not uniformly evolved with it. The term "FILTER" is also quite unfortunate.

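The two rewrites claimed above ("||" versus UNION, and "&&" versus adjacent filter groups) can be checked on a toy model. What follows is only a rough sketch in Python, not how any SPARQL engine is implemented. It assumes that a FILTER inside a nested group can see the bindings of the enclosing group, which the queries above rely on. The sample titles, the helper names (match_titles, keep, union, as_set), and the use of substring matching in place of regex() are all hypothetical choices made for illustration.

```python
# Toy model: a solution is a dict of variable bindings, and a pattern's
# result is a list (a bag) of solutions.
# Assumption: a FILTER in a nested group sees the outer group's bindings.

# Hypothetical sample data standing in for the dc:title triples.
TITLES = [
    ("book4", "Harry Potter and the Goblet of Fire"),
    ("book5", "Harry Potter and the Order of the Phoenix"),
    ("book6", "Harry Potter and the Half-Blood Prince"),
]


def match_titles():
    """Solutions for the pattern `?book dc:title ?title`."""
    return [{"book": b, "title": t} for b, t in TITLES]


def keep(solutions, pred):
    """FILTER: keep only the solutions that satisfy the predicate."""
    return [s for s in solutions if pred(s)]


def union(left, right):
    """UNION modeled as bag union: concatenate the two solution lists."""
    return left + right


def as_set(solutions):
    """Compare results while ignoring order."""
    return {(s["book"], s["title"]) for s in solutions}


def is_prince(s):
    return s["title"] == "Harry Potter and the Half-Blood Prince"


def is_phoenix(s):
    return s["title"] == "Harry Potter and the Order of the Phoenix"


def has_potter(s):
    return "Harry Potter" in s["title"]  # stands in for regex(?title, ...)


def has_phoenix(s):
    return "Phoenix" in s["title"]


base = match_titles()

# { FILTER a } UNION { FILTER b }   versus   FILTER (a || b)
via_union = union(keep(base, is_prince), keep(base, is_phoenix))
via_or = keep(base, lambda s: is_prince(s) or is_phoenix(s))

# { FILTER a } . { FILTER b }       versus   FILTER (a && b)
via_groups = keep(keep(base, has_potter), has_phoenix)
via_and = keep(base, lambda s: has_potter(s) and has_phoenix(s))

print(as_set(via_union) == as_set(via_or))
print(via_groups == via_and)
```

In this model both comparisons agree. The UNION comparison is made on sets because the two forms return rows in a different order. The two disjuncts here cannot both hold for one title, so the bag union contains no duplicates; with overlapping disjuncts the modeled UNION would list a row twice where "||" lists it once.
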
I have been told that there has been a deliberate decision to forbid a syntax that allows query variables to be bound to literal values that do not appear within the underlying RDF store. Our own query processor does not have that restriction, and we have any number of use cases in our own applications that depend on the ability to generate synthetic literals. The SPARQL 'funcall' invokes a predicate that returns a boolean value. Our equivalent of funcall can return arbitrary values. In our library of ~40 system-defined funcall IRIs, the functions outnumber the predicates by more than 3-to-1. The notion of "filter" is in fact too limiting; the additional power that accompanies the ability to synthesize new literal values (and allowing them to be bound to SELECT variables) is huge, and languages that go beyond the filter notion are going to dominate those that don't.

Summarizing: The notion of SPARQL as a graph language, rather than as a calculus language, is already obsolete. The syntax of the language has not been upgraded to accommodate that conceptual shift. Instead, there is an artificial syntactic barrier between the "graph" portion of the language and the non-graph portion. SQL and Common Logic provide examples of logic languages that did not make that distinction, and as a result are much cleaner, and much more readable.

Cheers, Bob

-----Original Message-----
From: public-rdf-dawg-comments-request@w3.org [mailto:public-rdf-dawg-comments-request@w3.org] On Behalf Of Eric Prud'hommeaux
Sent: Thursday, January 04, 2007 09:00
To: Bob MacGregor
Cc: Andrew Newman; public-rdf-dawg-comments@w3.org
Subject: Re: Applying the relational model to SPARQL

* Bob MacGregor <bmacgregor@siderean.com> [2006-11-10 08:55-0800]
> That brings us to SPARQL. SPARQL is a major disappointment. The most
> grievous error is the distinction between the WHERE and FILTER clauses.
> The faceted navigation product that my company sells generates RDF
> queries that cannot be expressed in SPARQL because they frequently use
> an OR connective that includes both statements and filters within the
> disjuncts. In other words, the queries we routinely execute cannot be
> processed by a SPARQL query engine.

Could you outline such a query in the syntax of your choice? I may require an English explanation of the query in order to understand it.

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell: +1.857.222.5741

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Friday, 19 January 2007 21:20:46 UTC