- From: Markus Krötzsch <markus.kroetzsch@comlab.ox.ac.uk>
- Date: Mon, 02 May 2011 17:56:02 +0100
- To: public-rdf-dawg-comments@w3.org
P.S. To clarify my proposal (and to support my claim that it is an easy change), I have done the main formal modifications that the spec would require [1]. There are four changes:

Section 17.4:
* Filter is a unary function, working like BGP (results restricted to terms in active graph; could be relaxed to allow all bnodes)
* LeftJoin is a binary function, working like LeftJoin(*,*,true)

Section 17.2.3:
* Variables in FILTER are always "visible"

Section 17.2.1:
* The translation of GroupGraphPatterns includes FILTER directly using Join (helper variable FS no longer needed), and no more case distinction happens for OPTIONAL.

So everything becomes somewhat shorter/simpler. I have not updated any informal parts (esp. the translation examples).

Markus

[1] http://korrekt.org/sparql-proposal/

On 01/05/11 20:29, Markus Krötzsch wrote:
> Dear WG,
>
> when working with SPARQL recently, I noticed that certain disjunctive
> queries are most cumbersome/inefficient to formulate due to the special
> post-processing semantics of FILTER expressions. I have written up a
> detailed explanation [1]. In a nutshell: it is *really* hard to combine
> FILTERs and BGPs in disjunctions.
>
> But the problem has a simple fix:
>
> * Define FILTER in such a way that it can *create* new solution
> mappings, just like BGP. A FILTER would create all variable bindings (to
> terms from the active graph) that make the filter condition true.
> * Instead of applying filters after matching, the generated solution
> mappings of a FILTER would directly be joined with other parts of the
> query.
>
> Putting it like this simplifies the whole algebra, both formally and
> conceptually. Moreover, I think that practical implementations are
> already working like that anyway (using FILTER conditions such as "=" to
> pre-generate results instead of waiting until the very end before
> "checking" them).
>
> The only negative effect that I see is that this would change the
> meaning of variables that occur in filters but in no BGP. Currently,
> such variables are considered "unbound". With the change, they would be
> instantiated to all terms that match. Experimenting with FILTER-only
> variables in some RDF stores, I merely got error messages (and rightly
> so, since a variable that can never be bound is of little use in a
> filter). So I assume that this is a corner case of little practical
> relevance.
>
> AFAICT, all other queries would give exactly the same results (joining
> having the same effect as filtering). So it seems that I am suggesting a
> largely formal algebra change, but one that would make hitherto useless
> queries very helpful (e.g. to solve the problem in [1]).
>
> I am aware that this proposal comes at a very late stage, but I think it
> is still feasible to do it. I could help with updating the formal parts
> of the algebra. In any case, I would like to hear the opinion of
> implementers/practitioners, also re [1]. Note that I am writing this
> largely as a user (and teacher) of SPARQL, so when I am investing my
> time here it is merely because I am convinced that it would greatly
> benefit the language.
>
> Cheers,
>
> Markus
>
>
> [1] http://korrekt.org/page/The_State_of_the_UNION
>

-- 
Dr. Markus Krötzsch
Oxford University Computing Laboratory
Room 306, Parks Road, Oxford, OX1 3QD, UK
+44 (0)1865 283529 http://korrekt.org/
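
To illustrate the proposal quoted above, here is a minimal Python sketch of a FILTER that generates solution mappings and is then joined like a BGP. It is a toy model under stated assumptions, not the algebra of the spec: the graph, the helper names (bgp, filter_gen, join, post_filter) and the condition are all hypothetical, solution mappings are plain dicts, and a type error in the condition simply counts as false.

```python
from itertools import product

# Hypothetical toy graph: a set of (subject, predicate, object) triples.
GRAPH = {("a", "p", 1), ("b", "p", 2), ("c", "p", 3)}


def terms(graph):
    """All terms that occur anywhere in the active graph."""
    return {t for triple in graph for t in triple}


def is_var(x):
    return isinstance(x, str) and x.startswith("?")


def bgp(graph, pattern):
    """Match one triple pattern; return a list of solution mappings."""
    out = []
    for triple in graph:
        mu = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                if mu.setdefault(p, t) != t:  # same variable, different term
                    break
            elif p != t:  # constant does not match
                break
        else:
            out.append(mu)
    return out


def filter_gen(graph, variables, condition):
    """Proposed unary Filter (toy version): all bindings of `variables`
    to terms of the active graph that make `condition` true."""
    out = []
    for values in product(terms(graph), repeat=len(variables)):
        mu = dict(zip(variables, values))
        if condition(mu):
            out.append(mu)
    return out


def join(left, right):
    """Join two lists of mappings on their shared variables."""
    return [
        {**m1, **m2}
        for m1 in left
        for m2 in right
        if all(m1[v] == m2[v] for v in m1.keys() & m2.keys())
    ]


def post_filter(solutions, condition):
    """Filtering after matching: keep the mappings that pass."""
    return [mu for mu in solutions if condition(mu)]


def greater_than_one(var):
    """Condition `var > 1`; non-numeric terms count as false."""
    return lambda mu: isinstance(mu.get(var), int) and mu[var] > 1


def as_set(solutions):
    return {frozenset(mu.items()) for mu in solutions}


pattern = ("?s", "p", "?o")
joined = join(bgp(GRAPH, pattern), filter_gen(GRAPH, ["?o"], greater_than_one("?o")))
filtered = post_filter(bgp(GRAPH, pattern), greater_than_one("?o"))

# A variable that occurs only in the filter is bound to every matching term.
filter_only = filter_gen(GRAPH, ["?x"], greater_than_one("?x"))
```

In this toy setting, joined and filtered contain the same two mappings, which matches the claim that joining has the same effect as filtering when every filter variable also occurs in a BGP. The filter_only result binds ?x to 2 and to 3, whereas a post-processing filter would treat ?x as unbound.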
Received on Monday, 2 May 2011 16:56:24 UTC