- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Fri, 17 Jun 2016 11:38:04 -0700
- To: james anderson <james@dydra.com>, public-sparql-dev@w3.org
My summary of my position: The current definition of EXISTS in SPARQL is broken bad. Fixing EXISTS is not just a matter of clarifying and extending its definition but has to instead modify the definition in a way that changes the behaviour of SPARQL as currently defined. This will legitimize parts of current SPARQL implementations that are currently non-conformant with the SPARQL definition and also delegitimize parts of current SPARQL implementations that are currently supported by the SPARQL definition. My summary of james's position: The current definition of EXISTS in SPARQL is incomplete and needs to be extended in some places to implement SPARQL. Fixing these parts of the definition of EXISTS just requires selecting a set of extensions as the ones that will be part of its extended definition and modifying parts of SPARQL implementations that have made extensions that do not conform with this extended definition of EXISTS. [I previously accidentally replied without copying the mailing list. I've tried to extract the relevant parts of the private exchange and am posting this to the mailing list with James's permission, extending the conversation with my response, which leads to the summary of our two positions above.] On 06/17/2016 08:19 AM, james anderson wrote: > >> On 2016-06-17, at 16:23, Peter F. Patel-Schneider wrote: [...] >>> james >>>> peter >>>>> james [...] The discussion is about what the SPARQL 1.1 Query specification says is the meaning of >>>>> 1 SELECT ?x WHERE { >>>>> 2 ?x a ?y . >>>>> 3 FILTER EXISTS { >>>>> 4 SELECT ?z WHERE { ?z ?w ?v . FILTER sameTerm(?x,?y) } >>>>> 5 } >>>>> 6 } which is translated into the algebra structure: >>>>> 1 (base <http://example/base/> >>>>> 2 (project (?x) >>>>> 3 (filter (exists >>>>> 4 (project (?z) >>>>> 5 (filter (sameTerm ?x ?y) >>>>> 6 (bgp (triple ?z ?w ?v))))) >>>>> 7 (bgp (triple ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?y))))) The definition of exists is based on the definition of substitute, which is >>>> ************************** >>>> Definition: Substitute >>>> >>>> Let μ be a solution mapping. >>>> >>>> substitute(pattern, μ) = the pattern formed by replacing every occurrence >>>> of a variable v in pattern by μ(v) for each v in dom(μ) >>> >>> sections five and eighteen describe a number of “patterns”. none of which >>> includes “select”. >>> >>> which means i read the recommendation to leave the particular operation, which >>> you and hernández are exploring, as undefined. [james has argued that one reason for this reading is that other readings produce results that run counter to a notion of variable scoping.] >>> in which situation, why first set up the untenable straw man, rather than >>> going straight to a useful definition? >> >> I don't see [reading this situation as undefined] as either the intent or the >> formal meaning of substitute. >> >> As far as formal meaning goes, "pattern" is simply the argument to substitute, >> whatever that is. It doesn't have to be anything that is a pattern at all. >> Of course, it should be something that is somehow a pattern, because otherwise >> the name is misleading. > > following that approach, the entire sparql form is a "pattern" and the word is > left with little consequence. True, but so what? Using "pattern" as the name of a formal variable doesn't have any consequence. It could have been any other name, like "xyxyxy" without affecting the formal meaning of substitute. Using pattern in the description itself does allow for a bit of wiggle room. However, the entre definition for substitute is given above. If this definition doesn't hold when the actual value for the formal variable or the result of substitute is somehow not a pattern then the meaning of such constructs will be unspecified. This defeats the desirability of using this wiggle room. Further, the wiggle room completely goes away if the SELECT is wrapped inside a graph pattern construct, as happens in ... EXISTS { { SELECT ... } } Here both the argument to substitute and its value are Group constructs and thus actual graph patterns in the SPARQL algebra. > we understand the recommendation differently. > in particular, from the document context, it should be a “graph pattern”. You appear to be saying that definition of substitute should be ************* Definition: Substitute Let μ be a solution mapping substitute(pattern, μ) = the pattern formed by replacing every occurrence of a variable v in pattern by μ(v) for each v in dom(μ), for pattern a SPARQL algebra construct that is a graph pattern substitute(xyxy, μ) = xyxy, otherwise ************* This has several problems. In particular, it substitutes into Projects that are inside Joins, as would result from FILTER EXISTS { BIND ( "A" as ?y ) { SELECT ... } } so it violates the same scope rules as before. Or maybe you are saying that it should be ************* Definition: Substitute Let μ be a solution mapping substitute(pattern, μ) = the pattern formed by replacing every in-sope occurrence of a variable v in pattern by μ(v) for each v in dom(μ), for pattern a SPARQL algebra construct that is a graph pattern substitute(xyxy, μ) = xyxy, otherwise ************* This at least adds the scope rules from SPARQL into the picture. Unfortunately the scope rules in SPARQL don't help here, for multiple reasons, including that they do not distinguish between what might be considered different variables with the same name. Or maybe substitute should drop variables from the solution mapping as soon as the variable goes out of scope. That's sounding better, but needs careful consideration of just when to drop variables and what to do with, for example, Projects whose variable has been substituted. As well, there is no notion of scope in the SPARQL algebra so this has to be either defined there or scope dragged into the algebra during translation or some other method for determining when to drop variables devised that matches scoping. A correct definition of this version of substitute is going to be very different from the definition in the specification. I'm actually strongly in favour of changing EXISTS to work somewhat like this, but that's a decided change from the SPARQL 1.1 Query specification and certainly not a clarification of something that is ambiguous there or a minor obvious fix. >> Intent is harder to decipher, but it appears that at least Virtuoso implements >> substitute by substituting inside project constructs, which are what >> subselects turn into. (I am arguing that this is the wrong meaning but there >> is no good evidence in the SPARQL 1.1 Query specification to support the claim >> that there is any other intent than one that is supported by the formal >> meaning.) > > i empathise with them, but would not have followed that approach as it > violates otherwise clear scoping rules. I'm not actually sure what "otherwise clear scoping rules" you mean here. SPARQL has a notion of in-scope, but that doesn't match scoping of variables in programming languages. Perhaps you actually mean to use something like programming language scope, but I don't see any evidence that SPARQL uses that notion at all. > my approach, as indicated earlier, has been to employ dynamic bindings, with > the intent follow the incompletely defined intent of substitution, to the > extent is can be reconciled with lexical contours, as established by binding > forms. Well I suppose that that is a possible way to implement something, but I don't think that that ends up conforming to the definition of SPARQL. >> In the example above the syntactic argument to EXISTS is, as always, a >> GroupGraphPattern. One of the options for GroupGraphPattern is '{' SubSelect >> '}', which is the case above. > > i have always interpreted this situation to be a consequence of the general > sloppiness of the specification. > that is, > - GroupGraphPattern appears in that situation in 1.0, > - we need to include selects there, > - let us do that and it does not matter that the clause is still called a > “pattern". > i would never have ventured to impute intent from that editorial change. That's an editorial change? I don't think that changes to the grammar can be considered to be editorial. >> And SubSelect is a somewhat restricted version >> of a SELECT query. So from syntactic grounds the syntactic argument to EXISTS >> is something that is a kind of pattern and this is support for using "pattern" >> here. > > we disagree. OK, we disagree. >> So I don't see how it can be argued that the SPARQL 1.1 Query document says that >> >> substitute((project (?z) (filter (sameTerm ?x ?y) (bgp (triple ?z ?w ?v))))), >> {(x,"A"), (y,"A")}) >> >> is anything other than >> >> (project (?z) (filter (sameTerm "A" "A") (bgp (triple ?z ?w ?v)))) > > we disagree. > > in general, you are arguing from failure to define and sloppy exposition to an > incorrect intent. Not at all. I am arguing from the formal definition of substitute in the SPARQL 1.1 Query specification, https://www.w3.org/TR/2013/REC-sparql11-query-20130321/ This may be a bad definition - it does end up producing terms whose meaning is unspecified in lots of cases and has counter-intuitive results - but there is no way to read the particular application of substitute above in any other way that the way I state. > i do not see benefit to that approach and suggest it would be more worthwhile > to argue directly for clear and adequate definitions. I agree that there needs to be an adequate definition of EXISTS. I further believe that the current definition of EXISTS, and in particular the current definition of substitute, is broken bad. So fixing it is not just a matter of clarifying and extending the definition but has to instead replace the definition in a way that changes the behaviour of SPARQL. To argue otherwise is to argue that some current implementations have made unfounded assumptions about EXISTS that don't conform to the correct clarification of the specification and thus should be changed. Instead the argument has to be that it is necessary to change the SPARQL spec in a way that delegitimizes some current implementations that are currently legitimate and legitimizes other current implementations that are currently illegitimate. My hope is that this latter argument will be made successfully! [...] peter
Received on Friday, 17 June 2016 18:38:35 UTC