- From: Andy Seaborne <andy.seaborne@topquadrant.com>
- Date: Fri, 10 Feb 2017 12:21:28 +0000
- To: public-sparql-exists@w3.org
On 09/02/17 15:15, Peter F. Patel-Schneider wrote: > OK, that's another piece of the puzzle. > > Any more pieces done? Are you referring to SHACL? For EXISTS, both your comments have answers. Andy > > peter > > On 02/09/2017 06:08 AM, Andy Seaborne wrote: >> There are four forms that create bindings, all the rest recombine bindings in >> solution mapping/sequences into other solution mapping/sequences but not >> create or modify bindings. >> >> >> Basic Graph Pattern Matching >> Property Path Patterns >> GRAPH ?variable >> AS >> >> All of these only create bindings according to the variable names used in >> their form and do not introduce a free choice of by name. >> >> A systematic renaming therefore changes the form and it's outcome. >> >> eval(rename_algebra(X)) is equivalent to rename_solution(eval(X)) >> >> Andy >> >> On 08/02/17 18:20, Peter F. Patel-Schneider wrote: >>> So the idea appears to be to show that the only effect of variable replacement >>> in an algebra expression is to systematically change the variable in the >>> resulting solution sequence. Now this hypothesis has to be turned into an >>> inductive hypothesis and proven to be true for all of the SPARQL algebra. If >>> this can be done, then it is easy to show the PrjMap(P,PV) is fine because the >>> variable doesn't show up in the resulting solution sequence. >>> >>> peter >>> >>> On 02/07/2017 03:30 AM, Andy Seaborne wrote: >>>> >>>> >>>> On 06/02/17 11:48, Peter F. Patel-Schneider wrote: >>>>> On 02/06/2017 02:21 AM, Andy Seaborne wrote: >>>>>> Peter has some comments on the SHACL comments list that relate to EXISTS. >>>>>> [1] >>>>>> >>>>>>> There is no demonstration that the choice of fresh variables in the >>>>>>> definition of PrjMap(P,PV) is insignificant. >>>>>> >>>>>> I hope we can explain in the document to clarify this, but I'm not clear >>>>>> what >>>>>> you are looking for. >>>>> >>>>> That the result of the evaluation doesn't depend on the choice of free >>>>> variables. >>>>> >>>>>> What would constitute such a demonstration? >>>>> >>>>> That's a good question. When I first noticed this problem I was thinking >>>>> that >>>>> this was just a t that hadn't been dotted, but I'm really not sure how to go >>>>> about showing that the choice doesn't matter. >>>>> >>>>>> Do you have a example where it is significant? >>>>> >>>>> No, at least not yet? Do you have a demonstration that there are none? >>>> >>>> An explanation: >>>> >>>> Evaluation of a projection results in a solution sequence that can contains >>>> only variables of the projection and no others. >>>> >>>> For any algebra expression, replacing a variable systematically with a fresh >>>> variable has a visible effect as a change in the solution sequence binding for >>>> that variable. >>>> >>>> You can't find the name of a variable during the evaluation of a SPARQL query >>>> (because graph patterns and solution modifiers are not available as >>>> datastructures to access). It's call-by-value and even the special forms like >>>> IF, and COALESCE don't expose the variable name because they only change when >>>> arguments are evaluated, not pass down the argument expression itself. >>>> >>>> (( >>>> The best I can think of is expressions that associate a value with a variable >>>> like >>>> >>>> IF ( bound(?x) , "x", "not x") >>>> >>>> but that's more of an alias, not the variable name itself. Renaming ?x is not >>>> observable and the alias is unchanged. >>>> )) >>>> >>>> For a projection, one can rename the unprojected variables of the expression >>>> over which the project operates because the renaming changes the solution >>>> sequence before projection only on variables that projection does not expose. >>>> >>>> It is not visible in the solution sequence result of the projection. >>>> >>>> Another way of thinking about it is that the binding due to unprojected >>>> variables are not accessible to operations that use the result of the >>>> projection. >>>> >>>>> >>>>>>> The result of PrjMap(X) depends on the order in which the projections >>>>>>> in X are chosen, but this order is not specified. >>>>>> >>>>>> Yes - it would be better to define the order and the outcome is order >>>>>> dependent with respect to replaced variables but does it make a >>>>>> difference? It >>>>>> is only variables restricted by scope that are changed. >>>>> >>>>> The mappings reach down into sub-expressions and change disconnected >>>>> variables >>>>> there so they violate the scoping of SPARQL. >>>> >>>> In fact, there is a design choice here - either choice is workable, both have >>>> use cases for different audiences. It's not a technical issue - it's a >>>> judgement. >>>> >>>> The other design is one where there is no remapping variables and then the >>>> EXISTS insertion of the current row would affect the disconnected variables. >>>> >>>> It violates the property of SPARQL evaluation that renaming inside project of >>>> disconnected variables does not matter anywhere else. Optimizers and parallel >>>> execution exploit that property. (I got a related question from someone about >>>> this last week - they are implementing some kind of optimized evaluation and >>>> wanted to discuss the details.) >>>> >>>>>> Do you have a case where it makes an observable difference? >>>>> >>>>> NO, at least not yet. Do you have a demonstration that there are none? >>>>> >>>>>> Would a bottom-up replacement be suitable? >>>>> >>>>> I think so. If you fixed the free variables for all the mappings then I >>>>> think >>>>> that a bottom-up replacement schedule would produce a unique result. This >>>>> remains to be demonstrated but shouldn't be too hard. As well, none of the >>>>> mappings would affect disconnected variables, I think. >>>> >>>> bottom-up is the safest (the fresh variables must be fresh across all the >>>> renaming going on) and easiest to explain. >>>> >>>> The requirement is top-down one : to rename at first SELECT down every branch >>>> of the expression tree where the variable is hidden. If done top-down, it is >>>> renamed once. >>>> >>>> It is minimal renaming if done for each variable of scope of the current row >>>> is considered separately. That makes it more complicated - it might be worth >>>> non-definitional text to say this but the more direct definition is a >>>> bottom-up walk; even left-to-right, bottom-up to give a unique walk order. >>>> >>>>> Of course, doing only this part doesn't solve the major problem. >>>> >>>> I'll leave the SHACL specific comments to the SHACL WG and comments list. >>>> >>>>> >>>>>> Andy >>>>>> >>>>>> [1] https://lists.w3.org/Archives/Public/public-rdf-shapes/2017Jan/0010.html >>>>> >>>>> It seems so obvious that the choice of variables does not matter, but >>>>> thinking >>>>> about how to demonstrate that this is so leads to lots of tricky bits. >>>>> >>>>> peter >>>>> >
Received on Friday, 10 February 2017 12:22:04 UTC