- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Fri, 10 Feb 2017 05:21:22 -0800
- To: Andy Seaborne <andy.seaborne@topquadrant.com>, public-sparql-exists@w3.org
I think that you have to show that no form is sensitive to the variable name in any other way, for example removing a solution because of some interaction between a variable name and anything else. And now that I think of it, I don't think that the analysis below is adequate as it doesn't show that the variable name doesn't somehow end up in the value side of the binding. peter On 02/10/2017 04:21 AM, Andy Seaborne wrote: > > > On 09/02/17 15:15, Peter F. Patel-Schneider wrote: >> OK, that's another piece of the puzzle. >> >> Any more pieces done? > > Are you referring to SHACL? > > For EXISTS, both your comments have answers. > > Andy > >> >> peter >> >> On 02/09/2017 06:08 AM, Andy Seaborne wrote: >>> There are four forms that create bindings, all the rest recombine bindings in >>> solution mapping/sequences into other solution mapping/sequences but not >>> create or modify bindings. >>> >>> >>> Basic Graph Pattern Matching >>> Property Path Patterns >>> GRAPH ?variable >>> AS >>> >>> All of these only create bindings according to the variable names used in >>> their form and do not introduce a free choice of by name. >>> >>> A systematic renaming therefore changes the form and it's outcome. >>> >>> eval(rename_algebra(X)) is equivalent to rename_solution(eval(X)) >>> >>> Andy >>> >>> On 08/02/17 18:20, Peter F. Patel-Schneider wrote: >>>> So the idea appears to be to show that the only effect of variable >>>> replacement >>>> in an algebra expression is to systematically change the variable in the >>>> resulting solution sequence. Now this hypothesis has to be turned into an >>>> inductive hypothesis and proven to be true for all of the SPARQL algebra. If >>>> this can be done, then it is easy to show the PrjMap(P,PV) is fine because >>>> the >>>> variable doesn't show up in the resulting solution sequence. >>>> >>>> peter >>>> >>>> On 02/07/2017 03:30 AM, Andy Seaborne wrote: >>>>> >>>>> >>>>> On 06/02/17 11:48, Peter F. Patel-Schneider wrote: >>>>>> On 02/06/2017 02:21 AM, Andy Seaborne wrote: >>>>>>> Peter has some comments on the SHACL comments list that relate to EXISTS. >>>>>>> [1] >>>>>>> >>>>>>>> There is no demonstration that the choice of fresh variables in the >>>>>>>> definition of PrjMap(P,PV) is insignificant. >>>>>>> >>>>>>> I hope we can explain in the document to clarify this, but I'm not clear >>>>>>> what >>>>>>> you are looking for. >>>>>> >>>>>> That the result of the evaluation doesn't depend on the choice of free >>>>>> variables. >>>>>> >>>>>>> What would constitute such a demonstration? >>>>>> >>>>>> That's a good question. When I first noticed this problem I was thinking >>>>>> that >>>>>> this was just a t that hadn't been dotted, but I'm really not sure how >>>>>> to go >>>>>> about showing that the choice doesn't matter. >>>>>> >>>>>>> Do you have a example where it is significant? >>>>>> >>>>>> No, at least not yet? Do you have a demonstration that there are none? >>>>> >>>>> An explanation: >>>>> >>>>> Evaluation of a projection results in a solution sequence that can contains >>>>> only variables of the projection and no others. >>>>> >>>>> For any algebra expression, replacing a variable systematically with a fresh >>>>> variable has a visible effect as a change in the solution sequence >>>>> binding for >>>>> that variable. >>>>> >>>>> You can't find the name of a variable during the evaluation of a SPARQL >>>>> query >>>>> (because graph patterns and solution modifiers are not available as >>>>> datastructures to access). It's call-by-value and even the special forms >>>>> like >>>>> IF, and COALESCE don't expose the variable name because they only change >>>>> when >>>>> arguments are evaluated, not pass down the argument expression itself. >>>>> >>>>> (( >>>>> The best I can think of is expressions that associate a value with a >>>>> variable >>>>> like >>>>> >>>>> IF ( bound(?x) , "x", "not x") >>>>> >>>>> but that's more of an alias, not the variable name itself. Renaming ?x is >>>>> not >>>>> observable and the alias is unchanged. >>>>> )) >>>>> >>>>> For a projection, one can rename the unprojected variables of the expression >>>>> over which the project operates because the renaming changes the solution >>>>> sequence before projection only on variables that projection does not >>>>> expose. >>>>> >>>>> It is not visible in the solution sequence result of the projection. >>>>> >>>>> Another way of thinking about it is that the binding due to unprojected >>>>> variables are not accessible to operations that use the result of the >>>>> projection. >>>>> >>>>>> >>>>>>>> The result of PrjMap(X) depends on the order in which the projections >>>>>>>> in X are chosen, but this order is not specified. >>>>>>> >>>>>>> Yes - it would be better to define the order and the outcome is order >>>>>>> dependent with respect to replaced variables but does it make a >>>>>>> difference? It >>>>>>> is only variables restricted by scope that are changed. >>>>>> >>>>>> The mappings reach down into sub-expressions and change disconnected >>>>>> variables >>>>>> there so they violate the scoping of SPARQL. >>>>> >>>>> In fact, there is a design choice here - either choice is workable, both >>>>> have >>>>> use cases for different audiences. It's not a technical issue - it's a >>>>> judgement. >>>>> >>>>> The other design is one where there is no remapping variables and then the >>>>> EXISTS insertion of the current row would affect the disconnected variables. >>>>> >>>>> It violates the property of SPARQL evaluation that renaming inside >>>>> project of >>>>> disconnected variables does not matter anywhere else. Optimizers and >>>>> parallel >>>>> execution exploit that property. (I got a related question from someone >>>>> about >>>>> this last week - they are implementing some kind of optimized evaluation and >>>>> wanted to discuss the details.) >>>>> >>>>>>> Do you have a case where it makes an observable difference? >>>>>> >>>>>> NO, at least not yet. Do you have a demonstration that there are none? >>>>>> >>>>>>> Would a bottom-up replacement be suitable? >>>>>> >>>>>> I think so. If you fixed the free variables for all the mappings then I >>>>>> think >>>>>> that a bottom-up replacement schedule would produce a unique result. This >>>>>> remains to be demonstrated but shouldn't be too hard. As well, none of the >>>>>> mappings would affect disconnected variables, I think. >>>>> >>>>> bottom-up is the safest (the fresh variables must be fresh across all the >>>>> renaming going on) and easiest to explain. >>>>> >>>>> The requirement is top-down one : to rename at first SELECT down every >>>>> branch >>>>> of the expression tree where the variable is hidden. If done top-down, >>>>> it is >>>>> renamed once. >>>>> >>>>> It is minimal renaming if done for each variable of scope of the current row >>>>> is considered separately. That makes it more complicated - it might be worth >>>>> non-definitional text to say this but the more direct definition is a >>>>> bottom-up walk; even left-to-right, bottom-up to give a unique walk order. >>>>> >>>>>> Of course, doing only this part doesn't solve the major problem. >>>>> >>>>> I'll leave the SHACL specific comments to the SHACL WG and comments list. >>>>> >>>>>> >>>>>>> Andy >>>>>>> >>>>>>> [1] >>>>>>> https://lists.w3.org/Archives/Public/public-rdf-shapes/2017Jan/0010.html >>>>>> >>>>>> It seems so obvious that the choice of variables does not matter, but >>>>>> thinking >>>>>> about how to demonstrate that this is so leads to lots of tricky bits. >>>>>> >>>>>> peter >>>>>> >> >
Received on Friday, 10 February 2017 13:21:58 UTC