- From: Andy Seaborne <andy@apache.org>
- Date: Sat, 11 Feb 2017 15:36:41 +0000
- To: public-sparql-exists@w3.org
On 10/02/17 13:21, Peter F. Patel-Schneider wrote:
> I think that you have to show that no form is sensitive to the variable name
> in any other way, for example removing a solution because of some interaction
> between a variable name and anything else.

There is no way to get the name of a variable in expression evaluation.

Nor is there a way to introduce a variable during execution except in PrjMap,
where all the introduced variables are "fresh". Making it clear that "fresh"
means fresh as you go along, not a static choice at the start of PrjMap (which
I have already said is a good thing to say), means there is no clash.

There is no quoting of variables and no eval()-like functionality to create
expressions during evaluation either.

> And now that I think of it, I don't think that the analysis below is adequate
> as it doesn't show that the variable name doesn't somehow end up in the value
> side of the binding.

The value side of a binding is an RDFTerm or an error. Named variables are not
RDF terms.

A SPARQL function cannot return a variable or expression, nor can the pattern
binding operations create bindings from a variable to a variable or expression.

Peter - if you know of problem areas, can we have an example please?

    Andy

>
> peter
>
>
> On 02/10/2017 04:21 AM, Andy Seaborne wrote:
>>
>>
>> On 09/02/17 15:15, Peter F. Patel-Schneider wrote:
>>> OK, that's another piece of the puzzle.
>>>
>>> Any more pieces done?
>>
>> Are you referring to SHACL?
>>
>> For EXISTS, both your comments have answers.
>>
>>     Andy
>>
>>>
>>> peter
>>>
>>> On 02/09/2017 06:08 AM, Andy Seaborne wrote:
>>>> There are four forms that create bindings; all the rest recombine bindings
>>>> in solution mappings/sequences into other solution mappings/sequences but
>>>> do not create or modify bindings.
>>>>
>>>>
>>>> Basic Graph Pattern Matching
>>>> Property Path Patterns
>>>> GRAPH ?variable
>>>> AS
>>>>
>>>> All of these only create bindings according to the variable names used in
>>>> their form and do not introduce a free choice of name.
>>>>
>>>> A systematic renaming therefore changes the form and its outcome.
>>>>
>>>> eval(rename_algebra(X)) is equivalent to rename_solution(eval(X))
>>>>
>>>>     Andy
>>>>
>>>> On 08/02/17 18:20, Peter F. Patel-Schneider wrote:
>>>>> So the idea appears to be to show that the only effect of variable
>>>>> replacement in an algebra expression is to systematically change the
>>>>> variable in the resulting solution sequence. Now this hypothesis has to be
>>>>> turned into an inductive hypothesis and proven to be true for all of the
>>>>> SPARQL algebra. If this can be done, then it is easy to show the
>>>>> PrjMap(P,PV) is fine because the variable doesn't show up in the resulting
>>>>> solution sequence.
>>>>>
>>>>> peter
>>>>>
>>>>> On 02/07/2017 03:30 AM, Andy Seaborne wrote:
>>>>>>
>>>>>>
>>>>>> On 06/02/17 11:48, Peter F. Patel-Schneider wrote:
>>>>>>> On 02/06/2017 02:21 AM, Andy Seaborne wrote:
>>>>>>>> Peter has some comments on the SHACL comments list that relate to
>>>>>>>> EXISTS. [1]
>>>>>>>>
>>>>>>>>> There is no demonstration that the choice of fresh variables in the
>>>>>>>>> definition of PrjMap(P,PV) is insignificant.
>>>>>>>>
>>>>>>>> I hope we can explain in the document to clarify this, but I'm not clear
>>>>>>>> what you are looking for.
>>>>>>>
>>>>>>> That the result of the evaluation doesn't depend on the choice of free
>>>>>>> variables.
>>>>>>>
>>>>>>>> What would constitute such a demonstration?
>>>>>>>
>>>>>>> That's a good question. When I first noticed this problem I was thinking
>>>>>>> that this was just a t that hadn't been dotted, but I'm really not sure
>>>>>>> how to go about showing that the choice doesn't matter.
>>>>>>>
>>>>>>>> Do you have an example where it is significant?
>>>>>>>
>>>>>>> No, at least not yet? Do you have a demonstration that there are none?
>>>>>>
>>>>>> An explanation:
>>>>>>
>>>>>> Evaluation of a projection results in a solution sequence that can contain
>>>>>> only variables of the projection and no others.
>>>>>>
>>>>>> For any algebra expression, replacing a variable systematically with a
>>>>>> fresh variable has a visible effect as a change in the solution sequence
>>>>>> binding for that variable.
>>>>>>
>>>>>> You can't find the name of a variable during the evaluation of a SPARQL
>>>>>> query (because graph patterns and solution modifiers are not available as
>>>>>> data structures to access). It's call-by-value, and even the special forms
>>>>>> like IF and COALESCE don't expose the variable name because they only
>>>>>> change when arguments are evaluated; they do not pass down the argument
>>>>>> expression itself.
>>>>>>
>>>>>> ((
>>>>>> The best I can think of is expressions that associate a value with a
>>>>>> variable like
>>>>>>
>>>>>> IF ( bound(?x) , "x", "not x")
>>>>>>
>>>>>> but that's more of an alias, not the variable name itself. Renaming ?x is
>>>>>> not observable and the alias is unchanged.
>>>>>> ))
>>>>>>
>>>>>> For a projection, one can rename the unprojected variables of the
>>>>>> expression over which the project operates because the renaming changes
>>>>>> the solution sequence before projection only on variables that projection
>>>>>> does not expose.
>>>>>>
>>>>>> It is not visible in the solution sequence result of the projection.
>>>>>>
>>>>>> Another way of thinking about it is that the bindings due to unprojected
>>>>>> variables are not accessible to operations that use the result of the
>>>>>> projection.
>>>>>>
>>>>>>>
>>>>>>>>> The result of PrjMap(X) depends on the order in which the projections
>>>>>>>>> in X are chosen, but this order is not specified.
>>>>>>>>
>>>>>>>> Yes - it would be better to define the order and the outcome is order
>>>>>>>> dependent with respect to replaced variables but does it make a
>>>>>>>> difference? It is only variables restricted by scope that are changed.
>>>>>>>
>>>>>>> The mappings reach down into sub-expressions and change disconnected
>>>>>>> variables there so they violate the scoping of SPARQL.
>>>>>>
>>>>>> In fact, there is a design choice here - either choice is workable, both
>>>>>> have use cases for different audiences. It's not a technical issue - it's
>>>>>> a judgement.
>>>>>>
>>>>>> The other design is one where there is no remapping of variables and then
>>>>>> the EXISTS insertion of the current row would affect the disconnected
>>>>>> variables.
>>>>>>
>>>>>> That design violates the property of SPARQL evaluation that renaming
>>>>>> disconnected variables inside a project does not matter anywhere else.
>>>>>> Optimizers and parallel execution exploit that property. (I got a related
>>>>>> question from someone about this last week - they are implementing some
>>>>>> kind of optimized evaluation and wanted to discuss the details.)
>>>>>>
>>>>>>>> Do you have a case where it makes an observable difference?
>>>>>>>
>>>>>>> NO, at least not yet. Do you have a demonstration that there are none?
>>>>>>>
>>>>>>>> Would a bottom-up replacement be suitable?
>>>>>>>
>>>>>>> I think so. If you fixed the free variables for all the mappings then I
>>>>>>> think that a bottom-up replacement schedule would produce a unique
>>>>>>> result. This remains to be demonstrated but shouldn't be too hard. As
>>>>>>> well, none of the mappings would affect disconnected variables, I think.
>>>>>>
>>>>>> Bottom-up is the safest (the fresh variables must be fresh across all the
>>>>>> renaming going on) and easiest to explain.
>>>>>>
>>>>>> The requirement is a top-down one: to rename at the first SELECT down
>>>>>> every branch of the expression tree where the variable is hidden. If done
>>>>>> top-down, it is renamed once.
>>>>>>
>>>>>> It is minimal renaming if each variable in scope of the current row is
>>>>>> considered separately. That makes it more complicated - it might be worth
>>>>>> non-definitional text to say this but the more direct definition is a
>>>>>> bottom-up walk; even left-to-right, bottom-up to give a unique walk order.
>>>>>>
>>>>>>> Of course, doing only this part doesn't solve the major problem.
>>>>>>
>>>>>> I'll leave the SHACL specific comments to the SHACL WG and comments list.
>>>>>>
>>>>>>>
>>>>>>>> Andy
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://lists.w3.org/Archives/Public/public-rdf-shapes/2017Jan/0010.html
>>>>>>>
>>>>>>> It seems so obvious that the choice of variables does not matter, but
>>>>>>> thinking about how to demonstrate that this is so leads to lots of tricky
>>>>>>> bits.
>>>>>>>
>>>>>>> peter
>>>>>>>
>>>
>>
>
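
The renaming property stated in this thread, that eval(rename_algebra(X)) is
equivalent to rename_solution(eval(X)), can be illustrated with a small Python
sketch. Nothing here is taken from the thread or from any SPARQL
implementation: the three-operator toy algebra (BGP, Join, Project), the
encoding of terms as strings, and every function name are assumptions made
for illustration. The new variable name is also assumed to be fresh, that is,
to occur nowhere else in the expression.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

Triple = Tuple[str, str, str]      # terms starting with "?" are variables
Solution = Dict[str, str]          # variable name -> RDF term (a plain string here)

@dataclass(frozen=True)
class BGP:                         # hypothetical toy operator
    patterns: Tuple[Triple, ...]

@dataclass(frozen=True)
class Join:                        # hypothetical toy operator
    left: "Algebra"
    right: "Algebra"

@dataclass(frozen=True)
class Project:                     # hypothetical toy operator
    sub: "Algebra"
    variables: Tuple[str, ...]

Algebra = Union[BGP, Join, Project]

def match(pattern, triple, sol):
    """Extend sol so that pattern matches triple, or return None on a clash."""
    out = dict(sol)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if out.setdefault(p, t) != t:   # variable already bound to another term
                return None
        elif p != t:                        # constant term does not match
            return None
    return out

def evaluate(expr, graph):
    """Evaluate a toy algebra expression to a list of solution mappings."""
    if isinstance(expr, BGP):
        sols = [{}]
        for pat in expr.patterns:
            sols = [m for s in sols for t in graph
                    if (m := match(pat, t, s)) is not None]
        return sols
    if isinstance(expr, Join):
        left, right = evaluate(expr.left, graph), evaluate(expr.right, graph)
        return [{**l, **r} for l in left for r in right
                if all(r[k] == v for k, v in l.items() if k in r)]
    if isinstance(expr, Project):
        return [{v: s[v] for v in expr.variables if v in s}
                for s in evaluate(expr.sub, graph)]
    raise TypeError(f"unknown operator: {expr!r}")

def rename_algebra(expr, old, new):
    """Systematically replace variable old by new (new is assumed fresh)."""
    swap = lambda term: new if term == old else term
    if isinstance(expr, BGP):
        return BGP(tuple(tuple(swap(t) for t in pat) for pat in expr.patterns))
    if isinstance(expr, Join):
        return Join(rename_algebra(expr.left, old, new),
                    rename_algebra(expr.right, old, new))
    if isinstance(expr, Project):
        return Project(rename_algebra(expr.sub, old, new),
                       tuple(swap(v) for v in expr.variables))
    raise TypeError(f"unknown operator: {expr!r}")

def rename_solution(solutions, old, new):
    """Apply the same renaming to the variables of each solution mapping."""
    return [{(new if k == old else k): v for k, v in s.items()} for s in solutions]

# Toy data: ?x is bound inside the pattern but hidden by the projection.
graph = [(":a", ":p", ":b"), (":b", ":q", ":c")]
X = BGP((("?s", ":p", "?x"), ("?x", ":q", "?o")))
P = Project(X, ("?s", "?o"))

# Renaming commutes with evaluation ...
assert evaluate(rename_algebra(X, "?x", "?fresh"), graph) == \
       rename_solution(evaluate(X, graph), "?x", "?fresh")
# ... and is not observable once ?x has been projected away.
assert evaluate(rename_algebra(P, "?x", "?fresh"), graph) == evaluate(P, graph)
```

This only checks the property on one example; it is not the inductive argument
over the full SPARQL algebra that the thread asks for. It does show the shape
of that argument: each operator either creates bindings under the names written
in its form (BGP here) or recombines existing bindings without looking at the
names (Join, Project).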
Received on Saturday, 11 February 2017 15:37:17 UTC