- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Sun, 26 Feb 2017 12:16:19 -0800
- To: Andy Seaborne <andy@apache.org>, public-sparql-exists@w3.org
On 02/26/2017 08:32 AM, Andy Seaborne wrote: > With links this time. > > On 22/02/17 20:27, Peter F. Patel-Schneider wrote: >> All I am asking for is a real demonstration that inserting fresh variables is >> a benign operation and that the choice of which fresh variable to use is not >> significant. I view this demonstration as necessary for this proposal. > > I have said I will add text to explain the process - I haven't had time to do > that yet. I don't know what proof would satisfy you, particularly what can be > assumed to be already known, and I don't want to write one only to have it > rejected completely. Well, that's a problem then. I do sympathize, having been in the same situation more than once. However, you just have to put the work in an see what happens. I do think that things will work, or at least I did a couple of days ago. However, I noticed that the definition of the F function isn't going to work. F appears to be defined on all variable, so there are no fresh variables available. So something has to be done to fix up F. >> As >> the results of a SPARQL query do depend on the variable names in it, the >> demonstration is going to be different from the standard demonstration that >> renaming variables in a regular programming language does not change the >> result of a program. > > In this thread we have that: > > * The text for replacement needs to say that fresh variables are unique > through later replacements as well > > * The replacement process should be done bottom-up > > * The name of a variable is not accessible to expressions. [1] > > * Replacing a variable happens in only 4 places [2]. > The name makes the name in the solutions change. > > So a name change of a variable affects the solutions as a rename (c.f. > relational algebra rename operation). > > When applied to a projection, renaming unprojected variables is not observable. Yes, I agree that this all looks OK so far. However, again, I've been in this situation before and when I wrote everything up, crossing all the i's and dotting all the t's, I noticed that there were some holes that I had to patch. Fortunately this was doable. What you appear to be asking for is my unconditional blessing in advance. Sorry, but that's not going to be forthcoming. peter >> This is not my proposal. I view it as inferior partly because it does depend >> on this demonstration. I also view it as inferior as it does not meet the >> goal of not changing the meaning of SPARQL queries that are completely >> unproblematic with the current definition of EXISTS. For example, it >> introduces a shared variable to both sides of MINUS constructs, potentially >> changing their meaning. > > Both proposals make changes around MINUS from what the SPARQL 1.1 spec defines > - this is issue 4. > > Proposal-A changes GRAPH, UNION (and MINUS) queries [3] so that existing > EXISTS uses are changed. This is addressed by proposal-B by making the current > row available throughout the EXISTS pattern evaluation. > > Andy > > > [1] > https://lists.w3.org/Archives/Public/public-sparql-exists/2017Feb/0002.html > > [2] > https://lists.w3.org/Archives/Public/public-sparql-exists/2017Feb/0004.html > https://lists.w3.org/Archives/Public/public-sparql-exists/2017Feb/0005.html > > [3] > https://lists.w3.org/Archives/Public/public-sparql-exists/2016Sep/0027.html > and examples > https://lists.w3.org/Archives/Public/public-sparql-exists/2016Nov/0026.html > > > > >> >> I am not willing to further help with this proposal. >> >> Peter F. Patel-Schneider >> Nuance Communications >> >> >> On 02/14/2017 06:33 AM, Andy Seaborne wrote: >>> Would you like to draft that? >>> >>> I'm able to provide text that explains this I don't think I can to the level >>> you seem to be asking for. >>> >>> It is aliasing and follows from (1) replacing one named variable in an algebra >>> expression, which is a data structure, with a fresh variable, (2) evaluation >>> can not introduce named variables and (3) it's a project. >>> >>> For explanation, saying that the project does not expose the unprojected >>> variables, so the evaluation has the same results after renaming unprojected >>> variables. Much of this come down to "fresh variable" i.e. no clash with an >>> existing one. >>> >>> Andy >>> >>> On 11/02/17 21:28, Peter F. Patel-Schneider wrote: >>>> I don't know of any particular problem areas. If I did, I would say so. >>>> >>>> I also don't know that there are no problem areas. That's what the >>>> demonstration is for - to show that mapping to fresh variables doesn't >>>> affect >>>> anything that it shouoldn't. >>>> >>>> It looks as if the parts may all be in place. The next step is to write it >>>> all down in a formal proof. It may be that writing everything down formally >>>> will expose some missing bits. >>>> >>>> peter >>>> >>>> >>>> On 02/11/2017 07:36 AM, Andy Seaborne wrote: >>>>> >>>>> On 10/02/17 13:21, Peter F. Patel-Schneider wrote: >>>>>> I think that you have to show that no form is sensitive to the variable >>>>>> name >>>>>> in any other way, for example removing a solution because of some >>>>>> interaction >>>>>> between a variable name and anything else. >>>>> >>>>> There is no way to get the name of a variable in expression evaluation. >>>>> >>>>> Nor is there a way to introduce a variable during execution except PrjMap >>>>> when >>>>> all variables are "fresh" - so making the "fresh" clear it means as you go >>>>> along, not a static choice at the start of PrjMap (which I have already said >>>>> is a good thing to say) means there is no clash. >>>>> >>>>> There is no quoting variables and no eval()-like functionality to create >>>>> expressions during evaluation either. >>>>> >>>>>> And now that I think of it, I don't think that the analysis below is >>>>>> adequate >>>>>> as it doesn't show that the variable name doesn't somehow end up in the >>>>>> value >>>>>> side of the binding. >>>>> >>>>> The value side of a binding is an RDFTerm or an error. Named variables >>>>> are not >>>>> RDF terms. A SPARQL function can not return a variable or expression, >>>>> nor can >>>>> the pattern binding operations create bindings from variable to a >>>>> variable or >>>>> expression. >>>>> >>>>> Peter - if you know of problem areas, can we have an example please? >>>>> >>>>> Andy >>>>> >>>>>> >>>>>> peter >>>>>> >>>>>> >>>>>> On 02/10/2017 04:21 AM, Andy Seaborne wrote: >>>>>>> >>>>>>> >>>>>>> On 09/02/17 15:15, Peter F. Patel-Schneider wrote: >>>>>>>> OK, that's another piece of the puzzle. >>>>>>>> >>>>>>>> Any more pieces done? >>>>>>> >>>>>>> Are you referring to SHACL? >>>>>>> >>>>>>> For EXISTS, both your comments have answers. >>>>>>> >>>>>>> Andy >>>>>>> >>>>>>>> >>>>>>>> peter >>>>>>>> >>>>>>>> On 02/09/2017 06:08 AM, Andy Seaborne wrote: >>>>>>>>> There are four forms that create bindings, all the rest recombine >>>>>>>>> bindings in >>>>>>>>> solution mapping/sequences into other solution mapping/sequences but not >>>>>>>>> create or modify bindings. >>>>>>>>> >>>>>>>>> >>>>>>>>> Basic Graph Pattern Matching >>>>>>>>> Property Path Patterns >>>>>>>>> GRAPH ?variable >>>>>>>>> AS >>>>>>>>> >>>>>>>>> All of these only create bindings according to the variable names >>>>>>>>> used in >>>>>>>>> their form and do not introduce a free choice of by name. >>>>>>>>> >>>>>>>>> A systematic renaming therefore changes the form and it's outcome. >>>>>>>>> >>>>>>>>> eval(rename_algebra(X)) is equivalent to rename_solution(eval(X)) >>>>>>>>> >>>>>>>>> Andy >>>>>>>>> >>>>>>>>> On 08/02/17 18:20, Peter F. Patel-Schneider wrote: >>>>>>>>>> So the idea appears to be to show that the only effect of variable >>>>>>>>>> replacement >>>>>>>>>> in an algebra expression is to systematically change the variable in >>>>>>>>>> the >>>>>>>>>> resulting solution sequence. Now this hypothesis has to be turned >>>>>>>>>> into an >>>>>>>>>> inductive hypothesis and proven to be true for all of the SPARQL >>>>>>>>>> algebra. If >>>>>>>>>> this can be done, then it is easy to show the PrjMap(P,PV) is fine >>>>>>>>>> because >>>>>>>>>> the >>>>>>>>>> variable doesn't show up in the resulting solution sequence. >>>>>>>>>> >>>>>>>>>> peter >>>>>>>>>> >>>>>>>>>> On 02/07/2017 03:30 AM, Andy Seaborne wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 06/02/17 11:48, Peter F. Patel-Schneider wrote: >>>>>>>>>>>> On 02/06/2017 02:21 AM, Andy Seaborne wrote: >>>>>>>>>>>>> Peter has some comments on the SHACL comments list that relate to >>>>>>>>>>>>> EXISTS. >>>>>>>>>>>>> [1] >>>>>>>>>>>>> >>>>>>>>>>>>>> There is no demonstration that the choice of fresh variables in the >>>>>>>>>>>>>> definition of PrjMap(P,PV) is insignificant. >>>>>>>>>>>>> >>>>>>>>>>>>> I hope we can explain in the document to clarify this, but I'm not >>>>>>>>>>>>> clear >>>>>>>>>>>>> what >>>>>>>>>>>>> you are looking for. >>>>>>>>>>>> >>>>>>>>>>>> That the result of the evaluation doesn't depend on the choice of >>>>>>>>>>>> free >>>>>>>>>>>> variables. >>>>>>>>>>>> >>>>>>>>>>>>> What would constitute such a demonstration? >>>>>>>>>>>> >>>>>>>>>>>> That's a good question. When I first noticed this problem I was >>>>>>>>>>>> thinking >>>>>>>>>>>> that >>>>>>>>>>>> this was just a t that hadn't been dotted, but I'm really not sure >>>>>>>>>>>> how >>>>>>>>>>>> to go >>>>>>>>>>>> about showing that the choice doesn't matter. >>>>>>>>>>>> >>>>>>>>>>>>> Do you have a example where it is significant? >>>>>>>>>>>> >>>>>>>>>>>> No, at least not yet? Do you have a demonstration that there are >>>>>>>>>>>> none? >>>>>>>>>>> >>>>>>>>>>> An explanation: >>>>>>>>>>> >>>>>>>>>>> Evaluation of a projection results in a solution sequence that can >>>>>>>>>>> contains >>>>>>>>>>> only variables of the projection and no others. >>>>>>>>>>> >>>>>>>>>>> For any algebra expression, replacing a variable systematically with a >>>>>>>>>>> fresh >>>>>>>>>>> variable has a visible effect as a change in the solution sequence >>>>>>>>>>> binding for >>>>>>>>>>> that variable. >>>>>>>>>>> >>>>>>>>>>> You can't find the name of a variable during the evaluation of a >>>>>>>>>>> SPARQL >>>>>>>>>>> query >>>>>>>>>>> (because graph patterns and solution modifiers are not available as >>>>>>>>>>> datastructures to access). It's call-by-value and even the special >>>>>>>>>>> forms >>>>>>>>>>> like >>>>>>>>>>> IF, and COALESCE don't expose the variable name because they only >>>>>>>>>>> change >>>>>>>>>>> when >>>>>>>>>>> arguments are evaluated, not pass down the argument expression itself. >>>>>>>>>>> >>>>>>>>>>> (( >>>>>>>>>>> The best I can think of is expressions that associate a value with a >>>>>>>>>>> variable >>>>>>>>>>> like >>>>>>>>>>> >>>>>>>>>>> IF ( bound(?x) , "x", "not x") >>>>>>>>>>> >>>>>>>>>>> but that's more of an alias, not the variable name itself. Renaming >>>>>>>>>>> ?x is >>>>>>>>>>> not >>>>>>>>>>> observable and the alias is unchanged. >>>>>>>>>>> )) >>>>>>>>>>> >>>>>>>>>>> For a projection, one can rename the unprojected variables of the >>>>>>>>>>> expression >>>>>>>>>>> over which the project operates because the renaming changes the >>>>>>>>>>> solution >>>>>>>>>>> sequence before projection only on variables that projection does not >>>>>>>>>>> expose. >>>>>>>>>>> >>>>>>>>>>> It is not visible in the solution sequence result of the projection. >>>>>>>>>>> >>>>>>>>>>> Another way of thinking about it is that the binding due to >>>>>>>>>>> unprojected >>>>>>>>>>> variables are not accessible to operations that use the result of the >>>>>>>>>>> projection. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>> The result of PrjMap(X) depends on the order in which the >>>>>>>>>>>>>> projections >>>>>>>>>>>>>> in X are chosen, but this order is not specified. >>>>>>>>>>>>> >>>>>>>>>>>>> Yes - it would be better to define the order and the outcome is >>>>>>>>>>>>> order >>>>>>>>>>>>> dependent with respect to replaced variables but does it make a >>>>>>>>>>>>> difference? It >>>>>>>>>>>>> is only variables restricted by scope that are changed. >>>>>>>>>>>> >>>>>>>>>>>> The mappings reach down into sub-expressions and change disconnected >>>>>>>>>>>> variables >>>>>>>>>>>> there so they violate the scoping of SPARQL. >>>>>>>>>>> >>>>>>>>>>> In fact, there is a design choice here - either choice is workable, >>>>>>>>>>> both >>>>>>>>>>> have >>>>>>>>>>> use cases for different audiences. It's not a technical issue - it's a >>>>>>>>>>> judgement. >>>>>>>>>>> >>>>>>>>>>> The other design is one where there is no remapping variables and >>>>>>>>>>> then the >>>>>>>>>>> EXISTS insertion of the current row would affect the disconnected >>>>>>>>>>> variables. >>>>>>>>>>> >>>>>>>>>>> It violates the property of SPARQL evaluation that renaming inside >>>>>>>>>>> project of >>>>>>>>>>> disconnected variables does not matter anywhere else. Optimizers and >>>>>>>>>>> parallel >>>>>>>>>>> execution exploit that property. (I got a related question from >>>>>>>>>>> someone >>>>>>>>>>> about >>>>>>>>>>> this last week - they are implementing some kind of optimized >>>>>>>>>>> evaluation and >>>>>>>>>>> wanted to discuss the details.) >>>>>>>>>>> >>>>>>>>>>>>> Do you have a case where it makes an observable difference? >>>>>>>>>>>> >>>>>>>>>>>> NO, at least not yet. Do you have a demonstration that there are >>>>>>>>>>>> none? >>>>>>>>>>>> >>>>>>>>>>>>> Would a bottom-up replacement be suitable? >>>>>>>>>>>> >>>>>>>>>>>> I think so. If you fixed the free variables for all the mappings >>>>>>>>>>>> then I >>>>>>>>>>>> think >>>>>>>>>>>> that a bottom-up replacement schedule would produce a unique result. >>>>>>>>>>>> This >>>>>>>>>>>> remains to be demonstrated but shouldn't be too hard. As well, >>>>>>>>>>>> none of >>>>>>>>>>>> the >>>>>>>>>>>> mappings would affect disconnected variables, I think. >>>>>>>>>>> >>>>>>>>>>> bottom-up is the safest (the fresh variables must be fresh across all >>>>>>>>>>> the >>>>>>>>>>> renaming going on) and easiest to explain. >>>>>>>>>>> >>>>>>>>>>> The requirement is top-down one : to rename at first SELECT down every >>>>>>>>>>> branch >>>>>>>>>>> of the expression tree where the variable is hidden. If done >>>>>>>>>>> top-down, >>>>>>>>>>> it is >>>>>>>>>>> renamed once. >>>>>>>>>>> >>>>>>>>>>> It is minimal renaming if done for each variable of scope of the >>>>>>>>>>> current row >>>>>>>>>>> is considered separately. That makes it more complicated - it might be >>>>>>>>>>> worth >>>>>>>>>>> non-definitional text to say this but the more direct definition is a >>>>>>>>>>> bottom-up walk; even left-to-right, bottom-up to give a unique walk >>>>>>>>>>> order. >>>>>>>>>>> >>>>>>>>>>>> Of course, doing only this part doesn't solve the major problem. >>>>>>>>>>> >>>>>>>>>>> I'll leave the SHACL specific comments to the SHACL WG and comments >>>>>>>>>>> list. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Andy >>>>>>>>>>>>> >>>>>>>>>>>>> [1] >>>>>>>>>>>>> https://lists.w3.org/Archives/Public/public-rdf-shapes/2017Jan/0010.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> It seems so obvious that the choice of variables does not matter, but >>>>>>>>>>>> thinking >>>>>>>>>>>> about how to demonstrate that this is so leads to lots of tricky >>>>>>>>>>>> bits. >>>>>>>>>>>> >>>>>>>>>>>> peter >>>>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>
Received on Sunday, 26 February 2017 20:49:09 UTC