- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Mon, 18 Apr 2016 19:27:45 -0700
- To: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg@w3.org
On 04/18/2016 06:06 PM, Holger Knublauch wrote:
> On 19/04/2016 1:39, Peter F. Patel-Schneider wrote:
>> I see several problems with this wording:
>>
>> 1/ SPARQL only performs evaluation in certain situations. For example, in
>>
>>     SELECT $this WHERE { $this :p ?that }
>>
>> neither this nor that are evaluated at any time.
>
> I believe both will get evaluated. The query will only return a row if there
> is at least one match for the basic graph pattern ?this :p ?that. Evaluating
> the BGP will evaluate the variables.

As far as I can tell, SPARQL does not evaluate variables in basic graph
patterns. Consider the example from 18.2.3:

Example: Pattern involving BIND:

    { ?s :p ?v . BIND (2*?v AS ?v2) ?s :p1 ?v2 }

    Join( Extend( BGP(?s :p ?v), ?v2, 2*?v) , BGP(?s :p1 ?v2) )

Note that variables are passed into the BGP expressions without any
evaluation happening.

>> Some other wording is needed.
>>
>> 2/ This description appears to be written so that the $this and $predicate in
>> the subquery are affected even though they are effectively different
>> variables from the ?this and ?predicate in the main query.
>>
>> SELECT $this ($this AS ?subject) $predicate
>> WHERE {
>>     {
>>         SELECT (COUNT(?value) AS ?count)
>>         WHERE {
>>             $this $predicate ?value .
>>         }
>>     }
>>     FILTER (?count < $minCount)
>> }
>>
>> Is this what pre-binding in SPARQL is supposed to do? If not, some other
>> term should be used.
>
> Yes, this is what pre-binding in SPARQL is supposed to do. (And I believe we
> have talked about this many times now, and there is even an explicit sentence
> about it).

The initial wording stated, I think, that pre-binding was like using BIND (or
maybe VALUES). This is very different. It would be nice to have some
indication that this is indeed what is done in SPARQL implementations.

>> Can SPARQL implementations do this at all in an interoperable fashion?
>
> Yes, implementations for this exist.

>> 3/ What sorts of values are allowed in pre-binding?
>
> Any RDF node.
> Since I don't exclude any node kinds, this is hopefully clear.

But only RDF terms? I don't see any wording to this effect but maybe there is
no way to get anything except an RDF term into here.

>> 4/ When do "SHACL processors" evaluate occurrences of variables?
>
> Changed to "SPARQL processors". That was a typo.
>
>> 5/ How does this work with blank nodes?
>
> Bnodes are nodes like any other here. The "substitution" does not go through
> the SPARQL syntax, so it can directly access the node object (e.g. Node
> instance in Jena)

OK

> Latest version online:
>
> https://github.com/w3c/data-shapes/commit/1457ae924171fae7536102bbcabddfc4f9509d9f
>
> Holger
>
>> I'm having a hard time finding descriptions of pre-binding for SPARQL
>> implementations. Are there any decent descriptions available?
>>
>> peter
>>
>> On 04/18/2016 06:49 AM, Holger Knublauch wrote:
>>> Oops, yes. I should have taken out the "prior...". Let me try again:
>>>
>>> <span class="term">Pre-binding</span> a variable with a value means that
>>> the SHACL processor needs to evaluate all
>>> occurrences of variables with that same name
>>> (including occurrences in inner scopes and nested SELECT queries)
>>> so that they have the provided value.
>>> In other words, whenever a SPARQL processor evaluates a pre-bound
>>> variable, it must use the given value.
>>>
>>> I don't see why the term "evaluation time" would be unclear. A SPARQL engine
>>> evaluates a query and this happens during a process that takes time.
>>>
>>> I replaced the term "substitution", so that people don't assume query text
>>> replacement.
>>>
>>> Does the second sentence "In other words..." help or shall I delete that?
>>>
>>> Thanks,
>>> Holger
>>>
>>> On 18/04/2016 21:57, Peter F. Patel-Schneider wrote:
>>>> I don't see how this wording, which appears to be
>>>>
>>>> <span class="term">pre-binding</span> a variable with a value means that,
>>>> prior to evaluating a query, the SHACL processor needs to substitute all
>>>> occurrences of variables with the same name at evaluation time (including
>>>> inner scopes and nested SELECT queries) with the provided value. In other
>>>> words, whenever a SPARQL processor evaluates a pre-bound variable, it must
>>>> use the given value.
>>>>
>>>> can be considered to be coherent.
>>>>
>>>> What is "evaluation time"? It is not defined anywhere.
>>>>
>>>> How can "prior to evaluating a query" something happen "at evaluation time"?
>>>>
>>>> How can substitution happen at evaluation time at all?
>>>>
>>>> peter
>>>>
>>>> On 04/17/2016 09:39 PM, Holger Knublauch wrote:
>>>>> Updated definition here:
>>>>>
>>>>> https://github.com/w3c/data-shapes/commit/3ec678b057a50e1911e9ac93b77df394bf1e45ef
>>>>>
>>>>> Main paragraph is now:
>>>>>
>>>>> pre-binding a variable with a value means that, prior to evaluating a query,
>>>>> the SHACL processor needs to substitute all occurrences of variables with
>>>>> the same name at evaluation time (including inner scopes and nested SELECT
>>>>> queries) with the provided value. In other words, whenever a SPARQL
>>>>> processor evaluates a pre-bound variable, it must use the given value.
>>>>>
>>>>> On 18/04/2016 12:27, Peter F. Patel-Schneider wrote:
>>>>>> There are several problems here.
>>>>>>
>>>>>> 1/ It is unclear what is meant by an occurrence of a variable. Can
>>>>>> there be two different variables with the same name in a SPARQL query,
>>>>>> as in programming languages?
>>>>> I have changed the prose to clarify that we mean variables with the same
>>>>> name (including those from nested SELECTs).
>>>>>
>>>>>> 2/ This definition of pre-binding appears to be different from other
>>>>>> definitions of pre-binding and different from previous definitions of
>>>>>> pre-bindings in the SHACL document. I found a few descriptions of
>>>>>> pre-binding. The SPIN submission has one that is very different from this
>>>>>> description. Jena appears to have query solution maps which appear to be
>>>>>> very different.
>>>>>>
>>>>>> If SHACL is going to be using something that is different from the usual
>>>>>> meaning of pre-binding then it should not be calling it pre-binding.
>>>>> There is no established definition of this term anywhere. No other W3C spec
>>>>> uses it. I believe we are permitted to define it, and our definition is
>>>>> local to our document anyway. I also believe most terms will already have
>>>>> a usage somewhere else, so we may always conflict. Could you propose a
>>>>> non-conflicting term?
>>>>>
>>>>>> 3/ The discussion of pre-binding in 6.2.1 does not match the substitution
>>>>>> description.
>>>>> I have deleted the offending sentence and left only the reference to the
>>>>> appendix.
>>>>>
>>>>>> 4/ Textual substitution before SPARQL execution is different from
>>>>>> "whenever a SPARQL processor evaluates a pre-bound variable [...] it must
>>>>>> use the given value" because some variable mentions in SPARQL code do not
>>>>>> evaluate the variable.
>>>>> Not sure what you mean here. The spec is no longer referencing textual
>>>>> substitution. So has this gone away?
>>>>>
>>>>>> 5/ Substitution will produce illegal SPARQL for all of the SPARQL
>>>>>> definitions of constraint components.
>>>>> This is not relevant because the spec does not produce new SPARQL. It
>>>>> operates "at evaluation time", and I have clarified this in the wording.
>>>>>
>>>>>> 6/ When the substituted value is a blank node, it will not have the desired
>>>>>> meaning.
>>>>> Why not?
>>>>>
>>>>>> peter
>>>>>>
>>>>>> PS: By the way, in SPARQL the ? or $ is not part of the variable so it is
>>>>>> not quite correct to talk about variables that start with $.
>>>>> Fixed.
>>>>>
>>>>> Thanks
>>>>> Holger
>>>>>
>>>>>> On 04/10/2016 05:18 PM, Holger Knublauch wrote:
>>>>>>> (Moved back into an ISSUE-68 thread)
>>>>>>>
>>>>>>> On 9/04/2016 0:11, Peter F. Patel-Schneider wrote:
>>>>>>>>>> I had thought that pre-binding was the easy one. To do pre-binding you
>>>>>>>>>> first need to extend SPARQL so that blank nodes can be used in SPARQL
>>>>>>>>>> queries, i.e., that if you have access to an RDF graph you can extract
>>>>>>>>>> identifiers from that graph and use these identifiers in a SPARQL
>>>>>>>>>> query just as if they were IRIs. Then pre-binding just augments the
>>>>>>>>>> (outer) SPARQL query with a VALUES construct that binds variables to
>>>>>>>>>> values.
>>>>>>>>>>
>>>>>>>>>> However, apparently this is not the case, as the current document makes
>>>>>>>>>> pre-binding out to be something quite different. I do not have the
>>>>>>>>>> expertise to fix all the problems with the treatment of pre-binding in
>>>>>>>>>> the current document but I have pointed out a number of problems in it.
>>>>>>>>> This is ISSUE-68. I tried various ways of responding to your concerns,
>>>>>>>>> but you were not happy with either. And I agree this is work in
>>>>>>>>> progress. I would like to be able to finish this once and for all, but
>>>>>>>>> always other things pop up in between. You are raising many other
>>>>>>>>> ISSUEs including a full-blown counter proposal that would replace
>>>>>>>>> basically everything, and at the same time put pressure on me to not
>>>>>>>>> do my homework. It shouldn't come as a surprise that I never have time
>>>>>>>>> if I am forced to spend my time responding to all your other issues.
>>>>>>>>> Meanwhile, nobody else in the group steps up to this task either. The
>>>>>>>>> last time I looked into pre-binding a few weeks ago, I was
>>>>>>>>> experimenting with the syntax transform package in Jena. I found a bug
>>>>>>>>> that had to be fixed first, halting my progress:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/jena/commit/bc5ace0e9460ae979079532f610a88b6363e96e5
>>>>>>>>>
>>>>>>>>> I then went on vacation and had plenty of other TopQuadrant work on my
>>>>>>>>> plate. I will try to get back to this topic soon.
>>>>>>>>>
>>>>>>>>> At the same time I still do not understand your problem with the
>>>>>>>>> semantics of pre-binding. Simply using VALUES is not going to work,
>>>>>>>>> because we need to be able to walk into nested scopes and even nested
>>>>>>>>> SELECT queries. I had explained this before. Not sure why you keep
>>>>>>>>> repeating the same issue.
>>>>>>>> Pre-binding is currently defined in a very complex manner.
>>>>>>>>
>>>>>>>> There is an initial substitution into SPARQL code. This substitution
>>>>>>>> changes the behaviour of the SPARQL code in many different ways. First
>>>>>>>> there is the change that would occur if the affected variable had a
>>>>>>>> top-level binding. However, there are other changes. Distinct variables
>>>>>>>> with the same name in sub-queries are also changed. This changes the
>>>>>>>> meaning of sub-queries in a way different than that of a top-level
>>>>>>>> binding. Second, the substitution makes certain bits of previously-valid
>>>>>>>> syntax invalid, including bindings, GRAPH constructs, the bound function,
>>>>>>>> GROUP BY constructs, and ORDER BY constructs. Each of these have to be
>>>>>>>> fixed up by a set of compensating code transformations.
>>>>>>>> There is no certainty that there are not other compensations that need
>>>>>>>> to be made to handle invalid syntax caused by the substitution. I can
>>>>>>>> easily think of several - simple variables in select clauses, variables
>>>>>>>> in group conditions, variables in bindings, and variables in data blocks.
>>>>>>>> There could easily be others. There is also no certainty that the initial
>>>>>>>> substitution does not change the meaning of SPARQL code. I pointed out
>>>>>>>> above that it does change the meaning of subqueries but there could
>>>>>>>> easily be other changes.
>>>>>>>>
>>>>>>>> Blank nodes then add another complication. The current document does
>>>>>>>> not give an actual method for handling pre-bound blank nodes. The
>>>>>>>> document suggests that using an algebra approach would work and so would
>>>>>>>> a substitution approach. However, there are no details of how to do
>>>>>>>> either and no specification of what either should actually do.
>>>>>>> Ok, I have switched to a minimal yet precise definition of pre-binding
>>>>>>> now:
>>>>>>>
>>>>>>> <p>
>>>>>>> <span class="term">pre-binding</span> a variable with a value
>>>>>>> means that, prior to evaluating a query,
>>>>>>> the SHACL processor needs to substitute all occurrences of the
>>>>>>> variable in the query (including
>>>>>>> inner scopes and nested SELECT queries) with the provided value.
>>>>>>> In other words, whenever a SPARQL processor evaluates a
>>>>>>> pre-bound variable, it must use the given value.
>>>>>>> </p>
>>>>>>>
>>>>>>> This avoids talking about implementation details.
>>>>>>> Informally, I could add that possible implementation strategies are
>>>>>>> - use of VALUES (in simple cases)
>>>>>>> - Algebra manipulation (as done by Jena setInitialBindings)
>>>>>>> - internal Syntax tree manipulation (as done by Jena syntaxtransforms)
>>>>>>> - run-time variable substitution (as done by Sesame)
>>>>>>>
>>>>>>> This definition eliminates the bnode issues and other problems that you
>>>>>>> have mentioned. I believe it is sufficiently precise to explain the
>>>>>>> meaning to users and guide implementers, without over-complicating it.
>>>>>>>
>>>>>>> What else is missing?
>>>>>>>
>>>>>>> Holger
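
The distinction argued over in this thread, between binding a variable only at the top level (as a VALUES clause on the outer query would) and pre-binding it in every scope including nested SELECT queries, can be sketched with a toy evaluator. This is only a minimal illustration in Python, not how Jena, Sesame, or any SPARQL engine works; the sample graph, the `match_bgp` helper, and the `MIN_COUNT` value are all hypothetical and chosen to mirror the minCount query quoted above.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    # In SPARQL the leading ? or $ marks a variable but is not part of its name.
    return term[:1] in ("?", "$")


def match_bgp(graph: List[Triple], patterns: List[Triple],
              initial: Optional[Binding] = None) -> List[Binding]:
    """Toy basic-graph-pattern matcher; `initial` stands in for pre-bound values."""
    solutions: List[Binding] = [dict(initial or {})]
    for pattern in patterns:
        extended: List[Binding] = []
        for solution in solutions:
            for triple in graph:
                candidate = dict(solution)
                for term, value in zip(pattern, triple):
                    if is_var(term):
                        # Bind the variable, or check it against its existing value.
                        if candidate.setdefault(term[1:], value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


# Hypothetical sample data.
GRAPH: List[Triple] = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:knows", "ex:alice"),
    ("ex:bob", "ex:name", "Bob"),
]

# Pattern of the nested SELECT (COUNT(?value) AS ?count) from the quoted query.
INNER_PATTERN: List[Triple] = [("$this", "$predicate", "?value")]
PRE_BOUND: Binding = {"this": "ex:bob", "predicate": "ex:knows"}
MIN_COUNT = 2

# Pre-binding as described in the thread: the values reach the nested SELECT.
count_prebound = len(match_bgp(GRAPH, INNER_PATTERN, PRE_BOUND))

# Top-level binding only: the nested SELECT projects just ?count, so its
# $this and $predicate are separate, unbound variables and match every triple.
count_top_level = len(match_bgp(GRAPH, INNER_PATTERN))

# FILTER (?count < $minCount) then gives different outcomes.
violation_prebound = count_prebound < MIN_COUNT
violation_top_level = count_top_level < MIN_COUNT
```

Under these assumptions the pre-bound evaluation counts only the ex:knows values of ex:bob, while the top-level-only evaluation counts every triple in the graph, so the same FILTER reports a violation in the first case and none in the second.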
Received on Tuesday, 19 April 2016 02:28:19 UTC