Re: ISSUE-68: Simpler definition of pre-binding from Peter F. Patel-Schneider on 2016-04-18 (public-data-shapes-wg@w3.org from April 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Mon, 18 Apr 2016 08:39:58 -0700
To: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg@w3.org
Message-ID: <5714FFCE.9010908@gmail.com>

I see several problems with this wording:

1/ SPARQL only performs evaluation in certain situations.  For example, in

SELECT $this WHERE { $this :p ?that }

neither this nor that is evaluated at any time.

Some other wording is needed.

2/ This description appears to be written so that the $this and $predicate in
the subquery are affected even though they are effectively different
variables from the ?this and ?predicate in the main query.

SELECT $this ($this AS ?subject) $predicate
WHERE {
 {
  SELECT (COUNT(?value) AS ?count)
  WHERE {
   $this $predicate ?value .
  }
 }
 FILTER (?count < $minCount)
}

Is this what pre-binding in SPARQL is supposed to do?  If not, some other term
should be used.

Can SPARQL implementations do this at all in an interoperable fashion?
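
To make point 2 concrete, here is a minimal Python sketch of naive textual
substitution, which is only one possible reading of the wording under
discussion. The helper substitute_variable and the IRI
<http://example.org/node1> are made up for this illustration; neither comes
from the SHACL draft or from any SPARQL implementation.

```python
import re


def substitute_variable(query: str, name: str, value: str) -> str:
    """Replace every ?name / $name token in the query text with value.

    Hypothetical helper: purely textual, with no knowledge of SPARQL scoping.
    """
    pattern = re.compile(r"[?$]" + re.escape(name) + r"\b")
    # A function replacement avoids backslash processing in `value`.
    return pattern.sub(lambda _match: value, query)


QUERY = """SELECT $this ($this AS ?subject) $predicate
WHERE {
 {
  SELECT (COUNT(?value) AS ?count)
  WHERE {
   $this $predicate ?value .
  }
 }
 FILTER (?count < $minCount)
}"""

# Assumed example value; any IRI would show the same effect.
rewritten = substitute_variable(QUERY, "this", "<http://example.org/node1>")
print(rewritten)
```

In the rewritten text the projection begins with a bare IRI, which is not a
legal SELECT item in SPARQL 1.1, and the $this inside the nested SELECT has
been replaced as well, even though that variable is not projected out of the
subquery and so is distinct from the outer one.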

3/ What sorts of values are allowed in pre-binding?

4/ When do "SHACL processors" evaluate occurrences of variables?

5/ How does this work with blank nodes?



I'm having a hard time finding descriptions of pre-binding for SPARQL
implementations.  Are there any decent descriptions available?


peter







On 04/18/2016 06:49 AM, Holger Knublauch wrote:
> Oops, yes. I should have taken out the "prior...". Let me try again:
> 
> <span class="term">Pre-binding</span> a variable with a value means that
> the SHACL processor needs to evaluate all
> occurrences of variables with that same name
> (including occurrences in inner scopes and nested SELECT queries)
> so that they have the provided value.
> In other words, whenever a SPARQL processor evaluates a pre-bound
> variable, it must use the given value.
> 
> I don't see why the term "evaluation time" would be unclear. A SPARQL engine
> evaluates a query and this happens during a process that takes time.
> 
> I replaced the term "substitution", so that people don't assume query text
> replacement.
> 
> Does the second sentence "In other words..." help or shall I delete that?
> 
> Thanks,
> Holger
> 
> 
> On 18/04/2016 21:57, Peter F. Patel-Schneider wrote:
>> I don't see how this wording, which appears to be
>>
>> <span class="term">pre-binding</span> a variable with a value means that,
>> prior to evaluating a query, the SHACL processor needs to substitute all
>> occurrences of variables with the same name at evaluation time (including
>> inner scopes and nested SELECT queries) with the provided value.  In other
>> words, whenever a SPARQL processor evaluates a pre-bound variable, it must
>> use the given value.
>>
>> can be considered to be coherent.
>>
>>
>> What is "evaluation time"?  It is not defined anywhere.
>>
>> How  can "prior to evaluating a query" something happen "at evaluation time"?
>>
>> How can substitution happen at evaluation time at all?
>>
>>
>> peter
>>
>>
>>
>>
>> On 04/17/2016 09:39 PM, Holger Knublauch wrote:
>>> Updated definition here:
>>>
>>> https://github.com/w3c/data-shapes/commit/3ec678b057a50e1911e9ac93b77df394bf1e45ef
>>>
>>>
>>> Main paragraph is now:
>>>
>>> pre-binding a variable with a value means that, prior to evaluating a query,
>>> the SHACL processor needs to substitute all occurrences of variables with the
>>> same name at evaluation time (including inner scopes and nested SELECT
>>> queries) with the provided value. In other words, whenever a SPARQL processor
>>> evaluates a pre-bound variable, it must use the given value.
>>>
>>> On 18/04/2016 12:27, Peter F. Patel-Schneider wrote:
>>>> There are several problems here.
>>>>
>>>> 1/ It is unclear what is meant by an occurrence of a variable.   Can there be
>>>> two different variables with the same name in a SPARQL query, as in
>>>> programming languages?
>>> I have changed the prose to clarify that we mean variables with the same name
>>> (including those from nested SELECTs).
>>>
>>>> 2/ This definition of pre-binding appears to be different from other
>>>> definitions of pre-binding and different from previous definitions of
>>>> pre-bindings in the SHACL document.  I found a few descriptions of
>>>> pre-binding.  The SPIN submission has one that is very different from this
>>>> description.  Jena appears to have query solution maps, which appear to
>>>> be very different.
>>>>
>>>> If SHACL is going to be using something that is different from the usual
>>>> meaning of pre-binding then it should not be calling it pre-binding.
>>> There is no established definition of this term anywhere. No other W3C spec
>>> uses it. I believe we are permitted to define it, and our definition is local
>>> to our document anyway. I also believe most terms will already have a usage
>>> somewhere else, so we may always conflict. Could you propose a non-conflicting
>>> term?
>>>
>>>> 3/ The discussion of pre-binding in 6.2.1 does not match the substitution
>>>> description.
>>> I have deleted the offending sentence and left only the reference to the
>>> appendix.
>>>
>>>> 4/ Textual substitution before SPARQL execution is different from
>>>> "whenever a SPARQL processor evaluates a pre-bound variable [...] it must
>>>> use the given value" because some variable mentions in SPARQL code do not
>>>> evaluate the variable.
>>> Not sure what you mean here. The spec is no longer referencing textual
>>> substitution. So has this gone away?
>>>
>>>> 5/ Substitution will produce illegal SPARQL for all of the SPARQL definitions
>>>> of constraint components.
>>> This is not relevant because the spec does not produce new SPARQL. It operates
>>> "at evaluation time", and I have clarified this in the wording.
>>>
>>>> 6/ When the substituted value is a blank node, it will not have the desired
>>>> meaning.
>>> Why not?
>>>
>>>
>>>> peter
>>>>
>>>> PS:  By the way, in SPARQL the ? or $ is not part of the variable so it is
>>>> not quite correct to talk about variables that start with $.
>>> Fixed.
>>>
>>> Thanks
>>> Holger
>>>
>>>
>>>>
>>>>
>>>> On 04/10/2016 05:18 PM, Holger Knublauch wrote:
>>>>> (Moved back into an ISSUE-68 thread)
>>>>>
>>>>> On 9/04/2016 0:11, Peter F. Patel-Schneider wrote:
>>>>>>>> I had thought that pre-binding was the easy one.  To do pre-binding you
>>>>>>>> first need to extend SPARQL so that blank nodes can be used in SPARQL
>>>>>>>> queries, i.e., that if you have access to an RDF graph you can extract
>>>>>>>> identifiers from that graph and use these identifiers in a SPARQL
>>>>>>>> query just as if they were IRIs.  Then pre-binding just augments the
>>>>>>>> (outer) SPARQL query with a VALUES construct that binds variables to
>>>>>>>> values.
>>>>>>>>
>>>>>>>> However, apparently this is not the case, as the current document makes
>>>>>>>> pre-binding out to be something quite different.  I do not have the
>>>>>>>> expertise to fix all the problems with the treatment of pre-binding in
>>>>>>>> the current document but I have pointed out a number of problems in it.
>>>>>>> This is ISSUE-68. I tried various ways of responding to your concerns,
>>>>>>> but you were not happy with either. And I agree this is work in
>>>>>>> progress. I would like to be able to finish this once and for all, but
>>>>>>> always other things pop up in between. You are raising many other ISSUEs
>>>>>>> including a full-blown counter proposal that would replace basically
>>>>>>> everything, and at the same time put pressure on me to not do my
>>>>>>> homework. It shouldn't come as a surprise that I never have time if I am
>>>>>>> forced to spend my time responding to all your other issues. Meanwhile,
>>>>>>> nobody else in the group steps up to this task either. The last time I
>>>>>>> looked into pre-binding a few weeks ago, I was experimenting with the
>>>>>>> syntax transform package in Jena. I found a bug that had to be fixed
>>>>>>> first, halting my progress:
>>>>>>>
>>>>>>> https://github.com/apache/jena/commit/bc5ace0e9460ae979079532f610a88b6363e96e5
>>>>>>>
>>>>>>>
>>>>>>> I then went on vacation and had plenty of other TopQuadrant work on my
>>>>>>> plate. I will try to get back to this topic soon.
>>>>>>>
>>>>>>> At the same time I still do not understand your problem with the
>>>>>>> semantics of pre-binding. Simply using VALUES is not going to work,
>>>>>>> because we need to be able to walk into nested scopes and even nested
>>>>>>> SELECT queries. I had explained this before. Not sure why you keep
>>>>>>> repeating the same issue.
>>>>>> Pre-binding is currently defined in a very complex manner.
>>>>>>
>>>>>> There is an initial substitution into SPARQL code.  This substitution
>>>>>> changes the behaviour of the SPARQL code in many different ways.  First
>>>>>> there is the change that would occur if the affected variable had a
>>>>>> top-level binding.  However, there are other changes.  Distinct variables
>>>>>> with the same name in sub-queries are also changed.  This changes the
>>>>>> meaning of sub-queries in a way different than that of a top-level
>>>>>> binding.  Second, the substitution makes certain bits of previously-valid
>>>>>> syntax invalid, including bindings, GRAPH constructs, the bound function,
>>>>>> GROUP BY constructs, and ORDER BY constructs.  Each of these has to be
>>>>>> fixed up by a set of compensating code transformations.  There is no
>>>>>> certainty that there are not other compensations that need to be made to
>>>>>> handle invalid syntax caused by the substitution.  I can easily think of
>>>>>> several: simple variables in select clauses, variables in group
>>>>>> conditions, variables in bindings, and variables in data blocks.  There
>>>>>> could easily be others.  There is also no certainty that the initial
>>>>>> substitution does not change the meaning of SPARQL code.  I pointed out
>>>>>> above that it does change the meaning of subqueries but there could
>>>>>> easily be other changes.
>>>>>>
>>>>>> Blank nodes then add another complication.  The current document does
>>>>>> not give an actual method for handling pre-bound blank nodes.  The
>>>>>> document suggests that using an algebra approach would work and so would
>>>>>> a substitution approach.  However, there are no details of how to do
>>>>>> either and no specification of what either should actually do.
>>>>> Ok, I have switched to a minimal yet precise definition of pre-binding now:
>>>>>
>>>>>              <p>
>>>>>                  <span class="term">pre-binding</span> a variable with a
>>>>>                  value means that, prior to evaluating a query, the SHACL
>>>>>                  processor needs to substitute all occurrences of the
>>>>>                  variable in the query (including inner scopes and nested
>>>>>                  SELECT queries) with the provided value.
>>>>>                  In other words, whenever a SPARQL processor evaluates a
>>>>>                  pre-bound variable, it must use the given value.
>>>>>              </p>
>>>>>
>>>>> This avoids talking about implementation details. Informally, I could add
>>>>> that possible implementation strategies are
>>>>> - use of VALUES (in simple cases)
>>>>> - Algebra manipulation (as done by Jena setInitialBindings)
>>>>> - internal Syntax tree manipulation (as done by Jena syntaxtransforms)
>>>>> - run-time variable substitution (as done by Sesame)
>>>>>
>>>>> This definition eliminates the bnode issues and other problems that you have
>>>>> mentioned. I believe it is sufficiently precise to explain the meaning to
>>>>> users and guide implementers, without over-complicating it.
>>>>>
>>>>> What else is missing?
>>>>>
>>>>> Holger
>>>>>
>>>>>
> 
>
Received on Monday, 18 April 2016 15:40:32 UTC