Re: ISSUE-68 definition of pre-binding from Peter F. Patel-Schneider on 2016-03-22 (public-data-shapes-wg@w3.org from March 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Mon, 21 Mar 2016 17:01:32 -0700
To: Holger Knublauch <holger@topquadrant.com>, RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <56F08B5C.5000606@gmail.com>

On 03/21/2016 04:21 PM, Holger Knublauch wrote:
> On 22/03/2016 4:08, Peter F. Patel-Schneider wrote:
>> The definition of pre-binding in the current editors' draft says
>>
>> pre-binding a variable with a value means that, prior to evaluating a query,
>> the SHACL processor needs to substitute all occurrences of the variable in the
>> query (including inner scopes and nested SELECT queries) with the provided
>> value.
>>
>> This does not match my intuitions on how pre-binding should work.
>>
>> It may match what happens in practice, but I think that for this definition to
>> be acceptable there will have to be a determination that most SPARQL
>> implementations use this definition.
> 
> There are no existing implementations of SHACL, and while I agree it would be
> ideal if we could simply hook into existing implementations of pre-binding, I
> don't see how we can realistically make this a requirement.

I disagree here.  If what the SHACL spec calls pre-binding is different from
what SPARQL implementations do (and advertise) as pre-binding then we will be
in another situation where the words in the SHACL spec don't mean what readers
think they mean.  If SHACL is going to depend on something that it calls
pre-binding and many SPARQL implementations define something that they call
pre-binding then the two should match up.

> Having said this, Jena has 3 different implementations of pre-binding:
> 1) QueryExecution.setInitialBindings (used by SPIN)
> 2) Parameterized SPARQL strings (text-based, does not support bnodes)
> 3) Query syntax tree transform (this is closest to what SHACL would use)
> AFAIK Sesame uses a similar technique to 3, i.e. it inserts variable values
> into a parsed Query syntax tree. Then, whenever a variable is queried, the
> system will check if that ?var has a pre-bound value already.

Hmm.  Then it is probably better to not use the term pre-binding at all.  Call
it something else, like substitution.

It seems to me that substitution into the syntax tree is under-defined, just
like the definition in Appendix C is under-defined.  There are two notions of
variable identity possible here, one that is strictly based on names and one
where there may be two variables with the same name.  If the SHACL document
depends on variable identity then it will depend on which notion is being used.

The current text has this ambiguity in it.  I would read the definition as
saying that there are only two substitutions done in

  SELECT ?a
  WHERE { ?a ex:r ex:c .
          { SELECT ?b
            WHERE { ?b ex:r ?a .
                    { SELECT ?b
                      WHERE { ?b ex:q ?a } } } } }

because the innermost ?a is a different variable.
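
The difference between the two notions of variable identity can be sketched in
Python (not SPARQL, and not any implementation's actual algorithm). The Select
class, both substitute functions, and the value ex:v are hypothetical names
introduced for illustration. The scoped variant assumes one particular rule,
namely that a variable a subquery does not project is distinct from a
same-named variable outside it; other readings give other counts. Only
occurrences in triple patterns are counted, and the SELECT clause is left
untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Select:
    """Hypothetical, much-simplified model of a (sub)SELECT query."""
    projected: list                      # variable names in the SELECT clause
    triples: list                        # (subject, predicate, object) tuples
    subqueries: list = field(default_factory=list)


def _replace(triples, var, value):
    # Swap every term equal to `var` for `value` in each triple.
    return [tuple(value if term == var else term for term in t) for t in triples]


def substitute_by_name(query, var, value):
    """Name-based identity: replace `var` at every nesting depth."""
    subs = [substitute_by_name(s, var, value) for s in query.subqueries]
    return Select(query.projected, _replace(query.triples, var, value), subs)


def substitute_scoped(query, var, value):
    """Scope-based identity (assumed rule): descend only into subqueries
    that project `var`; otherwise the inner variable is treated as distinct."""
    subs = [
        substitute_scoped(s, var, value) if var in s.projected else s
        for s in query.subqueries
    ]
    return Select(query.projected, _replace(query.triples, var, value), subs)


def count_occurrences(query, term):
    """Count how often `term` appears in triple patterns, at any depth."""
    here = sum(t.count(term) for t in query.triples)
    return here + sum(count_occurrences(s, term) for s in query.subqueries)


# The nested query from the message, in the simplified model.
inner = Select(["?b"], [("?b", "ex:q", "?a")])
middle = Select(["?b"], [("?b", "ex:r", "?a")], [inner])
outer = Select(["?a"], [("?a", "ex:r", "ex:c")], [middle])

by_name = count_occurrences(substitute_by_name(outer, "?a", "ex:v"), "ex:v")
scoped = count_occurrences(substitute_scoped(outer, "?a", "ex:v"), "ex:v")
print(by_name, scoped)
```

The two functions return different counts for the same query and the same
pre-bound value, which is the ambiguity at issue: a definition phrased as
"substitute all occurrences of the variable" does not say which function is
meant.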

> Instead of relying on the textual syntax of SPARQL (which introduces problems
> such as bnodes), could we describe the desired behavior in terms of how Sesame
> does it? I.e. along the lines of "Given an internal representation of a SPARQL
> query (such as Algebra or Query objects in Java) pre-binding has the effect
> that the evaluation of a variable returns the constant of the pre-binding.".

This still depends on variable identity.

>> There is also no indication of when invalid pre-bindings are supposed to be
>> reported or how.
> 
> The Appendix tried to enumerate those cases. I am not sure what else needs to
> be said here. Do you want us to clarify what kind of error needs to be
> reported and when? We don't do that in other places either.

It matters here.

Suppose I write

  SELECT ?a
  WHERE { { ?a ex:b ex:c }
          FILTER ( true || EXISTS { GRAPH ?v { ?a ex:b ex:c } } ) }

and pre-bind ?v with a blank node.  If the error is supposed to be caught at
pre-binding time then there will be no solutions.  If the error is supposed to
be caught at query execution time then there may well be solutions.
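
A rough Python model of the two timings, under stated assumptions: the
function names, the PreBindingError class, and the blank node label _:g are
hypothetical, and reporting an error at pre-binding time is modelled simply as
returning no solutions. Python's short-circuit "or" stands in for the SPARQL
rule that true || error evaluates to true.

```python
class PreBindingError(Exception):
    """Hypothetical error for an invalid pre-bound value."""


def is_blank_node(term):
    # Blank nodes are written with a "_:" label in this sketch.
    return term.startswith("_:")


def exists_in_graph(graph_name, solution):
    # Stand-in for EXISTS { GRAPH ?v { ?a ex:b ex:c } }: a blank node is not
    # a usable graph name, so evaluating this branch raises an error.
    if is_blank_node(graph_name):
        raise PreBindingError("blank node used as graph name")
    return False


def filter_passes(graph_name, solution):
    # Models FILTER ( true || EXISTS { ... } ). The right operand is never
    # evaluated here, matching the SPARQL outcome that true || error is true.
    return True or exists_in_graph(graph_name, solution)


def evaluate_checking_up_front(solutions, graph_name):
    """Invalid pre-binding detected before evaluation: no solutions."""
    if is_blank_node(graph_name):
        return []
    return [s for s in solutions if filter_passes(graph_name, s)]


def evaluate_checking_lazily(solutions, graph_name):
    """Invalid pre-binding detected only if the offending part is evaluated."""
    return [s for s in solutions if filter_passes(graph_name, s)]


# One candidate solution for { ?a ex:b ex:c }, with ?v pre-bound to _:g.
candidates = [{"?a": "ex:x"}]
early = evaluate_checking_up_front(candidates, "_:g")
late = evaluate_checking_lazily(candidates, "_:g")
print(early, late)
```

The same query and the same pre-bound value give an empty result in one model
and a non-empty result in the other, so the specification needs to say which
behaviour is intended.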

>> The appendix on pre-binding should be sent to several SPARQL experts to see if
>> they think that it is reasonable.
> 
> Yes, that should definitely be part of the review. I asked before whether W3C
> has any process to gather such feedback.

I believe that there is a formal way of requesting feedback from other W3C
groups that are currently active.  However, I do not know whether there are
any active groups that are appropriate to ask.

There is a way to require feedback.  The working group can put explicit
conditions on exit gates and requiring feedback could be one that this working
group could use.

> Holger

peter
Received on Tuesday, 22 March 2016 00:02:04 UTC