revised formal objection on pre-binding from Peter F. Patel-Schneider on 2017-05-09 (public-rdf-shapes@w3.org from May 2017)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Tue, 9 May 2017 04:01:36 -0700
To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
Message-ID: <0f5f6ee2-0f43-a7d1-746f-5ab9d6cf62b5@gmail.com>

This is a revised formal objection to the definition of pre-binding approved
by the working group in its meeting of 3 May 2017 and in the SHACL Editor's
Draft current as of 7 May 2017.

Problems with pre-binding have been known to the working group since at
least June of 2015.  There have been multiple definitions of pre-binding in
SHACL.  Each of these definitions has had serious problems.

The current definition of pre-binding has multiple documented problems,
several known from before the transition to candidate recommendation and
still not fixed.  These problems affect the MINUS, SERVICE, VALUES, and BIND
constructs in SPARQL as well as subqueries.  The only change to pre-binding
since candidate recommendation has been to paper over some of its problems
by excluding large numbers of SPARQL queries from SHACL-SPARQL.  These
exclusions remove many useful SPARQL queries from SHACL-SPARQL, including
the SPARQL query that was in the SHACL document to provide an alternative
definition of the semantics of sh:equals.

The current definition of pre-binding also does not produce the results
needed in many places where pre-binding shows up in the SHACL document.
This indicates that there has been no effective review of the current
definition of pre-binding to check whether it does what it is supposed to
do, a very surprising situation for such a central part of SHACL-SPARQL.

If pre-binding is to continue to be a part of SHACL, it needs to be given a
new, suitable definition and then go through a competent internal review
followed by a wide external review before SHACL becomes a candidate
recommendation again.


Here are a few of the uses of pre-binding in the SHACL document as of 7 May
2017 where it produces unsuitable or unexpected results.

The very first use of pre-binding in the document produces completely
unsuitable results.
  SELECT DISTINCT ?this WHERE { BIND ($targetNode AS ?this) }
with the variable targetNode pre-bound to some RDF term will produce a
solution sequence containing a single solution.  That solution will have
the variable this unbound.
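
One way to see this result is the following minimal Python sketch.  It is a
toy model, not an implementation of SHACL or SPARQL: solutions are plain
dicts, the helper name extend is hypothetical, and it assumes that the empty
group in this query is not treated as a basic graph pattern that receives
the pre-bound bindings.

```python
# Toy model: a solution is a dict from variable name to RDF term (a string).
# All names here are hypothetical and chosen only for illustration.

def extend(solutions, var, expr):
    """Model of SPARQL Extend, the algebra form of BIND.

    If evaluating expr raises an error (here: an unbound variable),
    the solution is kept and var is simply left unbound.
    """
    out = []
    for mu in solutions:
        try:
            out.append({**mu, var: expr(mu)})
        except KeyError:  # unbound variable -> expression error
            out.append(dict(mu))
    return out

# The empty group pattern evaluates to one solution with no bindings.
EMPTY_GROUP = [{}]

# The pre-bound value that the query is supposed to see.
prebound = {"targetNode": "ex:Alice"}

# { BIND ($targetNode AS ?this) } has no triple pattern, path, or GRAPH
# pattern, so under the assumption above `prebound` is never joined in.
result = extend(EMPTY_GROUP, "this", lambda mu: mu["targetNode"])
print(result)  # [{}]
```

The single empty dict corresponds to the single solution with the variable
this unbound.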

The next use of pre-binding also produces completely unsuitable results.
  SELECT DISTINCT ?this WHERE {
    ?this rdf:type/rdfs:subClassOf* $targetClass .
    }
with the variable targetClass pre-bound to some RDF term will produce a
solution sequence containing solutions binding the variable this to each
node in the graph that is the subject of a triple in the graph with rdf:type
as predicate.

The second-last use of pre-binding also produces unexpected results.  The
results of
  SELECT DISTINCT $this ?value WHERE {
    $this $PATH ?value .
    FILTER (!isLiteral(?value) || !langMatches(lang(?value), $lang))
    }
are independent of the pre-binding of the variable lang.  This query forms
the basis of one of the SHACL-SPARQL tests, with expected results
dramatically different from the ones actually produced by the current
definition of pre-binding.  SHACL-SPARQL implementations thus are not
implementing pre-binding as currently defined.


Although this is nowhere stated in the SHACL document, it appears that the
intent of the current definition of pre-binding is to make the binding of a
pre-bound variable available everywhere in the query except for in
subqueries that do not project the variable.  The current definition of
pre-binding tries to achieve this by two main mechanisms.  The first
mechanism renames, to fresh variables, those variables in subqueries that
are not projected.  The second mechanism joins pre-bound bindings to all
basic graph patterns, property path expressions, and named graph patterns in
the query.

However, the current definition of pre-binding fails to achieve the goal
stated above.  The bindings of pre-bound variables are sometimes not
available when needed.  This causes problems in a large number of queries,
many of which have already been pointed out.  Also, the pre-bound bindings
are available even in places where none of the variables are in scope.  This
causes problems for MINUS.  The pre-bound bindings are available in
subqueries, which hinders bottom-up evaluation of subqueries.
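
The MINUS problem can be illustrated with a minimal Python sketch.  It is a
toy model only: solutions are plain dicts, the helper names compatible,
join, and minus are hypothetical, and the data is invented for the example.
It models a query of the form { ?s ?p ?o } MINUS { ?x ?y ?z }, projected to
?s and ?x for brevity, where the two sides share no variables.

```python
# Toy model: a solution is a dict from variable name to RDF term (a string).
# All names and data here are hypothetical and chosen only for illustration.

def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def join(omega1, omega2):
    """Model of SPARQL Join: merge every compatible pair of solutions."""
    return [{**a, **b} for a in omega1 for b in omega2 if compatible(a, b)]

def minus(omega1, omega2):
    """Model of SPARQL Minus: drop a left solution only if some right
    solution is compatible with it AND shares at least one variable."""
    return [a for a in omega1
            if all(not compatible(a, b) or not (a.keys() & b.keys())
                   for b in omega2)]

left = [{"s": "ex:a"}, {"s": "ex:b"}]   # solutions of the left pattern
right = [{"x": "ex:c"}]                 # solutions of the right pattern
prebound = [{"this": "ex:focus"}]       # the pre-bound binding

# Plain SPARQL: no shared variables, so MINUS removes nothing.
plain = minus(left, right)
print(plain)    # [{'s': 'ex:a'}, {'s': 'ex:b'}]

# Pre-bound binding joined to the pattern on each side: both sides now
# share the variable this with the same value, so everything is removed.
with_pb = minus(join(left, prebound), join(right, prebound))
print(with_pb)  # []
```

In this sketch, joining the pre-bound binding into both sides changes the
result of MINUS even though the query never mentions the pre-bound variable.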

The recent changes to the pre-binding section of the SHACL document do not
actually change the definition of pre-binding to fix these problems but
instead just exclude many SPARQL queries from SHACL, presumably to keep only
non-problematic queries.  The exclusions are very broad, and eliminate many
queries that do not have any problems for the current definition of
pre-binding.  The exclusions are so broad that they even exclude the example
potential definitional query for sh:EqualsConstraintComponent that was in
the SHACL document.


The solution to the continuing problems with pre-binding is not to exclude
more and more of SPARQL but either to fix pre-binding so that it works
correctly or to eliminate pre-binding from SHACL.  Papering over the
problems with the current definition of pre-binding, even if possible, is
not a suitable solution.

Peter F. Patel-Schneider
Nuance Communications
Received on Tuesday, 9 May 2017 11:02:14 UTC