formal objection on interoperability from Peter F. Patel-Schneider on 2017-05-05 (public-rdf-shapes@w3.org from May 2017)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Fri, 5 May 2017 07:17:50 -0700
To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
Message-ID: <57a34f83-b047-e2b7-d7a0-95a925a13e0c@gmail.com>

This is a formal objection to the current state of interoperability between
SHACL implementations.  Interoperability should be a prime concern of W3C
recommendations and is an absolute requirement for use of SHACL in Nuance.
The current design of SHACL has several severe interoperability problems
that can result in fully conforming SHACL implementations producing
different results on the same inputs without signalling an error or warning.
Because of the complexity of the SHACL syntax, users are likely to
accidentally create shapes graphs where these interoperability problems
occur.


1/ SHACL implementations are allowed to produce different results for shapes
graphs that do not conform to the SHACL syntax.  This source of
interoperability failures was previously reported but the purported remedy
is inadequate and changes to the SHACL syntax made at CR have made it worse.
The most serious interoperability problem here is that different SHACL
implementations may produce different results without signalling an error.

Consider the following shapes graph
  ex:s1 a sh:PropertyShape ;
    sh:targetClass ex:C ;
    sh:path ex:p ;
    sh:datatype xsd:string ;
    sh:pattern "abcde(fg" .
and the following data graph
  ex:i a ex:C ; ex:p 5 .
  ex:j a ex:C ; ex:p "abc" .

One SHACL implementation may silently ignore the shape entirely and report
no violations when validating this shapes graph against this data
graph.  A second implementation may silently treat ill-formed patterns as
failing and report two violations.  A third may silently ignore just
ill-formed patterns and report a violation for ex:i.  All three
implementations are acting in full conformance to the SHACL specification.
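
The three behaviours above can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions: the data graph is
hard-coded as a dictionary, Python's re module stands in for the regular
expression syntax that sh:pattern actually uses, the policy names are
invented for this sketch, and the function reports which focus nodes have
at least one violation rather than individual validation results.

```python
import re

# Data graph from the example: focus node -> values of ex:p.
# A Python int stands in for an xsd:integer literal, a str for xsd:string.
DATA = {"ex:i": [5], "ex:j": ["abc"]}
PATTERN = "abcde(fg"  # ill-formed: unbalanced parenthesis


def pattern_is_well_formed(pattern):
    """Return True if the pattern compiles (Python re used as a stand-in)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def violating_nodes(policy):
    """Return the focus nodes with at least one violation under a policy.

    The policy names are hypothetical labels for the three behaviours:
    "ignore_shape", "pattern_fails" and "ignore_pattern".
    """
    well_formed = pattern_is_well_formed(PATTERN)
    if not well_formed and policy == "ignore_shape":
        return set()  # the whole shape is silently dropped
    bad = set()
    for node, values in DATA.items():
        for value in values:
            if not isinstance(value, str):
                bad.add(node)  # sh:datatype xsd:string is violated
            if well_formed:
                if not re.search(PATTERN, str(value)):
                    bad.add(node)
            elif policy == "pattern_fails":
                bad.add(node)  # ill-formed pattern counts as a failure
            # "ignore_pattern": only the pattern check is skipped
    return bad


for policy in ("ignore_shape", "pattern_fails", "ignore_pattern"):
    print(policy, sorted(violating_nodes(policy)))
```

The three policies yield no nodes, both ex:i and ex:j, and ex:i alone,
respectively, and none of them signals that the pattern is ill-formed.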

The interoperability failure for shapes graphs that do not conform to the
SHACL syntax is made worse by the unnecessary complexity of the SHACL
syntax.  There are many shapes graphs that make sense for SHACL but do not
conform to the SHACL syntax.  For example, RDFS comments are prohibited in
certain SHACL constructs, so the following shapes graph does not conform to
the SHACL syntax
  ex:comments a sh:PropertyShape ;
    sh:path [ rdfs:comment "inverse of ex:p" ;
           sh:inversePath ex:p ] ;
    sh:class ex:C .
Different fully conforming SHACL implementations can produce different
results on this shapes graph.  Because of syntax issues like this users are
very likely to write shapes graphs where fully conforming SHACL
implementations can produce different violations.

2/ SHACL Core implementations are required to not signal an error or warning
when SHACL-SPARQL constructs appear in the shapes graph and to silently
ignore such constructs.

Consider the following well-formed shapes graph
  ex:s1 a sh:NodeShape ;
    sh:targetClass ex:C ;
    sh:class ex:D ;
    sh:sparql "SELECT $this WHERE { }" .
and the following data graph
  ex:i a ex:C .
  ex:j a ex:C , ex:D .

A SHACL Core implementation always reports only one violation when
validating this shapes graph against this data graph.  A SHACL-SPARQL
implementation is required to produce more than one violation.  There are
also shapes where a SHACL Core implementation produces more violations than
a SHACL-SPARQL implementation.  Again there is silent disagreement between
fully-conformant SHACL implementations.

3/ The behaviour of SHACL implementations on recursive shapes is undefined.
SHACL implementations that do not handle recursive shapes do not need to
signal when they encounter recursive shapes.

Consider the following shapes graph
  ex:s1 a sh:PropertyShape ;
    sh:targetClass ex:C ;
    sh:path ex:p ;
    sh:property ex:s1 ;
    sh:class ex:D .
and the following data graph
  ex:i a ex:C, ex:D ; ex:p ex:i .
  ex:j a ex:C .

A SHACL Core implementation that does not handle recursive shapes may
silently ignore this shape and report no violations.  A SHACL Core
implementation that does handle recursive shapes may report a violation for
ex:i.
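
Detecting that a shapes graph is recursive, so that it can be signalled
rather than silently ignored, is not hard.  A minimal sketch, assuming the
shapes graph is given as plain (subject, predicate, object) tuples and that
only the predicates listed in SHAPE_REFERENCES count as references from one
shape to another (a real implementation would also have to follow
list-valued constructs such as sh:and and sh:or):

```python
# Triples of the example shapes graph, as plain tuples.
SHAPES = [
    ("ex:s1", "rdf:type", "sh:PropertyShape"),
    ("ex:s1", "sh:targetClass", "ex:C"),
    ("ex:s1", "sh:path", "ex:p"),
    ("ex:s1", "sh:property", "ex:s1"),
    ("ex:s1", "sh:class", "ex:D"),
]

# Assumption: only these predicates are treated as shape references here.
SHAPE_REFERENCES = {"sh:property", "sh:node", "sh:not"}


def recursive_shapes(triples):
    """Return the shapes that can reach themselves via shape references."""
    edges = {}
    for s, p, o in triples:
        if p in SHAPE_REFERENCES:
            edges.setdefault(s, set()).add(o)

    def reaches_itself(start):
        # Iterative depth-first search over the reference edges.
        seen, stack = set(), list(edges.get(start, ()))
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(edges.get(node, ()))
        return False

    return {shape for shape in edges if reaches_itself(shape)}


print(recursive_shapes(SHAPES))
```

On the example above this reports ex:s1, which is exactly the information
an implementation needs in order to signal that it has been given a
recursive shape.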


The syntax of SHACL is complex.  It will be hard for users to determine
whether a shapes graph they create is ill-formed, and thus can trigger
interoperability problems, or contains some of the other sources of
interoperability problems.  It is easy to create shapes graphs that
trigger interoperability problems.  Mechanisms for alleviating these
interoperability problems thus need to be a part of SHACL.

The creation of a shapes graph that does partial syntax checking of SHACL
Core shapes does not fix these interoperability problems.  The shapes graph
is not adequate for complete syntax checking of SHACL Core.  It is missing
some of the most important, and hardest for users to check, aspects of
SHACL, including patterns, circular lists and paths, and recursive shapes.
It does not do any checking at all of SHACL-SPARQL.


The solution to these serious failures of interoperability is quite simple.
SHACL implementations need to be required to have a mode (like strict mode
in many compilers) where they signal whenever they are given a shapes graph
that contains ill-formed SHACL constructs or SHACL constructs that they do
not handle.  The coding cost of this detection is quite minimal, at least
for SHACL Core: about sixty lines of Python code suffice to implement full
detection of violations of the SHACL Core syntax, over and above what is
needed to ensure proper operation of a SHACL Core implementation.  If the
run-time cost of this
detection is too high for shapes graphs that are used over and over, SHACL
implementations could provide a deployment mode where this check is not
performed.
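
As an illustration of the shape such a mode could take, here is a minimal
sketch.  It is not the sixty-line checker mentioned above and covers only
two of the problems discussed: ill-formed patterns and constructs the
implementation does not handle.  The names ShapesGraphError, strict_check,
validate and UNSUPPORTED are invented for this sketch, the shapes graph is
assumed to be a list of (subject, predicate, object) tuples, and Python's
re module again stands in for the regular expression syntax of sh:pattern.

```python
import re


class ShapesGraphError(Exception):
    """Raised in strict mode when the shapes graph has problems."""


# Assumption: a Core-only implementation that does not handle sh:sparql.
UNSUPPORTED = {"sh:sparql"}

EXAMPLE = [
    ("ex:s1", "sh:targetClass", "ex:C"),
    ("ex:s1", "sh:pattern", "abcde(fg"),
    ("ex:s1", "sh:sparql", "SELECT $this WHERE { }"),
]


def strict_check(triples, unsupported=UNSUPPORTED):
    """Return a list of human-readable problems found in the shapes graph."""
    problems = []
    for s, p, o in triples:
        if p == "sh:pattern":
            try:
                re.compile(o)
            except re.error as exc:
                problems.append(f"{s}: ill-formed pattern {o!r} ({exc})")
        elif p in unsupported:
            problems.append(f"{s}: unsupported construct {p}")
    return problems


def validate(triples, strict=True):
    """Check the shapes graph; raise in strict mode, report otherwise."""
    problems = strict_check(triples)
    if strict and problems:
        raise ShapesGraphError("; ".join(problems))
    # Validation proper would go here; it is omitted in this sketch.
    return problems
```

Calling validate(EXAMPLE) raises ShapesGraphError, while
validate(EXAMPLE, strict=False) returns the list of problems without
raising.  A deployment mode that skips the check entirely would simply
not call strict_check.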
Received on Friday, 5 May 2017 14:18:26 UTC