Re: Formal objection on syntax checking in SHACL

Peter,

Given well-formed shapes, all SHACL processors will reach the same conclusion as to the validity of data. Thus, all well-formed shapes are interoperable.

The WG does agree that syntactic checking of shapes graphs is a useful thing to have. The WG does not believe that having it is absolutely necessary for the first release of SHACL, nor do we want to mandate implementing it as a compliance requirement at this point. We have started a list of such useful “future work” items to be undertaken by a community group or a future working group: <https://www.w3.org/2014/data-shapes/wiki/Postponed>. Standards are living things. They can and should evolve with use and experience.

Further, you seem to be taking a leap in assuming that a syntax checking tool would not become available unless syntax checking were required by the standard. For example, you have already started such an implementation. We hope and believe that other community members may do the same.

Regards,

Irene

> On Feb 22, 2017, at 11:02 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> This is a formal objection to the decision of the RDF Data Shapes working
> group to close ISSUE-233 without requiring that all SHACL implementations
> check for correct SHACL syntax.
> 
> 
> SHACL implementations need to provide an interface that signals whether the
> shapes graph argument to validation conforms to the syntactic requirements
> of SHACL Core, for SHACL Core implementations, or SHACL-SPARQL, for
> SHACL-SPARQL implementations.
> 
> For example, it is currently possible for SHACL implementations to implement
> the following shape as requiring that all SHACL instances of ex:C1 be any of
> ex:i1, ex:i2, ex:i3; or as requiring that all SHACL instances of ex:C1 be
> either ex:i1 or ex:i2; or by signalling a syntax error; or indeed by any
> behaviour whatever.
> 
> se:s1 rdf:type sh:NodeShape ;
>  sh:targetClass ex:C1 ;
>  sh:in _:b1 .
> _:b1 rdf:first ex:i1 .
> _:b1 rdf:rest _:b2 .
> _:b2 rdf:first ex:i2 .
> _:b1 rdf:rest _:b3 .
> _:b3 rdf:first ex:i3 .
> _:b3 rdf:rest rdf:nil .
> 
> The first of these permissible behaviours is the same as the required
> behaviour on the following shapes graph
> 
> se:s1 rdf:type sh:NodeShape ;
>  sh:targetClass ex:C1 ;
>  sh:in _:b1 .
> _:b1 rdf:first ex:i1 .
> _:b1 rdf:rest _:b2 .
> _:b2 rdf:first ex:i2 .
> _:b2 rdf:rest _:b3 .
> _:b3 rdf:first ex:i3 .
> _:b3 rdf:rest rdf:nil .
> 
> Users will thus have no tool to determine whether their shapes graphs are
> syntactically valid and thus whether their shapes graphs will be processed
> the same in different SHACL implementations.
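
A minimal sketch of the divergence described above, in plain Python with no RDF library. Triples are tuples of strings, and the two reader functions are hypothetical stand-ins for lenient implementations, not code from any actual SHACL processor. Both are run over the malformed list in which _:b1 has two rdf:rest values.

```python
# Malformed list from the first shapes graph: _:b1 has two rdf:rest values.
BAD = [
    ("_:b1", "rdf:first", "ex:i1"),
    ("_:b1", "rdf:rest", "_:b2"),
    ("_:b2", "rdf:first", "ex:i2"),
    ("_:b1", "rdf:rest", "_:b3"),
    ("_:b3", "rdf:first", "ex:i3"),
    ("_:b3", "rdf:rest", "rdf:nil"),
]

def objects(triples, s, p):
    """All objects o such that (s, p, o) is in triples, in input order."""
    return [o for (s2, p2, o) in triples if s2 == s and p2 == p]

def read_following_first_rest(triples, head):
    """Hypothetical lenient reader: follows only the first rdf:rest it finds."""
    out, node, seen = [], head, set()
    while node != "rdf:nil" and node not in seen:
        seen.add(node)
        out.extend(objects(triples, node, "rdf:first")[:1])
        rest = objects(triples, node, "rdf:rest")
        if not rest:
            break  # dangling list cell: silently stop
        node = rest[0]
    return out

def read_following_all_rests(triples, head):
    """Hypothetical lenient reader: follows every rdf:rest link it finds."""
    out, todo, seen = [], [head], set()
    while todo:
        node = todo.pop(0)
        if node == "rdf:nil" or node in seen:
            continue
        seen.add(node)
        out.extend(objects(triples, node, "rdf:first"))
        todo.extend(objects(triples, node, "rdf:rest"))
    return out

print(read_following_first_rest(BAD, "_:b1"))  # ['ex:i1', 'ex:i2']
print(read_following_all_rests(BAD, "_:b1"))   # ['ex:i1', 'ex:i2', 'ex:i3']
```

Neither reader signals a problem, and the two disagree on the members of the list, which is the situation the example shapes graph is meant to illustrate.
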
> 
> 
> Therefore, if there is not a requirement for complete syntax checking,
> interoperability will be severely compromised.  Users of SHACL
> implementations that do not signal whether the shapes graph conforms to the
> syntactic requirements of SHACL Core or SHACL-SPARQL will have no tool for
> determining whether their shapes graphs, whether they conform to SHACL
> syntactic requirements or not, can be interoperably processed by other SHACL
> implementations.
> 
> Optional signalling of whether the shapes graph conforms to SHACL syntactic
> requirements, as in the resolution of ISSUE-233, is in no way a replacement
> for the requirement that all SHACL implementations check for correct SHACL
> syntax.  Users need to be able to know whether their shapes graph conforms
> to SHACL syntactic requirements or not.
> 
> 
> Implementing a complete check for SHACL Core is quite easy.  SHACL Core
> implementations will of necessity have to check most of the syntactic
> requirements of SHACL Core anyway.  For example, an implementation
> will have to check whether a property path refers to itself, or it will go
> into an infinite loop.
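
A rough sketch of the self-reference check mentioned above, in plain Python with triples as tuples of strings. The set of linking predicates is an assumption based on the SHACL path vocabulary and may not be exhaustive, and the function name is made up for illustration.

```python
# Predicates through which one path node can reference another.
# Assumed from the SHACL path syntax; treat the list as illustrative.
PATH_LINKS = {
    "sh:inversePath", "sh:alternativePath", "sh:zeroOrMorePath",
    "sh:oneOrMorePath", "sh:zeroOrOnePath", "rdf:first", "rdf:rest",
}

def path_refers_to_itself(triples, start):
    """Return True if `start` can reach itself through PATH_LINKS."""
    todo = [o for (s, p, o) in triples if s == start and p in PATH_LINKS]
    seen = set()
    while todo:
        node = todo.pop()
        if node == start:
            return True
        if node in seen:
            continue  # already explored, avoids looping on other cycles
        seen.add(node)
        todo.extend(o for (s, p, o) in triples
                    if s == node and p in PATH_LINKS)
    return False

cyclic = [
    ("_:p", "sh:inversePath", "_:q"),
    ("_:q", "sh:zeroOrMorePath", "_:p"),
]
fine = [("_:p", "sh:inversePath", "ex:knows")]

print(path_refers_to_itself(cyclic, "_:p"))  # True
print(path_refers_to_itself(fine, "_:p"))    # False
```

The `seen` set is what keeps the check itself from looping, so an implementation that walks paths this way gets the syntax check almost for free.
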
> 
> The major place where implementations can get away without complete SHACL
> Core syntax checking is SHACL lists, as a simple SPARQL query can retrieve
> elements of SHACL lists without doing complete syntax checking for SHACL
> lists.  However, it is quite easy to implement complete syntax checking for
> SHACL lists, for example by the following Python program (that uses rdflib)
> 
> from rdflib import RDF
> 
> class SHACLSyntaxError(Exception) :
>    def __init__(self,message) :
>        super().__init__(message)
>        self.message = message
> 
> # Number of values that subject has for predicate in graph g
> def countValues(g,subject,predicate) :
>    return sum( 1 for _ in g.objects(subject,predicate) )
> 
> # Return the elements of the list starting at node, or raise
> # SHACLSyntaxError.  visited holds the list nodes already traversed.
> def wflist(g,node,elements=None,visited=None) :
>    elements = [] if elements is None else elements
>    visited = [] if visited is None else visited
>    if ( node in visited ) : raise SHACLSyntaxError("Circular list")
>    if ( node == RDF.nil ) :
>        if ( (RDF.nil,RDF.first,None) in g ) :
>            raise SHACLSyntaxError("rdf:nil has rdf:first")
>        if ( (RDF.nil,RDF.rest,None) in g ) :
>            raise SHACLSyntaxError("rdf:nil has rdf:rest")
>        return elements
>    if ( countValues(g,node,RDF.first) != 1 ) :
>        raise SHACLSyntaxError("List element has wrong number of values for rdf:first")
>    if ( countValues(g,node,RDF.rest) != 1 ) :
>        raise SHACLSyntaxError("List element has wrong number of values for rdf:rest")
>    return wflist(g,g.value(node,RDF.rest),
>                  elements + [g.value(node,RDF.first)],visited + [node])
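
For readers without rdflib at hand, the same list checks can be sketched iteratively over plain tuples. This is an illustrative variant under stated assumptions, not the program above: `ListSyntaxError`, `check_list` and the string-based triples are names and representations invented here.

```python
class ListSyntaxError(Exception):
    """Raised when a node is not the head of a well-formed RDF list."""

def objects(triples, s, p):
    """All objects o such that (s, p, o) is in triples, in input order."""
    return [o for (s2, p2, o) in triples if s2 == s and p2 == p]

def check_list(triples, head):
    """Return the members of the list at `head`, or raise ListSyntaxError."""
    members, visited, node = [], set(), head
    while node != "rdf:nil":
        if node in visited:
            raise ListSyntaxError("circular list at " + node)
        visited.add(node)
        firsts = objects(triples, node, "rdf:first")
        rests = objects(triples, node, "rdf:rest")
        if len(firsts) != 1:
            raise ListSyntaxError("%s has %d rdf:first values" % (node, len(firsts)))
        if len(rests) != 1:
            raise ListSyntaxError("%s has %d rdf:rest values" % (node, len(rests)))
        members.append(firsts[0])
        node = rests[0]
    # rdf:nil itself must not carry rdf:first or rdf:rest
    if objects(triples, "rdf:nil", "rdf:first") or objects(triples, "rdf:nil", "rdf:rest"):
        raise ListSyntaxError("rdf:nil has rdf:first or rdf:rest")
    return members

# The well-formed and malformed lists from the two example shapes graphs.
GOOD = [
    ("_:b1", "rdf:first", "ex:i1"), ("_:b1", "rdf:rest", "_:b2"),
    ("_:b2", "rdf:first", "ex:i2"), ("_:b2", "rdf:rest", "_:b3"),
    ("_:b3", "rdf:first", "ex:i3"), ("_:b3", "rdf:rest", "rdf:nil"),
]
BAD = [
    ("_:b1", "rdf:first", "ex:i1"), ("_:b1", "rdf:rest", "_:b2"),
    ("_:b2", "rdf:first", "ex:i2"), ("_:b1", "rdf:rest", "_:b3"),
    ("_:b3", "rdf:first", "ex:i3"), ("_:b3", "rdf:rest", "rdf:nil"),
]

print(check_list(GOOD, "_:b1"))  # ['ex:i1', 'ex:i2', 'ex:i3']
try:
    check_list(BAD, "_:b1")
except ListSyntaxError as err:
    print(err)                   # _:b1 has 2 rdf:rest values
```

Tracking visited list nodes, rather than list members, is what makes the circularity test reliable when the same value legitimately appears twice in a list.
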
> 
> Implementing a complete syntax check for SHACL-SPARQL is more difficult,
> but the structural requirements are not too hard and the SPARQL syntax
> checks can be done by a SPARQL implementation, which a SHACL-SPARQL
> implementation needs to use anyway.
> 
> 
> It is not necessary that SHACL implementations always check for correct
> SHACL syntax, just that they can be made to do so.  If a SHACL
> implementation finds it too computationally expensive to always check for
> correct SHACL syntax it could provide an "unsafe" interface that does not do
> complete syntax checking.
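
One way such a pair of interfaces might look, as a hedged sketch only. The `validate` function, its `unsafe` flag, and the two callables passed in are all hypothetical; no SHACL implementation is claimed to expose this API.

```python
class ShapesSyntaxError(Exception):
    """Raised when the shapes graph fails the syntax check."""

def validate(data_graph, shapes_graph, syntax_check, run_validation,
             unsafe=False):
    """Hypothetical entry point: checks shapes syntax unless unsafe=True.

    syntax_check(shapes_graph) is assumed to return a list of problem
    descriptions (empty when the shapes graph is well formed), and
    run_validation(data_graph, shapes_graph) to return a report.
    """
    if not unsafe:
        problems = syntax_check(shapes_graph)
        if problems:
            raise ShapesSyntaxError("; ".join(problems))
    return run_validation(data_graph, shapes_graph)

# Stub callables standing in for a real checker and validator.
def fake_check(shapes):
    return ["_:b1 has 2 rdf:rest values"] if shapes == "bad" else []

def fake_run(data, shapes):
    return "report"

print(validate("data", "good", fake_check, fake_run))              # report
print(validate("data", "bad", fake_check, fake_run, unsafe=True))  # report
```

With the default `unsafe=False`, the call on the "bad" shapes graph raises `ShapesSyntaxError` instead of returning a report, so the checked path is what a user gets unless they opt out explicitly.
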
> 
> 
> Complete syntax checking is neither difficult nor expensive.  Not requiring
> complete syntax checking produces severe interoperability problems.
> Complete syntax checking thus needs to be a requirement on all SHACL
> implementations.
> 
> 
> Peter F. Patel-Schneider
> Nuance Communications
> 

Received on Thursday, 23 February 2017 21:53:14 UTC