fundamental problems with SHACL

So here are some fundamental problems that I currently see in SHACL.


The meaning of SHACL is not well defined.  It depends critically on both
pre-binding and sh:hasShape, both of which have significant problems.

I had thought that pre-binding was the easy one.  To do pre-binding you
first need to extend SPARQL so that blank nodes can be used in SPARQL
queries, i.e., so that if you have access to an RDF graph you can extract
identifiers from that graph and use them in a SPARQL query just as if they
were IRIs.  Then pre-binding just augments the (outer) SPARQL query with a
VALUES construct that binds variables to values.
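
A minimal sketch of this reading of pre-binding, in Python.  It is only an
illustration, not the definition in the SHACL document.  The function name
prebind and its arguments are hypothetical; the values are assumed to be
already serialized as SPARQL terms, and the pattern is assumed to be the body
of the outer WHERE clause.

```python
def prebind(pattern: str, bindings: dict[str, str]) -> str:
    """Build an outer SELECT whose WHERE group starts with a VALUES block.

    pattern  -- body of the outer group graph pattern (assumed well formed)
    bindings -- variable name (without '?') mapped to a serialized term
    """
    if not bindings:
        # Nothing to pre-bind: just wrap the pattern.
        return f"SELECT * WHERE {{ {pattern} }}"
    names = " ".join(f"?{name}" for name in bindings)
    row = " ".join(bindings.values())
    # One VALUES row binds every pre-bound variable at once.
    return f"SELECT * WHERE {{ VALUES ({names}) {{ ({row}) }} {pattern} }}"


query = prebind("?this ex:p ?o .", {"this": "<http://example.org/a>"})
print(query)
```

Passing a blank node label such as "_:b0" as a value produces a query that
standard SPARQL 1.1 does not accept, because blank nodes are not allowed in
VALUES data.  That is where the extension described above is needed.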

However, apparently this is not the case, as the current document makes
pre-binding out to be something quite different.  I do not have the
expertise to fix all the problems with the treatment of pre-binding in the
current document but I have pointed out a number of problems in it.

As far as I can tell, sh:hasShape has never had a correct definition in the
document.  It has severe problems relating to recursion, which I have
pointed out, and it is still described as if arbitrary recursion were part
of SHACL.

There are other recent problems with the meaning of SHACL.  I recently
pointed out one of them having to do with nodes in a shape graph that have
rdf:type links to both sh:PropertyConstraint and
sh:InversePropertyConstraint.
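
To make the problematic case concrete, here is a small Python sketch that
finds such nodes.  It assumes the shapes graph is given as a collection of
(subject, predicate, object) string triples, and it looks only at explicit
rdf:type triples.  It does not perform any of the inference that SHACL
applies to shapes graphs, so it is not a complete check.  The function name
doubly_typed is made up for the example.

```python
RDF_TYPE = "rdf:type"
PROPERTY = "sh:PropertyConstraint"
INVERSE = "sh:InversePropertyConstraint"


def doubly_typed(triples):
    """Return the nodes explicitly typed with both constraint classes."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # A node is problematic when both classes appear among its types.
    return sorted(n for n, ts in types.items() if {PROPERTY, INVERSE} <= ts)


shapes_graph = [
    ("_:c1", RDF_TYPE, PROPERTY),
    ("_:c1", RDF_TYPE, INVERSE),
    ("_:c2", RDF_TYPE, PROPERTY),
]
print(doubly_typed(shapes_graph))
```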


The syntax of SHACL is not well defined.

The current solution to the problems with nodes that belong to both
sh:PropertyConstraint and sh:InversePropertyConstraint is to make them
illegal syntax.  However, this is quite tricky, as SHACL performs several
kinds of inference on shapes graphs.  Several partial fixes for determining
whether a node is a legal value for sh:Property, sh:InverseProperty, or
sh:Constraint have been proposed, but all of them have been incomplete and
not well founded.

None of these fixes has attacked the underlying problem, which is that the
syntactic category of a constraint node is partly based on rdf:type links of
that node and partly based on how that node fits into a shape.  This split
in syntactic determination makes for a complex, error-prone, and
hard-to-understand syntax.


There are other problems with the syntax that may not be individually
fundamental, but together are quite significant.

Lists are used in various places in the syntax.  Several constraint
components have lists as values of their main property.  However, there is
no definition in the document as to what makes a valid list, or even any
definition of what constitutes the members of a list.
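
As an illustration of the kind of definition that is missing, here is one
possible notion of a well-formed list, sketched in Python.  This is my
assumption of a reasonable definition, not anything taken from the document:
a list node is either rdf:nil or has exactly one rdf:first and exactly one
rdf:rest value, the rdf:rest chain ends in rdf:nil, and no node is visited
twice.  The members are the rdf:first values in order.  The graph is assumed
to be a collection of (subject, predicate, object) string triples, and the
name list_members is hypothetical.

```python
NIL, FIRST, REST = "rdf:nil", "rdf:first", "rdf:rest"


def list_members(triples, head):
    """Return the members of the list at head, or None if it is malformed."""
    members, seen, node = [], set(), head
    while node != NIL:
        if node in seen:
            return None  # cyclic rdf:rest chain
        seen.add(node)
        firsts = [o for s, p, o in triples if s == node and p == FIRST]
        rests = [o for s, p, o in triples if s == node and p == REST]
        if len(firsts) != 1 or len(rests) != 1:
            return None  # missing or repeated rdf:first / rdf:rest
        members.append(firsts[0])
        node = rests[0]
    return members


good = [
    ("_:l1", FIRST, "ex:a"), ("_:l1", REST, "_:l2"),
    ("_:l2", FIRST, "ex:b"), ("_:l2", REST, NIL),
]
print(list_members(good, "_:l1"))
```

Under this definition a node with two rdf:first values, a chain that never
reaches rdf:nil, and a chain that loops back on itself are all rejected.
Other definitions are possible, which is the point: the document needs to
pick one.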

The syntax has several unnecessary restrictions.  It is not possible to
repeat properties in constraints (but it is almost necessary to repeat
properties in shapes).  Constraints and shapes are different, leading to
verbose syntax, even for an RDF encoding.


peter

Received on Thursday, 7 April 2016 21:15:52 UTC