- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Thu, 7 Apr 2016 14:15:22 -0700
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
So here are some fundamental problems that I currently see in SHACL.

The meaning of SHACL is not well defined. It importantly depends on both pre-binding and sh:hasShape, both of which have significant problems.

I had thought that pre-binding was the easy one. To do pre-binding you first need to extend SPARQL so that blank nodes can be used in SPARQL queries, i.e., that if you have access to an RDF graph you can extract identifiers from that graph and use these identifiers in a SPARQL query just as if they were IRIs. Then pre-binding just augments the (outer) SPARQL query with a VALUES construct that binds variables to values. However, apparently this is not the case, as the current document makes pre-binding out to be something quite different. I do not have the expertise to fix all the problems with the treatment of pre-binding in the current document but I have pointed out a number of problems in it.

As far as I can tell, sh:hasShape has never had a correct definition in the document. It has severe problems relating to recursion, which I pointed out, and is still described as if arbitrary recursion is part of SHACL.

There are other recent problems with the meaning of SHACL. I recently pointed out one of them having to do with nodes in a shape graph that have rdf:type links to both sh:PropertyConstraint and sh:InversePropertyConstraint.

The syntax of SHACL is not well defined. The current solution to the problems with nodes that belong to both sh:PropertyConstraint and sh:InversePropertyConstraint is to make them illegal syntax. However, this is quite tricky as SHACL performs several kinds of inference on shapes graphs. Several partial fixes for determining whether a node is a legal value for sh:Property, sh:InverseProperty, or sh:Constraint have been proposed, but all of them have been incomplete and not well founded.
None of these fixes have attacked the underlying problem, which is that the syntactic category of a constraint node is partly based on rdf:type links of that node and partly based on how that node fits into a shape. This split in syntactic determination makes for a complex, error-prone, and hard-to-understand syntax.

There are other problems with the syntax that may not be individually fundamental, but together are quite significant.

Lists are used in various places in the syntax. Several constraint components have lists as values of their main property. However, there is no definition in the document as to what makes a valid list, or even any definition of what constitutes the members of a list.

The syntax has several unnecessary restrictions. It is not possible to repeat properties in constraints (but it is almost necessary to repeat properties in shapes). Constraints and shapes are different, leading to verbose syntax, even for an RDF encoding.

peter
Received on Thursday, 7 April 2016 21:15:52 UTC