Re: fundamental problems with SHACL from Holger Knublauch on 2016-04-08 (public-data-shapes-wg@w3.org from April 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 8 Apr 2016 10:20:41 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <5706F959.7010506@topquadrant.com>

Hi Peter,

I believe you are repeating similar things over and over again. Is there 
a reason for that, other than reminding the group of some perceived 
urgency?

On 8/04/2016 7:15, Peter F. Patel-Schneider wrote:
> So here are some fundamental problems that I currently see in SHACL.
>
>
> The meaning of SHACL is not well defined.  It importantly depends on both
> pre-binding and sh:hasShape, both of which have significant problems.

None of these are IMHO significant. It's just editorial work that needs 
to happen, and we all have day jobs in parallel.

>
> I had thought that pre-binding was the easy one.  To do pre-binding you
> first need to extend SPARQL so that blank nodes can be used in SPARQL
> queries, i.e., that if you have access to an RDF graph you can extract
> identifiers from that graph and use these identifiers in a SPARQL query just
> as if they were IRIs.  Then pre-binding just augments the (outer) SPARQL
> query with a VALUES construct that binds variables to values.
>
> However, apparently this is not the case, as the current document makes
> pre-binding out to be something quite different.  I do not have the
> expertise to fix all the problems with the treatment of pre-binding in the
> current document but I have pointed out a number of problems in it.

This is ISSUE-68. I tried various ways of responding to your concerns, 
but you were not happy with any of them. And I agree this is work in 
progress. I would like to be able to finish this once and for all, but 
other things always pop up in between. You are raising many other ISSUEs 
including a full-blown counter proposal that would replace basically 
everything, and at the same time put pressure on me to not do my 
homework. It shouldn't come as a surprise that I never have time if I am 
forced to spend my time responding to all your other issues. Meanwhile, 
nobody else in the group steps up to this task either. The last time I 
looked into pre-binding a few weeks ago, I was experimenting with the 
syntax transform package in Jena. I found a bug that had to be fixed 
first, halting my progress:

https://github.com/apache/jena/commit/bc5ace0e9460ae979079532f610a88b6363e96e5

I then went on vacation and had plenty of other TopQuadrant work on my 
plate. I will try to get back to this topic soon.

At the same time I still do not understand your problem with the 
semantics of pre-binding. Simply using VALUES is not going to work, 
because we need to be able to walk into nested scopes and even nested 
SELECT queries. I had explained this before. Not sure why you keep 
repeating the same issue.
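
To illustrate the point about nested scopes, here is a minimal sketch in 
Python that only manipulates query text. The function names, the ex: 
prefix, and the sample query are hypothetical, and the regex-based 
substitution is a naive stand-in: it is neither the Jena syntax transform 
nor the definition in the SHACL draft. It contrasts appending VALUES to the 
outer query with substituting the pre-bound variable everywhere, including 
inside a nested SELECT that does not project that variable.

```python
import re


def prebind_by_outer_values(query: str, bindings: dict[str, str]) -> str:
    """Append a VALUES block to the outermost query only (hypothetical helper)."""
    names = " ".join(f"?{name}" for name in bindings)
    terms = " ".join(bindings.values())
    return f"{query.rstrip()}\nVALUES ({names}) {{ ({terms}) }}"


def prebind_by_substitution(query: str, bindings: dict[str, str]) -> str:
    """Replace every ?var / $var token with its pre-bound term.

    Naive assumption: variables never occur inside string literals or IRIs.
    """
    def swap(match: re.Match) -> str:
        # Leave variables that have no pre-bound value untouched.
        return bindings.get(match.group(1), match.group(0))

    return re.sub(r"[?$]([A-Za-z_][A-Za-z0-9_]*)", swap, query)


# The inner SELECT does not project ?this, so in SPARQL its ?this is a
# separate variable from the outer ?this.
QUERY = """SELECT ?n WHERE {
  ?this a ex:Person .
  { SELECT (COUNT(?child) AS ?n) WHERE { ?this ex:child ?child } }
}"""

bindings = {"this": "ex:Alice"}
with_values = prebind_by_outer_values(QUERY, bindings)
substituted = prebind_by_substitution(QUERY, bindings)

print(with_values)
print(substituted)
```

In the VALUES variant the inner pattern still reads "?this ex:child ?child", 
so the subquery counts children of every subject. In the substituted variant 
the inner pattern reads "ex:Alice ex:child ?child".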

>
> As far as I can tell, sh:hasShape has never had a correct definition in the
> document.  It has severe problems relating to recursion, which I pointed
> out, and is still described as if arbitrary recursion is part of SHACL.

This is ISSUE-131 which I have addressed today. We should continue 
discussion on that thread:

https://lists.w3.org/Archives/Public/public-data-shapes-wg/2016Apr/0026.html

>
> There are other recent problems with the meaning of SHACL.  I recently
> pointed out one of them having to do with nodes in a shape graph that have
> rdf:type links to both sh:PropertyConstraint and
> sh:InversePropertyConstraint.
>
>
> The syntax of SHACL is not well defined.
>
> The current solution to the problems with nodes that belong to  both
> sh:PropertyConstraint and sh:InversePropertyConstraint is to make them
> illegal syntax.  However, this is quite tricky as SHACL performs several
> kinds of inference on shapes graphs.  Several partial fixes for determining
> whether a node is a legal value for sh:Property, sh:InverseProperty, or
> sh:Constraint have been proposed, but all of them have been incomplete and
> not well founded.

This is ISSUE-134. Again, we already have several threads open for that 
topic and I will get to this in due course. I don't find it helpful to 
have yet another email thread with yet more of the same here.

Overall all this just serves to give the impression that there are 
countless problems, while on closer examination each individual issue is 
quite solvable.

>
> None of these fixes have attacked the underlying problem which is that the
> syntactic category of a constraint node is partly based on rdf:type links of
> that node and partly based on how that node fits into a shape.  This split
> in syntactic determination makes for a complex, error-prone, and hard to
> understand syntax.
>
>
> There are other problems with the syntax that may not be individually
> fundamental, but together are quite significant.
>
> Lists are used in various places in the syntax.  Several constraint
> components have lists as values of their main property.  However, there is
> no definition in the document as to what makes a valid list, or even any
> definition of what constitutes the members of a list.

Hmmm, isn't it clear that we are talking about rdf:Lists, and then of 
course the usual rdf:List syntax from the existing specs will be used. 
Why do we need to repeat any of this in the SHACL spec? It would be like 
explaining the meaning of the various XSD datatypes...
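
For concreteness, one possible reading of "valid list" and "members of a 
list" can be sketched as follows. This is an assumption for illustration, 
not text from the SHACL draft or the RDF specs: the function name is 
hypothetical, the graph is modeled as a plain set of (subject, predicate, 
object) tuples, and "well-formed" is taken to mean that every list node has 
exactly one rdf:first and one rdf:rest, and that the chain reaches rdf:nil 
without a cycle.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def list_members(triples, head):
    """Return the members of the list starting at head, or None if ill-formed."""
    members, seen, node = [], set(), head
    while node != NIL:
        if node in seen:
            return None  # the rdf:rest chain loops back on itself
        seen.add(node)
        firsts = [o for s, p, o in triples if s == node and p == FIRST]
        rests = [o for s, p, o in triples if s == node and p == REST]
        if len(firsts) != 1 or len(rests) != 1:
            return None  # missing or duplicated rdf:first / rdf:rest
        members.append(firsts[0])
        node = rests[0]
    return members


good = {
    ("_:a", FIRST, "ex:x"), ("_:a", REST, "_:b"),
    ("_:b", FIRST, "ex:y"), ("_:b", REST, NIL),
}
two_firsts = good | {("_:a", FIRST, "ex:z")}
cyclic = {("_:a", FIRST, "ex:x"), ("_:a", REST, "_:a")}

print(list_members(good, "_:a"))
print(list_members(two_firsts, "_:a"))
print(list_members(cyclic, "_:a"))
```

Under these assumptions the first graph yields its two members in order, 
while the graph with two rdf:first values and the cyclic graph are both 
rejected.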

>
> The syntax has several unnecessary restrictions.  It is not possible to
> repeat properties in constraints (but it is almost necessary to repeat
> properties in shapes).

This is ISSUE-133 for which we seem to be very close to a resolution 
(see PROPOSALS page), allowing repeated properties. With more time, we 
could have closed that issue today.

> Constraints and shapes are different, leading to
> verbose syntax, even for an RDF encoding.

This is (mostly) ISSUE-135. Merging shapes and constraints introduces 
new problems and throws things together that do not really belong together.

I assume you want to use this email (and similar ones) to make a case 
for your Proposal 4. I have enumerated several serious problems with 
that proposal, but you have not responded to them. Do you seriously 
believe that once we switch to your proposal then suddenly all issues 
will go away, and we will not discover many new problems? The current 
syntax has been around for quite a while now and many people around the 
world have worked with it. I personally have in-depth experience with 
this approach now and like it a lot. I don't see "fundamental" problems 
other than that we are progressing too slowly.

Holger
Received on Friday, 8 April 2016 00:21:17 UTC