Re: comments on current version of SHACL document from Peter F. Patel-Schneider on 2015-09-19 (public-data-shapes-wg@w3.org from September 2015)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Fri, 18 Sep 2015 19:56:44 -0700
To: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg@w3.org
Message-ID: <55FCCEEC.9080508@gmail.com>
On 9/17/2015 1:28, Peter F. Patel-Schneider wrote:
> > I took a quick look at the version of the document current on 15 September,
> > concentrating mostly on Sections 1-6.
> >
> > The document is looking better, but there are still several significant
> > problems.
>
> Thanks, see my responses below.
>
> As an aside, I find it unhelpful to read hyperboles such as "significant
> problems" for things that are relatively easy to fix. We are currently just
> talking about a First Public Working Draft, not a 100% perfect
> specification.

I view several of the problems that I found to be significant.  You can
protest all you want, but don't expect me to stop using words that are
neither hyperbolic nor inflammatory.

> >
> > - With the new emphasis on SPARQL, there should be a part of Section 1 that
> >    introduces the use of SPARQL as the definition of SHACL.  This would
> >    include some of the stuff from the beginning of Section 3.
>
> Done.
>
> >    There needs to
> >    be more information on how SPARQL is used in the definition of SHACL in
> >    the discussion that is currently at the beginning of Section 3, such as
> >    how the results of the queries are combined.  This would also discuss the
> >    problem with blank nodes.
>
> I have attempted to formulate a paragraph on blank nodes, but marked it
> >with a red TODO because the wording may not use the correct
> >terminology. I would appreciate input (I believe Eric may have the right
> >references here).

Although more needs to be done here, I think that it is acceptable for now.

> >    As well, sh:hasShape needs to be better described.
>
> I added a bit on that; not sure what else is missing. The key feature of
> >that definition is the reference to the validateNodeAgainstShape
> >operation, which would describe the details.

This doesn't work, as validateNodeAgainstShape depends on sh:hasShape.

As well, the recursion as error handling doesn't appear to be specified in a
reasonable manner.   One of the calls doesn't even specify a value for this
variable.  The calls for and and or set it naively, without regards for
whether there is a negated loop.

> >    Several SPARQL queries provided require that the shapes graph
> >    be accessible.  As this is not guaranteed, there needs to be an
> >    explanation of what is going on.  It would also be better to have SPARQL
> >    definitions for more of SHACL, such as scopes.  (This would require moving
> >    the details of using SPARQL to define SHACL earlier in the document.)
>
> All done, I believe.
>
> >
> > - The handling of results is poorly defined.  There is no discussion of how
> >    the results of embedded constraints and shapes are to be handled.  This
> >    needs to be cleaned up before FPWD publication.
>
> I have added statements that clarify that these nested results are just
> temporary.

I think that much more needs to be done to introduce result handling.

> > - With the user-friendly syntax, shapes do not necessarily need to be in an
> >    RDF graph.  I think that this means that the early part of the document
> >    should not say "shapes graph", but instead something like "shapes".  Then
> >    the document can say "SHACL shapes are often encoded in an RDF graph,
> >    which is called the shapes graph."  Then later on it can say "shapes
> >    encoded as an RDF graph" where necessary.  I don't know what should be
> >    done for constructs that are not going to be part of the user-friendly
> >    syntax.
>
> This is not my understanding of how SHACL works. I believe the SHACL spec
> always assumes that the shapes are represented in RDF, and in a
> dedicated shapes graph, using exactly the specified vocabulary. If
> someone wants to use another (compact) syntax then these syntaxes need
> to be translated into RDF triples prior to execution.

This is an unnecessary step.  It should not be required.

> > - SHACL is not an RDF vocabulary.  It is a language that can be encoded in
> >    RDF, and that uses a particular vocabulary in the encoding.  Any time
> >    SHACL shapes are discussed as being part of an RDF graph, careful
> >    attention needs to be paid ot the wording used so as to not give incorrect
> >    information.

> My understanding of SHACL is that SHACL *defines* an RDF vocabulary. Our
> whole spec references IRIs from that vocabulary such as sh:minCount. I
> have tried to make it clear in previous edits that SHACL is not just a
> vocabulary, and would appreciate specific pointers if you believe this is
> still misleading.

I will do another pass on this after FPWD publication.

> > - What happens with cyclic shapes references that do not involve a property
> >    constraint?  Are these handled the same as cyclic references that do
> >    involve a property constraint?
>
> Yes, the same way. The recursion stops if it encounters the same
> shape/focusNode combination. Having said this, the handling of
> sh:valueShape differs from the others such as sh:AndConstraint: they pass
> in another argument to the recursionIsError argument. The effect of this
> is that sh:valueShape will handle recursion as "true" while others will
> handle them as "failure".

So recursion inside an and is a failure?  This does not seem to be
reasonable in general.

> > - All uses of RDFS notions, e.g., subclasses, should be qualified, e.g.,
> >    SHACL subclasses.
>
> To me this feels a bit pedantic. Isn't this handled by section 1.1 (which
> you wrote), i.e. we do a broad statement in the beginning so that we don't
> need to repeat the same things over and over again?

There are going to be many readers of the document that are coming from an
RDF and RDFS background.  I think that the document needs to hammer in the
differences so that such people do not get the wrong idea now and complain
bitterly later.

> > - The relationship between shapes and constraints is poorly explained.  A
> >    shape has a bunch of constraints, which together serve to define the
> >    shape.  Constraints are not just validated against the same focus nodes.
>
> I have tried to improve the wording.

I think that the problem is still that the way that SHACL is put together is
difficult to describe, and comprehend.  For example, shape scopes are used
sometimes when shapes are considered but not at other times.  A cleaner
design would be easier to describe.

> > - Most of the document is about the definition of SHACL.  There is little or
> >    no need for MUST, etc., in this definition.  Where MUST, etc., should show
> >    up is when describing the behaviour of SHACL implementations.  For
> >    example, a good description of scope combination would be "The scopes of a
> >    SHACL shape are considered additively, so, for example, in a shape with
> >    two individual scopes both individuals are in the scope of the shape."
> >    with no MUST, etc., needed.
>
> Ok, I didn't know that (and wonder why MUST exists at all). I have now
> tried to limit MUST to sentences about the implementations.

Things are better now.  It is guaranteed that more wordsmithing will be
needed down the line.

> >
> > - The description of the various bits of property constraints obscures the
> >    independence of some of the bits.  For example, splitting sh:minCount and
> >    sh:maxCount would eliminate talk about defaults.

> I mainly mentioned the default value of sh:minCount to clarify it,
> because we had discussed in the past that ShEx uses 1 as default. min and
> maxCount are grouped together because the computation of counts is a
> potentially expensive operation and shouldn't be done twice. You might say
> this is an implementation detail, but in order to be able to apply this
> optimization, they need to be in the same template and thus SPARQL
> query. I think there should be strong reasons if we want to deviate from
> the template mapping. Eliminating the talk about defaults doesn't strike
> me as a strong reason. (Another small difference is that the number of
> results may be different if we split them up. Right now there will always
> only be one constraint violation, even if both are violated, due to a
> modeling error).

> > There are few things that would make the document better, but that are
> > certainly not needed immediately.
> >
> > - It would be nice to have an option to hide the SPARQL definitions.
>
> Yep, will do at some stage.
>
> >
> > - It might be better to turn Section 2.3 into the beginning of Section 3.

>
> This doesn't make sense to me because 3 is only about the Core constraint
> types, while 2.3 is about the general mechanism (including a brief
> reference to the extension mechanism). Also, how can we have a section on
> Shapes without mentioning Constraints.

>
> Details of my recent edits are here:
>
>
https://github.com/w3c/data-shapes/commit/607e758fbf6f633ab36fcdb4d59df01ccdad1699
>
> Thanks again,
> Holger
Received on Saturday, 19 September 2015 02:57:18 UTC