Re: comments on current version of SHACL document

On Thu, Sep 17, 2015 at 5:38 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

> On 9/17/2015 1:28, Peter F. Patel-Schneider wrote:
>
>> I took a quick look at the version of the document current on 15
>> September,
>> concentrating mostly on Sections 1-6.
>>
>> The document is looking better, but there are still several significant
>> problems.
>>
>
> Thanks, see my responses below.
>
> As an aside, I find it unhelpful to read hyperboles such as "significant
> problems" for things that are relatively easy to fix. We are currently just
> talking about a First Public Working Draft, not a 100% perfect
> specification.
>
>
>> - With the new emphasis on SPARQL, there should be a part of Section 1
>> that
>>    introduces the use of SPARQL as the definition of SHACL.  This would
>>    include some of the stuff from the beginning of Section 3.
>>
>
> Done.
>
>    There needs to
>>    be more information on how SPARQL is used in the definition of SHACL in
>>    the discussion that is currently at the beginning of Section 3, such as
>>    how the results of the queries are combined.  This would also discuss
>> the
>>    problem with blank nodes.
>>
>
> I have attempted to formulate a paragraph on blank nodes, but marked it
> with a red TODO because the wording may not use the correct terminology. I
> would appreciate input (I believe Eric may have the right references here).
>
>    As well, sh:hasShape needs to be better described.
>>
>
> I added a bit on that; not sure what else is missing. The key feature of
> that definition is the reference to the validateNodeAgainstShape operation,
> which would describe the details.
>
>    Several SPARQL queries provided require that the shapes graph
>>    be accessible.  As this is not guaranteed, there needs to be an
>>    explanation of what is going on.  It would also be better to have
>> SPARQL
>>    definitions for more of SHACL, such as scopes.  (This would require
>> moving
>>    the details of using SPARQL to define SHACL earlier in the document.)
>>
>
> All done, I believe.
>
>
>> - The handling of results is poorly defined.  There is no discussion of
>> how
>>    the results of embedded constraints and shapes are to be handled.  This
>>    needs to be cleaned up before FPWD publication.
>>
>
> I have added statements that clarify that these nested results are just
> temporary.
>
>
>> - With the user-friendly syntax, shapes do not necessarily need to be in
>> an
>>    RDF graph.  I think that this means that the early part of the document
>>    should not say "shapes graph", but instead something like "shapes".
>> Then
>>    the document can say "SHACL shapes are often encoded in an RDF graph,
>>    which is called the shapes graph."  Then later on it can say "shapes
>>    encoded as an RDF graph" where necessary.  I don't know what should be
>>    done for constructs that are not going to be part of the user-friendly
>>    syntax.
>>
>
> This is not my understanding of how SHACL works. I believe the SHACL spec
> always assumes that the shapes are represented in RDF, and in a dedicated
> shapes graph, using exactly the specified vocabulary. If someone wants to
> use another (compact) syntax then these syntaxes need to be translated into
> RDF triples prior to execution.
>
>
>> - SHACL is not an RDF vocabulary.  It is a language that can be encoded in
>>    RDF, and that uses a particular vocabulary in the encoding.  Any time
>>    SHACL shapes are discussed as being part of an RDF graph, careful
>>    attention needs to be paid ot the wording used so as to not give
>> incorrect
>>    information.
>>
>
> My understanding of SHACL is that SHACL *defines* an RDF vocabulary. Our
> whole spec references IRIs from that vocabulary such as sh:minCount. I have
> tried to make it clear in previous edits that SHACL is not just a
> vocabulary, and would appreciate specific pointers if you believe this is
> still misleading.
>
>
>> - What happens with cyclic shapes references that do not involve a
>> property
>>    constraint?  Are these handled the same as cyclic references that do
>>    involve a property constraint?
>>
>
> Yes, the same way. The recursion stops if it encounters the same
> shape/focusNode combination. Having said this, the handling of
> sh:valueShape differs from the others such as sh:AndConstraint: they pass
> in another argument to the recursionIsError argument. The effect of this is
> that sh:valueShape will handle recursion as "true" while others will handle
> them as "failure".
>
>
>> - All uses of RDFS notions, e.g., subclasses, should be qualified, e.g.,
>>    SHACL subclasses.
>>
>
> To me this feels a bit pedantic. Isn't this handled by section 1.1 (which
> you wrote), i.e. we do a broad statement in the beginning so that we don't
> need to repeat the same things over and over again?
>
>
>> - The relationship between shapes and constraints is poorly explained.  A
>>    shape has a bunch of constraints, which together serve to define the
>>    shape.  Constraints are not just validated against the same focus
>> nodes.
>>
>
> I have tried to improve the wording.
>
>
>> - Most of the document is about the definition of SHACL.  There is little
>> or
>>    no need for MUST, etc., in this definition.  Where MUST, etc., should
>> show
>>    up is when describing the behaviour of SHACL implementations.  For
>>    example, a good description of scope combination would be "The scopes
>> of a
>>    SHACL shape are considered additively, so, for example, in a shape with
>>    two individual scopes both individuals are in the scope of the shape."
>>    with no MUST, etc., needed.
>>
>
> Ok, I didn't know that (and wonder why MUST exists at all). I have now
> tried to limit MUST to sentences about the implementations.
>
>
>> - The description of the various bits of property constraints obscures the
>>    independence of some of the bits.  For example, splitting sh:minCount
>> and
>>    sh:maxCount would eliminate talk about defaults.
>>
>
> I mainly mentioned the default value of sh:minCount to clarify it, because
> we had discussed in the past that ShEx uses 1 as default. min and maxCount
> are grouped together because the computation of counts is a potentially
> expensive operation and shouldn't be done twice. You might say this is an
> implementation detail, but in order to be able to apply this optimization,
> they need to be in the same template and thus SPARQL query. I think there
> should be strong reasons if we want to deviate from the template mapping.
> Eliminating the talk about defaults doesn't strike me as a strong reason.
> (Another small difference is that the number of results may be different if
> we split them up. Right now there will always only be one constraint
> violation, even if both are violated, due to a modeling error).


If the grouping here is meant to enforce it on the implementations then I
am also not in favor of it.
The most common uses cases (0,1, no value) can be translated to a lot less
complex queries and greater optimization can be achieved in those cases



>
>
>
>> There are few things that would make the document better, but that are
>> certainly not needed immediately.
>>
>> - It would be nice to have an option to hide the SPARQL definitions.
>>
>
> Yep, will do at some stage.
>
>
>> - It might be better to turn Section 2.3 into the beginning of Section 3.
>>
>
> This doesn't make sense to me because 3 is only about the Core constraint
> types, while 2.3 is about the general mechanism (including a brief
> reference to the extension mechanism). Also, how can we have a section on
> Shapes without mentioning Constraints.
>
> Details of my recent edits are here:
>
>
> https://github.com/w3c/data-shapes/commit/607e758fbf6f633ab36fcdb4d59df01ccdad1699
>
> Thanks again,
> Holger
>
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://http://aligned-project.eu,
http://rdfunit.aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org

Received on Friday, 18 September 2015 08:44:18 UTC