Re: minor syntax fixes

Hi Peter,

On 22/03/2017 8:57, Peter F. Patel-Schneider wrote:
> There are several minor problems with the SHACL Core syntax that need to be
> fixed.  Most of the problems make shapes graphs illegal that should be
> legal.

"need to be fixed" is IMHO a bit strong, but I can live with "should be 
fixed".

>
>
> The syntax of paths is too restrictive as it disallows extra triples on many
> (but not all) path nodes.  This forbids comments in useful places, like in
> the SHACL property path below
>
> ex:PathShape rdf:type sh:PropertyShape ;
>   rdfs:comment "Inverse of p has to be C" ;
>   sh:path [ rdfs:comment "Inverse Path" ; sh:inversePath ex:p ] ;
>   sh:class ex:C .
>
> The wording in the ASHACL Property Paths section of
> https://arxiv.org/abs/1702.01795 permits extra triples on path nodes and
> thus provides a better definition for SHACL property paths.

There are pros and cons of such a change. Yes, theoretically someone may 
want to attach extra triples, but is this really a strong requirement? 
I'd argue it should be sufficient to put such a comment into the 
surrounding node. A downside of allowing this is that it raises the 
costs for development tools: in our own products we allow users to enter 
paths in SPARQL 1.1 surface syntax, and if we allow other triples on 
path nodes, people may expect those to be preserved, which would be 
prohibitively expensive.
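
As a rough sketch of what I mean, Peter's example could carry the path 
comment on the property shape itself, so that the path node keeps a 
single triple. The ex: namespace IRI and the comment wording are 
assumptions made up for this illustration:

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.com/ns#> .   # assumed namespace

ex:PathShape
  a sh:PropertyShape ;
  # The remark about the path lives on the surrounding shape node.
  rdfs:comment "Inverse of p has to be C (the path is the inverse of ex:p)" ;
  # The path node carries nothing but the path triple.
  sh:path [ sh:inversePath ex:p ] ;
  sh:class ex:C .
```

In SPARQL 1.1 surface syntax this path is simply ^ex:p, which has no 
place to hold an extra triple.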

As I am not convinced of the benefits and due to the late stage in the 
process (and the risk of introducing regression bugs) I would prefer to 
keep the path syntax as-is.


I do agree with most of the other rules below, as they would allow us to 
define narrower meta-shapes than we would otherwise have to 
implement. Note that all of this is in the nice-to-have category, so if 
anyone sees issues here, I'd be happy to roll back.

>
>
> Some syntax checks go beyond checking syntax of information associated with
> shapes.  They should only be performed on shapes.
>
> This makes odd but harmless triples illegal, such as
>
> ex:n3 sh:severity 7 .
>
> ex:n5 sh:message 0 .
>
> ex:n4 sh:deactivated 0 .
>
> ex:n1 sh:path ex:p1 , ex:p2 .
>
> ex:n2 sh:path [ rdfs:comment "Not a path" ] .
>
> The following changes to the syntax rules will fix these problems.
>
> severity-nodeKind  Each value for sh:severity in a shape is an IRI.

Done.

>
> message-datatype  Each value for sh:message in a shape is either
>    a xsd:string literal or a literal with a language tag.

Done, but note that sh:message is also allowed for SPARQL-based 
constraints so I had to formalize a similar syntax rule there.

>    
> deactivated-datatype  Each value for sh:deactivated in a shape is
>    either true or false.

Done.

>
> path-maxCount  A shape has at most one value for sh:path.

Done.

>
> path-shape  Each value for sh:path in a shape is a well-formed
>   SHACL property path.

Done, although I have kept "must be a ..." because some users might 
otherwise interpret it as "each value of sh:path is well-formed".
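
For illustration, the rules accepted so far could be captured in a 
meta-shape roughly like the one below. This is only a sketch, not the 
text in the spec: ex:ShapeSyntaxShape and the ex: namespace are made-up 
names, and targeting by class only reaches nodes that are explicitly 
typed as shapes. The path-shape rule is left out because it needs a 
recursive definition of well-formed paths.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.com/ns#> .   # assumed namespace

# Hypothetical meta-shape; the name is illustrative only.
ex:ShapeSyntaxShape
  a sh:NodeShape ;
  sh:targetClass sh:NodeShape , sh:PropertyShape ;
  # severity-nodeKind: each value is an IRI
  sh:property [ sh:path sh:severity ; sh:nodeKind sh:IRI ] ;
  # message-datatype: xsd:string or language-tagged literal
  sh:property [
    sh:path sh:message ;
    sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] )
  ] ;
  # deactivated-datatype: either true or false
  sh:property [ sh:path sh:deactivated ; sh:in ( true false ) ] ;
  # path-maxCount: at most one value for sh:path
  sh:property [ sh:path sh:path ; sh:maxCount 1 ] .
```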

>
>
> One syntax rule is missing, allowing some misleading syntax that should be
> disallowed, as in
>
> ex:s1 sh:uniqueLang true, false .
>
> The following addition to the syntax rules will fix this problem.
>
> uniqueLang-maxCount  A shape has at most one value for sh:uniqueLang.

Done, specifically for property shapes (node shapes cannot have these 
values anyway). Note that there are several other constraint components 
that really should have only one value, e.g. sh:datatype. I would argue 
that multiple values make sense for only a small number of the core 
components (sh:class, sh:property, sh:node and the logical 
operators). So for now I have added similar maxCount=1 rules to

- sh:datatype
- sh:nodeKind
- sh:minCount
- sh:maxCount
- sh:minInclusive, sh:minExclusive, sh:maxInclusive, sh:maxExclusive
- sh:minLength
- sh:maxLength
- sh:languageIn
- sh:in

Background is that I expect many people to try to use multiple 
sh:datatype values to express "or" (already happened). By having those 
restrictions in place, tools can help users avoid these mistakes.
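
To illustrate the mistake, here is a minimal sketch. The shape names, 
ex:value and the ex: namespace are made up for this example:

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.com/ns#> .   # assumed namespace

# Intended as "string or integer", but every sh:datatype value applies
# to each value node, so no literal can conform to both.
# Under the new maxCount=1 rule this shape is ill-formed.
ex:MistakenShape
  a sh:PropertyShape ;
  sh:path ex:value ;
  sh:datatype xsd:string , xsd:integer .

# The disjunction is expressed with sh:or instead.
ex:IntendedShape
  a sh:PropertyShape ;
  sh:path ex:value ;
  sh:or ( [ sh:datatype xsd:string ] [ sh:datatype xsd:integer ] ) .
```

A tool that checks the maxCount rule can flag the first shape and point 
the user to the second form.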

I will also raise this in tonight's WG meeting to double-check whether 
anyone sees problems with this change. Meanwhile, if anyone sees 
problems with the changes above (real use cases that are no longer 
allowed), let me know ASAP.

>
>
> Many syntax rules state that they are for any node but it turns out that
> they can be stated for shapes only without making any change in SHACL
> syntax.  Changing these rules to be for shapes results in a more natural set
> of rules.  The rules in question look like
>    Each value of XXX is ...
> or
>    The values of XXX are ...
> where XXX is a parameter or a property related to targeting.  They can be
> changed to
>    Each value for XXX in a shape is ...

All done (hopefully), see

https://github.com/w3c/data-shapes/commit/d92aaa7b6fd2e7a196e01fec8f19be7d385a5ae1

Would appreciate double-checking as we are trying to move to CR soon.

Thanks,
Holger

Received on Wednesday, 22 March 2017 01:12:36 UTC