Re: ISSUE-135: Proposed changes to implement syntax simplification from Holger Knublauch on 2016-05-10 (public-data-shapes-wg@w3.org from May 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Wed, 11 May 2016 08:49:17 +1000
To: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Message-ID: <3bc41012-1fca-6976-12fb-b02549c843b5@topquadrant.com>
See also the current ISSUE-133 thread. I had enumerated various reasons 
why I would object to merging constraints and shapes in other threads.

A suggestion that I haven't seen yet is that technically both worlds 
could be brought closer together by stating

     sh:Shape rdfs:subClassOf sh:NodeConstraint .

But the situation is similar in OWL, where people can either say

ex:Class
     a owl:Class ;
     owl:oneOf ( ... ) .

or

ex:Class
     a owl:Class ;
     owl:equivalentClass [
         owl:oneOf ( ... )
     ] .

As someone who worked on various OWL related tools over the years, I can 
report that having too many syntaxes for the same thing is causing a lot 
of pain. In the case of shapes vs constraints the situation gets worse 
because these are really different concepts, with different properties. 
For example, why should (Node)Constraints have sh:property again, while 
sh:name only applies to PropertyConstraints?

Given that we are merely talking about some syntactic sugar in 
hand-written Turtle files for certain operators such as sh:or, I 
actually believe it would be wiser to not offer all these options, and 
continue to require the "striping" approach that we have in the current 
spec. Yes, this makes the Turtle a bit longer, but there are tools and 
other compact syntaxes that could easily address this.

For sh:or the main thing to support was that it should also be 
applicable to property constraints so that people don't need to repeat 
the sh:predicate. That alone already improves on the previous syntax, 
going from:

ex:MyShape
     a sh:Shape ;
     sh:constraint [
         sh:or (
             [ a sh:Shape ;
                 sh:property [
                     sh:predicate schema:address ;
                     sh:datatype xsd:string ;
                 ] ;
             ]
             [ a sh:Shape ;
                 sh:property [
                     sh:predicate schema:address ;
                     sh:class schema:Address ;
                 ] ;
             ]
         )
     ]

to

ex:MyShape
     a sh:Shape ;
     sh:property [
         sh:predicate schema:address ;
         sh:or (
             [
                 sh:constraint [
      sh:datatype xsd:string ;
                 ] ;
             ]
             [
                 sh:constraint [
                     sh:class schema:Address ;
                 ] ;
             ]
         )
     ]

I believe that would be an acceptable syntax while all presented 
alternatives have their individual drawbacks. (And we'll probably have 
sh:typeIn for the specific case above.) With this I'd like to withdraw 
this proposal here and take baby steps for now, generalizing sh:or to 
also apply for property constraints.
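
To make the intended semantics concrete, here is a minimal sketch in 
Python. It assumes a toy in-memory set of triples rather than any real 
RDF library, and the helper names (literal, iri, conforms, 
validate_property_or) are made up for illustration; none of this is 
spec text. It only covers sh:datatype and sh:class, and the class check 
ignores subclasses.

```python
# Toy model only: terms are tuples, the data graph is a set of (s, p, o) triples.
# All names here are hypothetical and not taken from the SHACL spec.

def literal(value, datatype):
    return ("literal", value, datatype)

def iri(name):
    return ("iri", name)

def conforms(node, shape, data):
    """True if node satisfies every node constraint listed in the shape."""
    for constraint in shape["constraints"]:
        if "datatype" in constraint:
            # Must be a literal carrying exactly that datatype.
            if node[0] != "literal" or node[2] != constraint["datatype"]:
                return False
        if "class" in constraint:
            # Simplification: direct rdf:type only, no rdfs:subClassOf walk.
            if (node, "rdf:type", iri(constraint["class"])) not in data:
                return False
    return True

def validate_property_or(focus, predicate, operands, data):
    """Return the values of predicate on focus that match none of the operands."""
    values = [o for (s, p, o) in data if s == focus and p == predicate]
    return [v for v in values
            if not any(conforms(v, shape, data) for shape in operands)]

# The two operands of sh:or from the example: a string, or a schema:Address.
operands = [
    {"constraints": [{"datatype": "xsd:string"}]},
    {"constraints": [{"class": "schema:Address"}]},
]

data = {
    (iri("ex:alice"), "schema:address", literal("1 Main St", "xsd:string")),
    (iri("ex:bob"), "schema:address", iri("ex:addr1")),
    (iri("ex:addr1"), "rdf:type", iri("schema:Address")),
    (iri("ex:carol"), "schema:address", literal("42", "xsd:integer")),
}

violations = validate_property_or(iri("ex:carol"), "schema:address", operands, data)
```

The point is that the predicate is stated once on the property 
constraint, and each operand is only evaluated against the values of 
that predicate.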

Holger



On 11/05/2016 6:27, Peter F. Patel-Schneider wrote:
> Why not also allow any component that is currently allowed in node constraints
> to occur in shapes?
>
> This permits
>
> ex:MyShape a sh:Shape ;
>          sh:or (
>              [ a sh:NodeConstraint ;
>                  sh:datatype xsd:string ;
>              ]
>              [ a sh:NodeConstraint ;
>                  sh:class schema:Address ;
>              ]
>          ) .
>
> instead of the current
>
> ex:MyShape a sh:Shape ;
>      sh:constraint [
>          sh:or (
>              [ a sh:Shape ;
>                  sh:constraint [
>                      a sh:NodeConstraint ;
>                      sh:datatype xsd:string ;
>                  ] ;
>              ]
>              [ a sh:Shape ;
>                  sh:constraint [
>                      a sh:NodeConstraint ;
>                      sh:class schema:Address ;
>                  ]
>              ]
>          )
>      ] .
>
> peter
>
>
> On 05/08/2016 11:16 PM, Holger Knublauch wrote:
>> I have meanwhile had a bit more time for this topic and here is my updated
>> proposal (which is also much simpler than before).
>>
>> Several WG members supported the idea of allowing constraints to be used as
>> values in places such as sh:or. I was asked to make some specific suggestions
>> on what would need to be changed in the spec, so that the following syntax
>> options would behave identically. (Both scenarios state that the values of
>> schema:address must be string literals or instances of schema:Address):
>>
>> a) Currently supported: sh:or can only be used with sh:NodeConstraints and
>> operands of sh:or must be shapes
>>
>> ex:MyShape
>>      a sh:Shape ;
>>      sh:constraint [
>>          sh:or (
>>              [ a sh:Shape ;
>>                  sh:property [
>>                      sh:predicate schema:address ;
>>                      sh:datatype xsd:string ;
>>                  ] ;
>>              ]
>>              [ a sh:Shape ;
>>                  sh:property [
>>                      sh:predicate schema:address ;
>>                      sh:class schema:Address ;
>>                  ] ;
>>              ]
>>          )
>>      ]
>>
>> which lacks on multiple fronts - it is too verbose and also forces repetition
>> of the predicate.
>>
>> b) Proposed: generalize sh:or and values of sh:or may be sh:NodeConstraints:
>>
>> ex:MyShape
>>      a sh:Shape ;
>>      sh:property [
>>          sh:predicate schema:address ;
>>          sh:or (
>>              [ sh:datatype xsd:string ]
>>              [ sh:class schema:Address ]
>>          )
>>      ]
>>
>> In this proposal, the members of the sh:or List may be sh:NodeConstraints or
>> sh:Shapes.
>>
>> Required changes (all incremental to current spec):
>>
>> 1) Rename sh:hasShape from sh:hasShape(?node, ?shape, ?shapesGraph) to
>>
>>      sh:validateNode(?node, ?shapeOrConstraint, ?shapesGraph)
>>
>> The algorithm would be changed to
>>
>> a) if ?shapeOrConstraint rdf:type sh:Shape, then behave as currently
>> b) otherwise, assume rdf:type sh:NodeConstraint
>> c) report failure if the node has rdf:type that is neither sh:Shape nor
>> sh:NodeConstraint.
>>
>> The name sh:validateNode is better than sh:hasShape because it may also return
>> unbound. I have no strong opinion whether we should assume sh:Shape or
>> sh:NodeConstraint as default here - it really depends on what case we consider
>> more frequent and what kind of syntactic sugar we want to provide.
>>
>> 2) Generalize sh:or to also have contexts: sh:PropertyConstraint and
>> sh:InversePropertyConstraint. The validators are almost identical to the
>> current one, simply calling sh:validateNode on each value.
>>
>> The same approach would work for sh:and and sh:not. I guess also for
>> sh:valueShape if that's desirable.
>>
>> Regards,
>> Holger
>>
>>
>> On 5/05/2016 22:21, Holger Knublauch wrote:
>>> Too quick: I believe there is a glitch in the algorithm below and I need to
>>> think more about the implementation details. As stated it would walk the
>>> properties of a property value, which is incorrect. Maybe the list values
>>> need to be interpreted as sh:NodeConstraints only. Please ignore for now.
>>>
>>> Holger
>>>
>>>
>>> On 5/05/2016 12:49, Holger Knublauch wrote:
>>>> Several WG members supported the idea of allowing constraints to be used as
>>>> values in places such as sh:or. I was asked to make some specific
>>>> suggestions on what would need to be changed in the spec, so that the
>>>> following syntax options would behave identically. (Both scenarios state
>>>> that the values of schema:address must be string literals or instances of
>>>> schema:Address):
>>>>
>>>> a) Currently supported: sh:or can only be used with sh:NodeConstraints and
>>>> operands of sh:or must be shapes
>>>>
>>>> ex:MyShape
>>>>      a sh:Shape ;
>>>>      sh:constraint [
>>>>          sh:or (
>>>>              [ a sh:Shape ;
>>>>                  sh:property [
>>>>                      sh:predicate schema:address ;
>>>>                      sh:datatype xsd:string ;
>>>>                  ] ;
>>>>              ]
>>>>              [ a sh:Shape ;
>>>>                  sh:property [
>>>>                      sh:predicate schema:address ;
>>>>                      sh:class schema:Address ;
>>>>                  ] ;
>>>>              ]
>>>>          )
>>>>      ]
>>>>
>>>> which lacks on multiple fronts - it is too verbose and also forces
>>>> repetition of the predicate.
>>>>
>>>> b) Proposed: generalize sh:or and values of sh:or may be constraints of the
>>>> same kind as the surrounding constraint.
>>>>
>>>> ex:MyShape
>>>>      a sh:Shape ;
>>>>      sh:property [
>>>>          sh:predicate schema:address ;
>>>>          sh:or (
>>>>              [ sh:datatype xsd:string ]
>>>>              [ sh:class schema:Address ]
>>>>          )
>>>>      ]
>>>>
>>>> In this proposal, the members of the sh:or List may be
>>>> sh:PropertyConstraints if sh:or is used within a sh:PropertyConstraint.
>>>>
>>>> Required changes (all incremental to current spec):
>>>>
>>>> 1) Generalize sh:hasShape from sh:hasShape(?node, ?shape, ?shapesGraph) to
>>>>
>>>>      sh:validateNode(?node, ?shapeOrConstraint, ?shapesGraph,
>>>> ?defaultConstraintType, ?defaultPredicate)
>>>>
>>>> The two arguments at the end are optional, and are used to complement the
>>>> provided ?shapeOrConstraint unless it is a sh:Shape. Legal values for
>>>> ?defaultConstraintType would be sh:PropertyConstraint,
>>>> sh:InversePropertyConstraint and sh:NodeConstraint. ?defaultPredicate is
>>>> only supported if ?defaultConstraintType is given and != sh:NodeConstraint.
>>>>
>>>> The algorithm would be
>>>>
>>>> a) if ?shapeOrConstraint rdf:type sh:Shape, then behave as currently
>>>> b) otherwise, assume ?defaultConstraintType (unless the node has an rdf:type)
>>>>      and assume ?defaultPredicate for sh:predicate.
>>>> c) report failure if the node has rdf:type that is neither sh:Shape nor
>>>> ?defaultConstraintType.
>>>>
>>>> While this function isn't pretty it's mostly used internally anyway and may
>>>> therefore be regarded as an implementation detail. The name sh:validateNode
>>>> is better than sh:hasShape because it may also return unbound.
>>>>
>>>> 2) Generalize sh:or to also have contexts: sh:PropertyConstraint and
>>>> sh:InversePropertyConstraint
>>>>
>>>> 3) Add a sh:propertyValidator to sh:OrConstraint similar to what we have as
>>>> sh:nodeValidator, but with the sh:validateNode function:
>>>>
>>>> SELECT $this ?failure ...
>>>> WHERE {
>>>>     {
>>>>         $this $predicate ?value .
>>>>     }
>>>>     {
>>>>         SELECT (SUM(?s) AS ?count)
>>>>         WHERE {
>>>>             GRAPH $shapesGraph {
>>>>                 $or rdf:rest*/rdf:first ?shape .
>>>>             }
>>>>             BIND (sh:validateNode(?value, ?shape, $shapesGraph, sh:PropertyConstraint, $predicate) AS ?valid) .
>>>>             BIND (IF(bound(?valid), IF(?valid, 1, 0), 'error') AS ?s) .
>>>>         }
>>>>     }
>>>>     BIND (!bound(?count) AS ?failure) .
>>>>     FILTER IF(?failure, true, ?count = 0) .
>>>> }
>>>>
>>>> and similar for sh:inversePropertyValidator. The same approach would work
>>>> for sh:and and sh:not. I guess also for sh:valueShape if that's desirable.
>>>>
>>>> Regards,
>>>> Holger
>>>>
Received on Tuesday, 10 May 2016 23:15:23 UTC