Re: formal objection to SHACL property path syntax

On 3/05/2017 12:25, Sandro Hawke wrote:
>
>
> On 05/02/2017 08:54 PM, Holger Knublauch wrote:
>>
>>
>> On 3/05/2017 10:42, Irene Polikoff wrote:
>>> I think the main issue here is one of timing.
>>>
>>> If we are to make the change Peter is asking for, we will need to 
>>> re-do the CR. I don’t know how quickly this could happen. I assume 
>>> this means another transition meeting? Then, there will be a need 
>>> for updates to implementations, new tests, etc.
>>>
>
> If it's just a change to clarify, including to state something 
> explicitly that was implicit before, that wouldn't generally need a 
> new CR.
>
> I can't claim to understand this issue, but it seems reasonable to 
> assume all uses of the rdf list vocabulary in a shapes graph would be 
> in well-formed lists, ie what you can get with Turtle's "collections" 
> syntax; that is, (a b c).  If that's been the assumption, then I think 
> we could make that explicit without needing a new CR.

No, the SHACL spec is well-behaved and well-defined on this issue, and 
even includes its own definition of well-formed lists (aka SHACL lists). 
The "issue" raised here is that the RDF nodes of such lists may have 
other triples (such as an rdfs:comment or, as shown, sh:inversePath), and 
then a human reader may get confused about what is meant, or an 
implementation may contain a programming error and accidentally interpret 
this as an inverse path. But such programming errors should now be caught 
because this scenario is covered by two test cases. And as for human 
misunderstandings, well, there are any number of similar examples that 
someone mean could construct to mislead humans. Yet the chances of ever 
seeing such a scenario outside of the sometimes bizarre world of W3C 
processes are extremely small.
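
To make that concrete, here is a minimal sketch in Python of the SHACL 
list definition quoted further down in this thread. It is not taken from 
the spec or from any implementation: the representation of a graph as a 
set of (subject, predicate, object) strings and the helper names values, 
is_shacl_list and members are assumptions made purely for illustration, 
and the sketch does not distinguish literals from IRIs or blank nodes. It 
shows that the definition only looks at rdf:first and rdf:rest, so the 
extra sh:inversePath triple from the ex:s1 example neither breaks the list 
nor changes its members.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def values(graph, subject, predicate):
    """Return all objects of triples (subject, predicate, ?o) in the graph."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


def is_shacl_list(graph, node):
    """Check the SHACL list conditions by walking the rdf:rest chain."""
    seen = set()
    while True:
        firsts = values(graph, node, RDF_FIRST)
        rests = values(graph, node, RDF_REST)
        if node == RDF_NIL:
            # rdf:nil must have no value for rdf:first or rdf:rest.
            return not firsts and not rests
        # Exactly one rdf:first, exactly one rdf:rest, and no cycle.
        if node in seen or len(firsts) != 1 or len(rests) != 1:
            return False
        seen.add(node)
        node = rests[0]


def members(graph, node):
    """Return the members of a node already known to be a SHACL list."""
    out = []
    while node != RDF_NIL:
        out.append(values(graph, node, RDF_FIRST)[0])
        node = values(graph, node, RDF_REST)[0]
    return out


# The sh:path node of ex:s1 from the objection, with its extra triple.
graph = {
    ("_:b1", RDF_FIRST, "ex:p"),
    ("_:b1", RDF_REST, "_:b2"),
    ("_:b1", "sh:inversePath", "ex:q"),  # ignored by the list definition
    ("_:b2", RDF_FIRST, "ex:q"),
    ("_:b2", RDF_REST, RDF_NIL),
}

print(is_shacl_list(graph, "_:b1"))
print(members(graph, "_:b1"))
```

Under these assumptions _:b1 is a SHACL list with the two members ex:p and 
ex:q, which is why the path in ex:s1 counts as a sequence path regardless 
of the additional triple.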

>   (Sadly, the definition of well-formed list didn't quite make it into 
> RDF 1.1 "as the WG is out of time".  See 
> https://www.w3.org/2011/rdf-wg/track/issues/102. If you read the last 
> email on the topic there, you'll see why we ran out of time.)

LOL! Yes that's an instant classic. Nice summary of the problems of the 
semantic web from the last decade.

Holger


>
>
> It also seems okay to me not to worry about it, since no one else in RDF 
> land does, as far as I know.   Maybe OWL does.
>
>     -- Sandro
>
>>> It seems to me that we do not have time for it.
>>>
>>> Essentially, the spec (potentially) made one small slip.
>>
>> I wouldn't even call this a slip. Producing an rdf:List that has 
>> other triples than rdf:first and rdf:rest is IMHO an entirely 
>> theoretical corner case. The same could be argued in other places 
>> where rdf:Lists are used, e.g. someone could say that
>>
>> ex:MyClass
>>     a owl:Class ;
>>     owl:intersectionOf [
>>         rdf:first ex:OtherClass ;
>>         rdf:rest rdf:nil ;
>>         owl:sameAs rdf:nil ;
>>     ] .
>>
>> is potentially "misleading" and may confuse implementations or 
>> readers. Yet I don't believe anyone would consider such scenarios 
>> worth a *formal objection*, as lists are typically constructed either 
>> in a Turtle, JSON or RDF/XML syntax or by interactive tools that also 
>> prohibit such extra triples.
>>
>> Holger
>>
>>
>>
>>> It may not have said clearly enough that the sequence path can only 
>>> have list members and nothing else. All other paths say “exactly one 
>>> triple”, so there is no issue. Here, we are specifying two triples, 
>>> but we do not say “exactly two”.
>>>
>>> Sequence path is defined as follows:
>>>
>>>     A sequence path is a blank node
>>>     <https://www.w3.org/TR/shacl/#dfn-blank-node> that is a SHACL
>>>     list <https://www.w3.org/TR/shacl/#dfn-shacl-list> with at least
>>>     two members <https://www.w3.org/TR/shacl/#dfn-members> and each
>>>     member is a well-formed
>>>     <https://www.w3.org/TR/shacl/#dfn-well-formed> SHACL property path.
>>>
>>>
>>>     SHACL Lists
>>>     A SHACL list in an RDF graph |G| is an IRI
>>>     <https://www.w3.org/TR/shacl/#dfn-iri> or a blank node
>>>     <https://www.w3.org/TR/shacl/#dfn-blank-node> that is either
>>>     |rdf:nil| (provided that |rdf:nil| has no value
>>>     <https://www.w3.org/TR/shacl/#dfn-value> for either
>>>     |rdf:first| or |rdf:rest|), or has exactly one value
>>>     <https://www.w3.org/TR/shacl/#dfn-value> for the property
>>>     |rdf:first| in |G| and exactly one value
>>>     <https://www.w3.org/TR/shacl/#dfn-value> for the property
>>>     |rdf:rest| in |G| that is also a SHACL list in |G|, and the list
>>>     does not have itself as a value of the property path
>>>     |rdf:rest+| in |G|.
>>>     The members of any SHACL list except |rdf:nil| in an RDF graph
>>>     |G| consist of its value for |rdf:first| in |G| followed by the
>>>     members in |G| of its value for |rdf:rest| in |G|. The SHACL
>>>     list |rdf:nil| has no members in any RDF graph.
>>>
>>>
>>> So, something like the following would clarify things:
>>>
>>>     A sequence path is a blank node
>>>     <https://www.w3.org/TR/shacl/#dfn-blank-node> that is a SHACL
>>>     list <https://www.w3.org/TR/shacl/#dfn-shacl-list> with at least
>>>     two members <https://www.w3.org/TR/shacl/#dfn-members> and each
>>>     member is a well-formed
>>>     <https://www.w3.org/TR/shacl/#dfn-well-formed> SHACL property
>>>     path. A sequence path is the subject of exactly two triples in |G|.
>>>
>>>
>>> I wonder if there is any way to clarify this and position it as an 
>>> editorial change. I hope that this is arguable because of the 
>>> “symmetry”. All other paths explicitly exclude any “extraneous” 
>>> triples.
>>>
>>> What did the implementations do? How have they understood the spec?
>>>
>>> I think that this (ensuring that sequence paths only have list 
>>> members) is also something that could be potentially detected by 
>>> SHACL-SHACL.
>>>
>>> This would continue to disallow comments and other extraneous 
>>> triples in the paths, but I think it is perfectly OK. We don’t have 
>>> a requirement for supporting comments as part of the path 
>>> expressions. Instead, a comment could be associated directly with 
>>> the property shape itself.
>>>
>>>
>>>> On May 1, 2017, at 10:27 PM, Holger Knublauch 
>>>> <holger@topquadrant.com> wrote:
>>>>
>>>> Hi WG,
>>>>
>>>> I see in principle no technical problems with the changes that 
>>>> Peter suggests. I do however see serious process issues here. The 
>>>> path syntax has been in its current shape for several months, and 
>>>> he could have suggested changes earlier. Any change to such 
>>>> definitions now may introduce other regression bugs for which we 
>>>> are obviously out of time. More importantly, the motivations for 
>>>> the changes appear extremely weak to me, e.g. who has ever seen an 
>>>> rdf:List node that also has other triples than rdf:first and 
>>>> rdf:rest?! There are plenty of other ways of producing "misleading" 
>>>> shapes, including
>>>>
>>>> ex:s2 a sh:PropertyShape ;
>>>>   sh:targetNode ex:i ;
>>>>   sh:path [ rdfs:comment "zero or more ex:p" ;
>>>>             sh:inversePath ex:p ] ;
>>>>   sh:class ex:C .
>>>>
>>>> or
>>>>
>>>> ex:s2 a sh:NodeShape ;
>>>>   sh:targetNode ex:i ;
>>>>   sh:path [ rdfs:comment "inverse of ex:p" ;
>>>>             sh:inversePath ex:p ] ;
>>>>   sh:class ex:C .
>>>>
>>>> This all looks artificially constructed. Where to stop?! Given that 
>>>> this would be a formal change, we'd also need to publish a new CR.
>>>>
>>>> Holger
>>>>
>>>>
>>>>
>>>> -------- Forwarded Message --------
>>>> Subject:  formal objection to SHACL property path syntax
>>>> Resent-Date:  Mon, 01 May 2017 15:56:53 +0000
>>>> Resent-From:  public-rdf-shapes@w3.org
>>>> Date:  Mon, 1 May 2017 08:56:15 -0700
>>>> From:  Peter F. Patel-Schneider <pfpschneider@gmail.com>
>>>> To:  public-rdf-shapes@w3.org
>>>>
>>>>
>>>>
>>>> This is a formal objection to two aspects of the syntax of SHACL property
>>>> paths.
>>>>
>>>> The syntax for property paths in SHACL is both too liberal and too strict.
>>>> This leads to shapes that are unexpectedly well formed or not well formed.
>>>> Both of these are significant problems for users when writing and reading
>>>> SHACL shapes.  Interoperability problems will also arise from the too-strict
>>>> part of the syntax for property paths because the behaviour of SHACL
>>>> implementations is undefined for shapes that are not well formed.
>>>>
>>>>
>>>> On the too-liberal side, paths that are lists can also look like other kinds
>>>> of paths.  Here are two examples of well-formed SHACL shapes that should
>>>> instead not be well-formed.
>>>>
>>>> ex:s1 a sh:PropertyShape ;
>>>>    sh:targetNode ex:i ;
>>>>    sh:path [ rdf:first ex:p ; rdf:rest [ rdf:first ex:q ; rdf:rest rdf:nil ] ;
>>>>         sh:inversePath ex:q ] ;
>>>>    sh:class ex:C .
>>>>
>>>> ex:s2 a sh:PropertyShape ;
>>>>    sh:targetNode ex:i ;
>>>>    sh:path [ rdf:first ex:p ; rdf:rest [ rdf:first ex:q ; rdf:rest rdf:nil ] ;
>>>>         sh:inversePath ( ex:p ) ] ;
>>>>    sh:class ex:C .
>>>>
>>>> The first path looks ambiguous to users, being either a sequence path or an
>>>> inverse path.  Users will be confused about the correct meaning of this
>>>> path.
>>>>
>>>> The second path looks as if it is not well-formed because it contains a
>>>> sequence path that is too short.  Users will again be confused because they
>>>> will expect the path not to be well-formed when it is.
>>>>
>>>>
>>>> On the too-strict side, many kinds of paths cannot be the subject of extra
>>>> triples, even triples with predicates like rdfs:comment.  For example, the
>>>> following shape contains a path that is not well-formed.
>>>>
>>>> ex:s2 a sh:PropertyShape ;
>>>>    sh:targetNode ex:i ;
>>>>    sh:path [ rdfs:comment "inverse of ex:p" ;
>>>>           sh:inversePath ex:p ] ;
>>>>    sh:class ex:C .
>>>>
>>>> Users can easily write paths like the one above and will expect shapes
>>>> containing paths like these to have a well-defined meaning in SHACL.
>>>> However, the meaning of the above shape is undefined in SHACL and different
>>>> SHACL implementations can produce different results for these shapes without
>>>> signalling an error or warning, leading to silent interoperability problems
>>>> between SHACL implementations.
>>>>
>>>>
>>>>
>>>> The solution to this problem is to change the following syntax rules
>>>>
>>>>    path-metarule, path-non-recursive, path-predicate, path-sequence,
>>>>    path-alternative, path-inverse, path-zero-or-more, path-one-or-more, and
>>>>    path-zero-or-one
>>>>
>>>> to
>>>>
>>>> path-non-recursive A node p is not a well-formed SHACL property path if
>>>>        p is a blank node and any of the following rules require,
>>>>        directly or indirectly, determining whether p is a
>>>>        well-formed SHACL property path.
>>>>
>>>> path-metarule     A node is a well-formed SHACL property
>>>>     path if it satisfies exactly one of the following
>>>>     rules and if the node is a blank node it does not have a
>>>>     value for more than one of rdf:first or rdf:rest,
>>>>     sh:alternativePath, sh:inversePath, sh:zeroOrMorePath,
>>>>     sh:oneOrMorePath, and sh:zeroOrOnePath.
>>>>
>>>> path-predicate    A predicate path is any IRI.
>>>>
>>>> path-sequence    A sequence path is a blank node that is a SHACL list with
>>>>     at least two members and each member of the list is a
>>>>     well-formed SHACL property path.
>>>>
>>>> path-alternative  An alternative path is a blank node that has exactly one
>>>>     value for sh:alternativePath and that value is a SHACL
>>>>     list with at least two members and each member of the list
>>>>     is a well-formed SHACL property path.
>>>>
>>>> path-inverse      An inverse path is a blank node that has exactly one value
>>>>     for sh:inversePath and that value is a well-formed
>>>>     SHACL property path.
>>>>
>>>> path-zero-or-more A zero-or-more path is a blank node that has exactly one
>>>>     value for sh:zeroOrMorePath and that value is a
>>>>     well-formed SHACL property path.
>>>>
>>>> path-one-or-more  A one-or-more path is a blank node that has exactly one
>>>>     value for sh:oneOrMorePath and that value is a
>>>>     well-formed SHACL property path.
>>>>
>>>> path-zero-or-one A zero-or-one path is a blank node that has exactly one
>>>>    value for sh:zeroOrOnePath and that value is a
>>>>    well-formed SHACL property path.
>>>>
>>>> These changes to the syntax of SHACL result in a SHACL that is easier to
>>>> write, easier to understand, easier to generate, and with fewer
>>>> interoperability problems.
>>>>
>>>>
>>>>
>>>> Peter F. Patel-Schneider
>>>> Nuance Communications
>>>>
>>>
>>
>

Received on Wednesday, 3 May 2017 03:17:34 UTC