Re: shapes-ISSUE-209 (What is a shape?): What is a shape [SHACL Spec] from Dimitris Kontokostas on 2016-11-28 (public-data-shapes-wg@w3.org from November 2016)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Mon, 28 Nov 2016 12:43:15 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a3HhSm1F91_TCKBMQA-+gua4MncfkR-5MXqOLx0fLFTtQ@mail.gmail.com>
Hi Karen,

I do not find the graph-based definition a good fit for SHACL.
When we say that a shape is an RDF node, we also imply that the shape is
part of a graph (the RDF graph that contains that node).
The node has arcs to and from it, and this is how we already define targets,
constraints, etc.
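
A rough way to picture this, as a minimal Python sketch rather than anything
taken from the spec: triples are plain tuples, prefixed names are plain
strings, and arcs_from is a hypothetical helper. The targets and constraints
of a shape are simply read off the arcs that leave the shape node.

```python
from collections import defaultdict

# A tiny shapes graph as (subject, predicate, object) tuples.
# Prefixed names are kept as plain strings for illustration only.
SHAPES_GRAPH = {
    ("ex:PersonShape", "rdf:type", "sh:Shape"),
    ("ex:PersonShape", "sh:targetClass", "ex:Person"),
    ("ex:PersonShape", "sh:property", "_:b1"),
    ("_:b1", "sh:predicate", "ex:name"),
    ("_:b1", "sh:minCount", 1),
}


def arcs_from(graph, node):
    """Group the arcs leaving `node` by predicate (hypothetical helper)."""
    out = defaultdict(set)
    for s, p, o in graph:
        if s == node:
            out[p].add(o)
    return dict(out)


# The shape is the node; everything else is found by following its arcs.
shape = "ex:PersonShape"
arcs = arcs_from(SHAPES_GRAPH, shape)
targets = arcs.get("sh:targetClass", set())
constraints = arcs.get("sh:property", set())
```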

The graph-based definition would also create confusion with respect to the
shapes graph.
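
For comparison, the node-based definition quoted further down in this thread
can be sketched in a few lines of Python. This is only an illustration under
simplifying assumptions: find_shapes is a hypothetical helper, "SHACL
instance" is reduced to a direct rdf:type sh:Shape triple (no subclass
reasoning), and the list-valued parameters sh:and, sh:or and sh:partition
are left out.

```python
# Target properties as enumerated in the proposed definition.
TARGET_PROPERTIES = {
    "sh:targetClass",
    "sh:targetNode",
    "sh:targetObjectsOf",
    "sh:targetSubjectsOf",
    "sh:target",
}

# Assumption: only the single-valued shape-expecting parameter is handled.
SHAPE_EXPECTING_PARAMETERS = {"sh:shape"}


def find_shapes(graph):
    """Return the nodes that count as shapes under the three conditions."""
    shapes = set()
    for s, p, o in graph:
        if p == "rdf:type" and o == "sh:Shape":
            shapes.add(s)  # condition 1: (direct) instance of sh:Shape
        elif p in SHAPE_EXPECTING_PARAMETERS:
            shapes.add(o)  # condition 2: value of a shape-expecting parameter
        elif p in TARGET_PROPERTIES:
            shapes.add(s)  # condition 3: has a value for a target property
    return shapes


EXAMPLE_GRAPH = {
    ("ex:PersonShape", "rdf:type", "sh:Shape"),
    ("ex:PersonShape", "sh:targetClass", "ex:Person"),
    ("ex:PersonShape", "sh:property", "_:b1"),
    ("_:b1", "sh:predicate", "ex:worksFor"),
    ("_:b1", "sh:shape", "ex:OrganizationShape"),
    ("ex:UntypedShape", "sh:targetNode", "ex:MyInstance"),
}

shapes = find_shapes(EXAMPLE_GRAPH)
```

Here ex:OrganizationShape is found as a shape even though no triple has it
as subject, and ex:UntypedShape is found through its target alone.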

On Mon, Nov 28, 2016 at 2:28 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

>
>
> On 28/11/2016 8:33, Karen Coyle wrote:
>
>> There is a simple solution to this, and it follows in part the example of
>> the Annotations Working Group. Their spec defines an Annotation:
>>
>> ***
>> An Annotation is a rooted, directed graph that represents a relationship
>> between resources.
>> There are two primary types of resource that participate in this
>> relationship, Bodies and Targets.
>> Annotations have 0 or more Bodies.
>> Annotations have 1 or more Targets.
>> ****
>>
>> All that needs to be done is to define "shape" as a graph whose root is a
>> subject of type sh:Shape, and which has 0 or more targets and 1 or more
>> constraints.
>>
>
> This would not be correct. A shape doesn't need one or more constraints.
> sh:Shape is a subclass of sh:Constraint (which makes every shape
> automatically also a constraint), but this doesn't "do" anything by itself.
> Also the zero or more doesn't add anything. (Shapes may also have labels
> etc).
>
> As always, the more complexity we add here, the more likely it is that
> someone will find problems with it. And pointing out issues is trivial
> compared to getting everything right. That's why I would leave out
> anything that isn't formally
> needed. We can leave such explanatory prose to other documents, and if
> these other documents prefer to understand a shape as a collection of
> specific triples then fine.
>
> Holger
>
>
>
>
>> Really, that's all that is needed.
>>
>> kc
>> p.s. I like the idea of the shape having a "root" node. I'm not sure if
>> something needs to be said about targets, which are also of sh:Shape - is
>> it necessary to say that they are not what is meant when the spec talks
>> about "shapes"?
>>
>> On 11/27/16 7:16 AM, Karen Coyle wrote:
>>
>>>
>>>
>>> On 11/26/16 3:08 PM, Holger Knublauch wrote:
>>>
>>>>
>>>>
>>>> On 25/11/2016 5:41, Irene Polikoff wrote:
>>>>
>>>>> I took the initial question to be "when is a resource (or a node) a
>>>>> shape?". And not to be "what are all the triples that describe a
>>>>> shape?".
>>>>>
>>>>> To me, these are two completely different questions. Colloquially
>>>>> speaking, a shape may be said to be a set of triples. But writing a spec
>>>>> requires us to be precise. And precisely speaking, a shape is not a set
>>>>> of triples; it is a node. Information about it is
>>>>> described/specified/defined using a set of triples.
>>>>>
>>>>> Thus, I would recommend closing the first question as resolved.
>>>>>
>>>>> As for the second question, why does it need to be answered? A more
>>>>> meaningful question may be "when is a shapes graph sufficiently
>>>>> complete to be able to process it and what should a SHACL processor do
>>>>> when a shapes graph doesn't have all the necessary information"?
>>>>>
>>>>> For example, let's say a shapes graph contains only the following
>>>>> triples:
>>>>>
>>>>> ex:PersonShape
>>>>>         a sh:Shape ;
>>>>>         sh:targetClass ex:Person ;
>>>>>         sh:property [
>>>>>             sh:predicate ex:worksFor ;
>>>>>             sh:shape ex:OrganizationShape ;
>>>>>         ] .
>>>>>
>>>>> This, to me, would be insufficient to do anything with since to
>>>>> validate data against it, we need to have a description of
>>>>> ex:OrganizationShape. What should happen in such cases?
>>>>>
>>>>
>>>> This case is covered by the spec. It's simply a shape without
>>>> constraints, i.e. every node conforms to it. Yet it's syntactically
>>>> valid because the expected type of sh:shape is sh:Shape.
>>>>
>>>>
>>>>>
>>>>> Other examples mentioned by Karen:
>>>>>
>>>>> ex:ExampleShapeWithPropertyConstraints
>>>>>         a sh:Shape ;
>>>>>         sh:property [
>>>>>                 sh:predicate ex:email ;
>>>>>                 sh:name "e-mail" ;
>>>>>                 sh:description "We need at least one email value" ;
>>>>>                 sh:minCount 1 ;
>>>>>         ] ;
>>>>>         sh:property [
>>>>>                 sh:path (ex:knows ex:email) ;
>>>>>                 sh:name "Friend's e-mail" ;
>>>>>                 sh:description "We need at least one email for everyone you know" ;
>>>>>                 sh:minCount 1 ;
>>>>>         ] .
>>>>>
>>>>> There is only one shape - ex:ExampleShapeWithPropertyConstraints. If
>>>>> this is all the content of a shapes graph, it is fine. All the
>>>>> information needed for validation is here.
>>>>>
>>>>> ex:MyShape
>>>>>         a sh:Shape ;
>>>>>         sh:targetNode ex:MyInstance ;
>>>>>         sh:property [
>>>>>                 # Violations of sh:minCount and sh:datatype are
>>>>>                 # produced as warnings
>>>>>                 sh:predicate ex:myProperty ;
>>>>>                 sh:minCount 1 ;
>>>>>                 sh:datatype xsd:string ;
>>>>>                 sh:severity sh:Warning ;
>>>>>         ] .
>>>>>
>>>>> One shape - ex:MyShape. It refers to a target node. The target node is
>>>>> not a shape; it identifies a node in a data graph that is to be validated
>>>>> against a shape, so I am not sure what the question is.
>>>>>
>>>>> ex:PersonShape
>>>>>         a sh:Shape ;
>>>>>         sh:targetClass ex:Person ;
>>>>>         sh:property ex:PersonShape-name .
>>>>>
>>>>> ex:PersonShape-name
>>>>>         a sh:PropertyConstraint ;
>>>>>         sh:predicate ex:name ;
>>>>>         sh:minCount 1 ;
>>>>>         sh:deactivated true .
>>>>>
>>>>> There is only one shape - ex:PersonShape. All the information
>>>>> necessary for validation is present and, in fact, there would be
>>>>> automatic conformance since the only constraint is disabled. However,
>>>>> if a shapes graph only contained the following, validation would not
>>>>> be possible (I think):
>>>>>
>>>>> ex:PersonShape
>>>>>         a sh:Shape ;
>>>>>         sh:targetClass ex:Person ;
>>>>>         sh:property ex:PersonShape-name .
>>>>>
>>>>> So, again, what should happen in these cases?
>>>>>
>>>>> All the examples above are for a single shape. In some of the examples
>>>>> information about it is complete enough to validate data against the
>>>>> shape. In others, it is not. We should also consider the fact that a
>>>>> shapes graph can and often will contain multiple shapes and some
>>>>> shapes may have sufficient description to validate against and others
>>>>> may not. Also, some shapes may have some information that can be
>>>>> checked and some that can't be. For example, given the shapes graph
>>>>> below we would know enough to check conformance of values of ex:email,
>>>>> but not know enough to check values of ex:worksFor.
>>>>>
>>>>> ex:PersonShape
>>>>>         a sh:Shape ;
>>>>>         sh:targetClass ex:Person ;
>>>>>         sh:property [
>>>>>             sh:predicate ex:worksFor ;
>>>>>             sh:shape ex:OrganizationShape ;
>>>>>         ] ;
>>>>>
>>>>>         sh:property [
>>>>>                 sh:path (ex:knows ex:email) ;
>>>>>                 sh:name "Friend's e-mail" ;
>>>>>                 sh:description "We need at least one email for everyone you know" ;
>>>>>                 sh:minCount 1 ;
>>>>>         ] .
>>>>>
>>>>> So, one proposal may be as follows:
>>>>>
>>>>> If a shapes graph contains any shapes that are insufficiently
>>>>> specified, processing doesn't happen and the SHACL engine returns an
>>>>> error.
>>>>>
>>>>
>>>> The spec mentions several conditions under which a shapes graph is
>>>> invalid. These are typically written as "a shape must ...".
>>>>
>>>> On the more general topic of node vs triples, I find this rather
>>>> philosophical. The spec is pretty clear about what happens in each
>>>> context. Which triples "belong" to a shape follows from that, but is
>>>> not valuable in its own right.
>>>>
>>>
>>> The spec is not "pretty clear" and this is not "philosophical" - it has
>>> to be clear *to all* what is being described in the spec. Insisting that
>>> the spec is ok when others are saying it is not is one of the things
>>> that is making the progress very slow. It is incredibly generous of
>>> people to be putting time into trying to make it actually clear, but,
>>> personally, I'm running out of steam. However, note that DCMI will not
>>> approve a highly flawed spec.
>>>
>>> kc
>>>
>>>
>>>> Holger
>>>>
>>>>
>>>>
>>>>> Another proposal may be:
>>>>>
>>>>> Data is processed against shapes that are sufficiently specified.
>>>>> A warning is issued regarding shapes that are insufficiently specified
>>>>> and are thus ignored.
>>>>>
>>>>> Yet another proposal may be:
>>>>>
>>>>> Data is processed against shapes that are sufficiently specified.
>>>>> A warning is issued regarding shapes that are insufficiently specified.
>>>>> The SHACL processor will perform as much validation as it is able to
>>>>> against insufficiently specified shapes.
>>>>>
>>>>> Of course, we would need to define what it means to be "insufficiently
>>>>> specified".
>>>>>
>>>>>
>>>>> On Thu, Nov 24, 2016 at 12:29 PM, Karen Coyle <kcoyle@kcoyle.net>
>>>>> wrote:
>>>>>
>>>>>     Actually, I don't think this solves the problem that came up at
>>>>>     the meeting. As we discussed in the meeting, the conflict is
>>>>>     between a node, which is a single IRI, and a graph, which is a set
>>>>>     of triples. Throughout the document, the term "shape" is used to
>>>>>     refer to more than a single IRI.
>>>>>
>>>>>     The statement below could be used for how a shape is identified
>>>>>     (although I think we should discuss that further) but does not
>>>>>     define how one finds the finite boundaries of the set of triples
>>>>>     that is used as an instrument to define the validation rules that
>>>>>     will be applied to a data graph.
>>>>>
>>>>>     Something that was said in the meeting made me think that defining
>>>>>     where a shape ends is as important as defining where it begins.
>>>>>     (Note that in this case I am speaking of a shape as a set of
>>>>>     triples, not a node - we know where a node ends, because it is a
>>>>>     single point.)
>>>>>
>>>>>     In an example like this (taken from the spec), I assume that this
>>>>>     represents a single shape:
>>>>>
>>>>>     ex:ExampleShapeWithPropertyConstraints
>>>>>             a sh:Shape ;
>>>>>             sh:property [
>>>>>                     sh:predicate ex:email ;
>>>>>                     sh:name "e-mail" ;
>>>>>                     sh:description "We need at least one email value" ;
>>>>>                     sh:minCount 1 ;
>>>>>             ] ;
>>>>>             sh:property [
>>>>>                     sh:path (ex:knows ex:email) ;
>>>>>                     sh:name "Friend's e-mail" ;
>>>>>                     sh:description "We need at least one email for everyone you know" ;
>>>>>                     sh:minCount 1 ;
>>>>>             ] .
>>>>>
>>>>>     Where there is a target, which is also a shape, is this one or two
>>>>>     shapes? And if two, what are the boundaries of each?
>>>>>
>>>>>     ex:MyShape
>>>>>             a sh:Shape ;
>>>>>             sh:targetNode ex:MyInstance ;
>>>>>             sh:property [
>>>>>                     # Violations of sh:minCount and sh:datatype are
>>>>>                     # produced as warnings
>>>>>                     sh:predicate ex:myProperty ;
>>>>>                     sh:minCount 1 ;
>>>>>                     sh:datatype xsd:string ;
>>>>>                     sh:severity sh:Warning ;
>>>>>             ] .
>>>>>
>>>>>     The following is an example of the case that I believe was
>>>>>     intended at the meeting. The question is whether this is one shape
>>>>>     or two? And if it is two, how is that distinguished from the shape
>>>>>     immediately above that has a target?
>>>>>
>>>>>     ex:PersonShape
>>>>>             a sh:Shape ;
>>>>>             sh:targetClass ex:Person ;
>>>>>             sh:property ex:PersonShape-name .
>>>>>
>>>>>     ex:PersonShape-name
>>>>>             a sh:PropertyConstraint ;
>>>>>             sh:predicate ex:name ;
>>>>>             sh:minCount 1 ;
>>>>>             sh:deactivated true .
>>>>>
>>>>>     If this seems petty, remember that throughout, the document refers
>>>>>     to a thing called "shape" and all of the understanding of the
>>>>>     document depends on the reader understanding exactly what that
>>>>>     means.
>>>>>
>>>>>     kc
>>>>>
>>>>>     On 11/23/16 7:34 PM, Holger Knublauch wrote:
>>>>>
>>>>>         Done.
>>>>>
>>>>>         Thanks,
>>>>>         Holger
>>>>>
>>>>>
>>>>>         On 24/11/2016 13:32, Irene Polikoff wrote:
>>>>>
>>>>>             I suggest changing
>>>>>
>>>>>             <A shape can be a node in a shapes graph. A node is a
>>>>>             shape if and only if it fulfills either of the following
>>>>>             conditions in the shapes graph:>
>>>>>
>>>>>             to
>>>>>
>>>>>             <A shape is a node in a shapes graph that fulfills
>>>>>             either of the following conditions:>
>>>>>
>>>>>             On Wed, Nov 23, 2016 at 7:48 PM, Holger Knublauch
>>>>>             <holger@topquadrant.com> wrote:
>>>>>
>>>>>                 The current definition in 2.1 reads
>>>>>
>>>>>                 A shape can be a node in a shapes graph that is a
>>>>>                 SHACL instance of |sh:Shape|; or it can be a node so
>>>>>                 that the expected type of the node is |sh:Shape|, or
>>>>>                 a node that has a value for a target property such as
>>>>>                 |sh:targetClass| in the shapes graph.
>>>>>
>>>>>                 These are all (3) ways of how shapes are identified. I
>>>>>                 have just added some precision based on the newly
>>>>>                 introduced term shape-expecting constraint parameters,
>>>>>                 and explicitly enumerated the target properties. The
>>>>>                 definition now reads
>>>>>
>>>>>                 A shape can be a node in a shapes graph. A node is a
>>>>>                 shape if and only if it fulfills either of the
>>>>>                 following conditions in the shapes graph:
>>>>>
>>>>>                   * the node is a SHACL instance of |sh:Shape|
>>>>>                   * the node has the expected type |sh:Shape|, which
>>>>>                     is the case if it is used as a value of
>>>>>                     shape-expecting constraint parameters such as
>>>>>                     |sh:shape| (in the case of the list-valued
>>>>>                     parameters |sh:and|, |sh:or| and |sh:partition| it
>>>>>                     must be a member of the corresponding lists)
>>>>>                   * the node has a value for any of the target
>>>>>                     properties |sh:targetClass|, |sh:targetNode|,
>>>>>                     |sh:targetObjectsOf|, |sh:targetSubjectsOf| and
>>>>>                     |sh:target|
>>>>>
>>>>>
>>>>>                 Change:
>>>>>
>>>>>                 https://github.com/w3c/data-shapes/commit/bec7b6852529acc80954dbc38cf4e435861238a2
>>>>>
>>>>>
>>>>>                 I'd appreciate if WG members could double check this
>>>>>                 definition. Meanwhile I have turned the change above
>>>>>                 into a PROPOSAL for a future meeting:
>>>>>
>>>>>                 https://www.w3.org/2014/data-shapes/wiki/Proposals#ISSUE-209:_What_is_a_shape
>>>>>
>>>>>
>>>>>                 Thanks,
>>>>>                 Holger
>>>>>
>>>>>
>>>>>
>>>>>                 On 24/11/2016 9:49, Irene Polikoff wrote:
>>>>>
>>>>>                     I believe the question is "How do I know that a
>>>>>                     node is a shape?". The spec says that it is
>>>>>                     "typically" a SHACL instance of sh:Shape. This is
>>>>>                     one way, but not the definitive way (because of
>>>>>                     "typically") to determine that something is a
>>>>>                     shape.
>>>>>
>>>>>                     What are other ways? E.g., any subject of a triple
>>>>>                     with one of the SHACL target or constraint
>>>>>                     predicates is a shape.
>>>>>
>>>>>                     On Sun, Nov 20, 2016 at 3:58 PM, RDF Data Shapes
>>>>>                     Working Group Issue Tracker <sysbot+tracker@w3.org>
>>>>>                     wrote:
>>>>>
>>>>>                         shapes-ISSUE-209 (What is a shape?): What is a
>>>>>                         shape [SHACL Spec]
>>>>>
>>>>>                         http://www.w3.org/2014/data-shapes/track/issues/209
>>>>>
>>>>>                         Raised by: Karen Coyle
>>>>>                         On product: SHACL Spec
>>>>>
>>>>>                         Peter's mail:
>>>>>
>>>>>                         https://lists.w3.org/Archives/Public/public-rdf-shapes/2016Oct/0029.html
>>>>>
>>>>>
>>>>>                         "Just what are shapes?
>>>>>
>>>>>                         The terminology section says:
>>>>>
>>>>>                         "Shape
>>>>>                         A shape is a node in a shapes graph that is
>>>>>                         typically a SHACL instance of sh:Shape. A shape
>>>>>                         provides a collection of targets, filters,
>>>>>                         constraints and parameters of constraint
>>>>>                         components that specify how a data graph is
>>>>>                         validated against the shape. Shapes can also
>>>>>                         provide non-validating information, such as
>>>>>                         labels and comments."
>>>>>
>>>>>                         Section 2 says:
>>>>>
>>>>>                         "Shapes define constraints that a set of focus
>>>>>                         nodes can be validated against."
>>>>>
>>>>>                         This doesn't, however, provide guidance in
>>>>>                         determining what the shapes in a shapes graph
>>>>>                         are."
>>>>>
>>>>>                         (more in the email)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     --
>>>>>     Karen Coyle
>>>>>     kcoyle@kcoyle.net http://kcoyle.net
>>>>>     m: 1-510-435-8234
>>>>>     skype: kcoylenet/+1-510-984-3600
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org,
http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT
Received on Monday, 28 November 2016 10:44:21 UTC