Re: Understanding Node vs Property Shapes and Property Paths

Hello Irene,

On Tue, Apr 21, 2020 at 3:59 PM Irene Polikoff <irene@topquadrant.com>
wrote:

>
>
> On Apr 21, 2020, at 3:37 PM, James Hudson <jameshudson3010@gmail.com>
> wrote:
>
> The SPARQL query which does what I want is:
>
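> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>
> SELECT DISTINCT ?s
> WHERE {
>     {
>         ?s ?p ?o .
>         FILTER NOT EXISTS {
>             ?s rdf:type* ?c .
>             FILTER(?c IN (rdfs:Class, rdf:Property) &&
>                    ?s NOT IN (rdfs:Class, rdf:Property))
>         }
>     }
> }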
>
>
> You can do all sort of things with shapes including complex expressions
> with sh:or, sh:not, etc.
>

My question is how...? That is what I am not seeing clearly yet with
SHACL, whereas I think I understand what SPARQL is doing and how.



> This query will return every subject whose rdf:type property path does not
> terminate in a rdfs:Class or rdf:Property. The subjects the SPARQL query
> will return are hr:missing, hr:typo, and hr:randomtype.
>
> 1. You require every resource that is a subject of any triples to have
> rdf:type that is either a class or a property. In which case hr:missing is
> correctly identified as missing a type
>
>
> This is not what I want because hr:Longer would not validate.
>
> There is no triple hr:Longer a rdfs:Class. There is hr:Longer a
> hr:Employee and hr:Employee a rdfs:Class. Therefore, there is a rdf:type
> property path which terminates in either a rdfs:Class or rdf:Property and
> hr:Longer is valid.
>
>
> You lost me here. Possibly, this requires more bandwidth than I have to
> discuss in e-mails
>

This is my train of thought. If I execute the following SPARQL query to
extract all of the triples defined in my data graph, I get the following
set of triples:

SELECT ?s ?p ?o
WHERE {
    ?s ?p ?o .
}

-----------------------------------------------------------------------------------------------------------------------------------------------
| s                                                          | p                  | o                                                         |
===============================================================================================================================================
| <http://learningsparql.com/ns/humanResources#freestanding> | sch:rangeIncludes  | sch:Text                                                  |
| <http://learningsparql.com/ns/humanResources#freestanding> | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#Another>      | rdf:type           | rdfs:Class                                                |
| <http://learningsparql.com/ns/humanResources#typo>         | rdfs:comment       | "some comment about typo"                                 |
| <http://learningsparql.com/ns/humanResources#typo>         | rdf:type           | rdfs:Classs                                               |
| <http://learningsparql.com/ns/humanResources#typo>         | rdfs:label         | "some label about typo"                                   |
| <http://learningsparql.com/ns/humanResources#nosuper>      | sch:rangeIncludes  | sch:Text                                                  |
| <http://learningsparql.com/ns/humanResources#nosuper>      | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#nosuper>      | sch:domainIncludes | <http://learningsparql.com/ns/humanResources#Uncreated>   |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdfs:comment       | "a good employee"                                         |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdf:type           | rdfs:Class                                                |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdfs:label         | "model"                                                   |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdfs:comment       | "some comment about randomtype"                           |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdf:type           | <http://learningsparql.com/ns/humanResources#invalidtype> |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdfs:label         | "some label about randomtype"                             |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdfs:comment       | "a good employee"                                         |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdf:type           | <http://learningsparql.com/ns/humanResources#Employee>    |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdfs:label         | "model"                                                   |
| <http://learningsparql.com/ns/humanResources#missing>      | rdfs:comment       | "some comment about missing"                              |
| <http://learningsparql.com/ns/humanResources#name>         | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#name>         | sch:domainIncludes | <http://learningsparql.com/ns/humanResources#Employee>    |
-----------------------------------------------------------------------------------------------------------------------------------------------

I do not see the triple hr:Longer a rdfs:Class in that list.

I do see hr:Longer a hr:Employee and hr:Employee a rdfs:Class in the list.

Following the rdf:type property path, hr:Longer -> hr:Employee ->
rdfs:Class. The rdf:type property path starting with hr:Longer terminates
in rdfs:Class.
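
To make that walk concrete, here is a minimal sketch in plain Python (not
SHACL or SPARQL). The RDF_TYPE dictionary and the type_chain helper are
hypothetical names used only for illustration; the dictionary is a hand-copied
subset of the rdf:type triples listed above, abbreviated with the hr: prefix,
and the helper assumes each node has at most one rdf:type value.

```python
# Hypothetical in-memory copy of the rdf:type triples from the data graph.
# hr:missing is absent on purpose: it has no rdf:type triple at all.
RDF_TYPE = {
    "hr:freestanding": ["rdf:Property"],
    "hr:Another": ["rdfs:Class"],
    "hr:typo": ["rdfs:Classs"],
    "hr:nosuper": ["rdf:Property"],
    "hr:Employee": ["rdfs:Class"],
    "hr:randomtype": ["hr:invalidtype"],
    "hr:Longer": ["hr:Employee"],
    "hr:name": ["rdf:Property"],
}


def type_chain(node):
    """Follow rdf:type from node until no further type is found.

    Assumes at most one rdf:type per node; stops if a node repeats.
    """
    chain = [node]
    while True:
        types = RDF_TYPE.get(chain[-1], [])
        if not types or types[0] in chain:
            return chain
        chain.append(types[0])


# prints: hr:Longer -> hr:Employee -> rdfs:Class
print(" -> ".join(type_chain("hr:Longer")))
```

The last element of the returned list is what I mean by "where the path
terminates".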

I am not sure how I can explain what I am thinking any better, so I hope
that was good enough.

Now, following what you said that sh:path [sh:zeroOrMorePath rdf:type]
emits three values hr:Longer (the zero), hr:Employee (1), rdfs:Class (2),
what I need is a way for SHACL to only consider the last value found
(where the rdf:type property path terminates) and ignore the rest.
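
As a rough illustration of why every value matters, the following plain
Python sketch models the value nodes of the two path forms and applies an
sh:in-style membership test to each of them. It is a simplified model under
my own assumptions, not how any particular SHACL engine is implemented; TYPES,
ALLOWED, one_or_more, zero_or_more and in_violations are hypothetical names.

```python
# Hypothetical two-triple graph: hr:Longer a hr:Employee . hr:Employee a rdfs:Class .
TYPES = {"hr:Longer": {"hr:Employee"}, "hr:Employee": {"rdfs:Class"}}
ALLOWED = {"rdfs:Class", "rdf:Property"}  # the sh:in list


def one_or_more(node):
    """Value nodes of rdf:type+ : everything reachable in one or more steps."""
    seen, stack = set(), [node]
    while stack:
        for t in TYPES.get(stack.pop(), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def zero_or_more(node):
    """Value nodes of rdf:type* : the focus node itself plus rdf:type+ ."""
    return {node} | one_or_more(node)


def in_violations(values):
    """Every value node outside the allowed list counts as a violation."""
    return sorted(v for v in values if v not in ALLOWED)


print(in_violations(zero_or_more("hr:Longer")))   # ['hr:Employee', 'hr:Longer']
print(in_violations(one_or_more("hr:Longer")))    # ['hr:Employee']
print(in_violations(zero_or_more("hr:Employee"))) # ['hr:Employee']
print(in_violations(one_or_more("hr:Employee")))  # []
```

In this model the membership test is applied to every value the path
delivers, which is why an intermediate node such as hr:Employee is flagged.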

Put another way, perhaps (?), sh:path [sh:zeroOrMorePath rdf:type] results
in three triples to be evaluated:

(0) hr:Longer a hr:Longer (???)
(1) hr:Longer a hr:Employee
(2) hr:Longer a rdfs:Class

I need a way to tell SHACL to ignore everything except for the last one (2)
and to try to validate it.
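
Stated differently, the check I am after is existential: at least one value
reachable through rdf:type must be rdfs:Class or rdf:Property, rather than
every value. The plain Python sketch below models that check over the subjects
of my data graph. SUBJECT_TYPES, ALLOWED and reachable are hypothetical names,
and the data is hand-copied from the triples above. SHACL's
sh:qualifiedValueShape with sh:qualifiedMinCount has a similar "at least N
values" flavor, but whether it fits this case is my assumption and has not
been confirmed in this thread.

```python
# Hypothetical map: subject -> its rdf:type values (empty set if it has none).
SUBJECT_TYPES = {
    "hr:freestanding": {"rdf:Property"},
    "hr:Another": {"rdfs:Class"},
    "hr:typo": {"rdfs:Classs"},
    "hr:nosuper": {"rdf:Property"},
    "hr:Employee": {"rdfs:Class"},
    "hr:randomtype": {"hr:invalidtype"},
    "hr:Longer": {"hr:Employee"},
    "hr:missing": set(),
    "hr:name": {"rdf:Property"},
}
ALLOWED = {"rdfs:Class", "rdf:Property"}


def reachable(node):
    """Nodes reachable from node via one or more rdf:type steps."""
    seen, stack = set(), [node]
    while stack:
        for t in SUBJECT_TYPES.get(stack.pop(), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


# A subject fails when none of the values it reaches is in the allowed list.
failing = sorted(s for s in SUBJECT_TYPES if not (reachable(s) & ALLOWED))
print(failing)  # ['hr:missing', 'hr:randomtype', 'hr:typo']
```

These are the same three subjects the SPARQL query returns.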

Regards,
James


> On Tue, Apr 21, 2020 at 3:12 PM Irene Polikoff <irene@topquadrant.com>
> wrote:
>
>>
>>
>> On Apr 21, 2020, at 2:37 PM, James Hudson <jameshudson3010@gmail.com>
>> wrote:
>>
>> Hello Irene,
>>
>> I neglected to add:
>>
>> Yes, the target picks every resource that is a subject of a triple. This
>> is what I want.
>>
>> I also want to make sure that every subject has a property path that
>> terminates in a rdf:type of either rdfs:Class or rdf:Property.
>>
>>
>> What path? Saying that a path terminates in something does not specify a
>> path.
>>
>>
>> It is either
>> 1. You require every resource that is a subject of any triples to have
>> rdf:type that is either a class or a property. In which case hr:missing is
>> correctly identified as missing a type OR
>> 2. You require that every resource that is used as a value of rdf:type
>> has a type that is either a class or a property. In which case, use
>> sh:targetObjectsOf rdf:type OR
>> 3. You want to ensure that some resources have a type that is a class or
>> a property. It is not clear to me which resources they are. If you know
>> what your criteria are, what resources you want to exclude and include, then
>> it can be defined.
>>
>>
>> To account for subjects like hr:missing, I believe that sh:path
>> [sh:zeroOrMorePath rdf:type] ; is what I should be using. This does
>> result in a validation error for hr:missing.
>>
>>
>> This means
>>
>> ?s rdf:type* ?type
>>
>> which when hr:Long is bound to ?s will return hr:Long, hr:Employee and
>> rdfs:Class. All of these values will be validated against the constraint
>> and hr:Long gives you an error because its type is not rdfs:Class. It is
>> hr:Employee.
>>
>> SHACL paths are the same as SPARQL paths
>> https://www.w3.org/TR/sparql11-property-paths/.
>>
>>
>> However, when using sh:path [sh:zeroOrMorePath rdf:type] ;,  hr:Employee
>> produces a validation error because SHACL does not look at what the
>> rdf:type of hr:Employee is. I believe this is because of the zero part of
>> sh:zeroOrMorePath. When looking at hr:Employee, it only checks to see if
>> hr:Employee is either a rdfs:Class or rdf:Property and, because it is not,
>> it generates a validation error.
>>
>> Using sh:path ( rdf:type rdf:type ) ; or sh:path  ( rdf:type
>> [sh:zeroOrMorePath rdf:type] ) or sh:path ( rdf:type [sh:oneOrMorePath
>> rdf:type] ) does not result in a validation error for hr:missing.
>>
>> Regards,
>> James
>>
>>
>> On Tue, Apr 21, 2020 at 2:11 PM Irene Polikoff <irene@topquadrant.com>
>> wrote:
>>
>>> Well, your target picks every resource that is a subject of a triple. I
>>> thought you wanted to make sure that they all have types. Since hr:missing
>>> does not have a type, you get a violation. That seems correct to me.
>>>
>>> If you simply wanted to say that any object in a triple with rdf:type
>>> predicate must itself have a type, then you do not need SPARQL based
>>> target. You could simply use sh:targetObjectsOf rdf:type.
>>>
>>> On Apr 21, 2020, at 1:39 PM, James Hudson <jameshudson3010@gmail.com>
>>> wrote:
>>>
>>> Hello Irene,
>>>
>>> Unfortunately, sh:path (rdf:type rdf:type); validates:
>>>
>>> hr:missing rdfs:comment "some comment about missing" .
>>>
>>> which does not have any value of rdf:type. This focus node should
>>> produce a validation error.
>>>
>>> I also believe that I would actually want ( rdf:type [sh:oneOrMorePath
>>> rdf:type] ) ;  as the chain could be longer than just two. However, this
>>> does not resolve the problems.
>>>
>>> I tried:
>>>
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix sch:  <http://schema.org/> .
>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>> @prefix ex:  <http://example.org/> .
>>>
>>> ex:ClassOrProperty
>>>     a sh:PropertyShape ;
>>>     sh:target [
>>>         a sh:SPARQLTarget ;
>>>         sh:select   """
>>>                     SELECT ?this
>>>                     WHERE {
>>>                         ?this ?p ?o .
>>>                     }
>>>                     """ ;
>>>     ] ;
>>>
>>>
>>>     sh:path     ( rdf:type [sh:oneOrMorePath rdf:type] ) ;
>>>     sh:in       ( rdfs:Class rdf:Property ) ;
>>>     sh:maxCount 1 ;                 # path-maxCount
>>>     sh:minCount 1 ;                 # PropertyShape-path-minCount
>>>
>>> .
>>>
>>>
>>> Hoping that I could say to validate where the property path terminates
>>> and that it has to contain at least one value found in sh:in, but this
>>> produced the unwanted validation error:
>>>
>>> Constraint Violation in MinCountConstraintComponent (
>>> http://www.w3.org/ns/shacl#MinCountConstraintComponent):
>>> Severity: sh:Violation
>>> Source Shape: ex:ClassOrProperty
>>> Focus Node: hr:Employee
>>> Result Path: ( rdf:type rdf:type )
>>>
>>>
>>> The only thing I need to be able to do is to validate where the property
>>> path terminates and that does not seem possible with SHACL. Based on that,
>>> I have to believe that my sh:path should be sh:path [sh:zeroOrMorePath
>>> rdf:type] ; to account for focus nodes which do not have a rdf:type
>>> defined. Unfortunately, SHACL requires that every node along a path be
>>> validated with the same test and cannot just validate where the property
>>> path terminates.
>>>
>>> Regards,
>>> James
>>>
>>>
>>> On Tue, Apr 21, 2020 at 1:18 PM Irene Polikoff <irene@topquadrant.com>
>>> wrote:
>>>
>>>> No, I meant sequence path without any zero or more or one or more.
>>>> Simply rdf:type/rdf:type as opposed to rdf:type+/rdf:type which doesn’t
>>>> make much sense.
>>>>
>>>> sh:path (rdf:type rdf:type);
>>>>
>>>> See https://www.w3.org/TR/shacl/#property-paths
>>>>
>>>> On Apr 21, 2020, at 12:56 PM, James Hudson <jameshudson3010@gmail.com>
>>>> wrote:
>>>>
>>>> Hello Irene,
>>>>
>>>> Thank you for your quick reply.
>>>>
>>>> If I try:
>>>>
>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>> @prefix sch:  <http://schema.org/> .
>>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>>> @prefix ex:  <http://example.org/> .
>>>>
>>>> ex:ClassOrProperty
>>>>     a sh:PropertyShape ;
>>>>     sh:target [
>>>>         a sh:SPARQLTarget ;
>>>>         sh:select   """
>>>>                     SELECT ?this
>>>>                     WHERE {
>>>>                         ?this ?p ?o .
>>>>                     }
>>>>                     """ ;
>>>>     ] ;
>>>>
>>>>
>>>>     sh:path     ( [sh:zeroOrMorePath rdf:type] rdf:type ) ;
>>>>     sh:in       ( rdfs:Class rdf:Property ) ;
>>>> .
>>>>
>>>>
>>>> which is what I think you mean by "rdf:type/rdf:type as the path", I
>>>> still get the following unexpected validation error:
>>>>
>>>> Constraint Violation in InConstraintComponent (
>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>> Severity: sh:Violation
>>>> Source Shape: ex:ClassOrProperty
>>>> Focus Node: hr:Longer
>>>> Value Node: hr:Employee
>>>> Result Path: ( [ sh:zeroOrMorePath rdf:type ] rdf:type )
>>>>
>>>>
>>>> By unexpected, I mean I do not want it to be considered a validation
>>>> error because the rdf:type property path terminates at rdfs:Class.
>>>>
>>>> When you say "zero or more paths will deliver values hr:Long,
>>>> hr:Employee, rdfs:Class," does that mean that the sh:in test will be
>>>> performed on the value of hr:Long (fail), hr:Employee (fail), and
>>>> rdfs:Class (pass)? Is it possible to have it validate only where the
>>>> property path terminates?
>>>>
>>>> Regards,
>>>> James
>>>>
>>>> On Tue, Apr 21, 2020 at 12:12 PM Irene Polikoff <irene@topquadrant.com>
>>>> wrote:
>>>>
>>>>> This looks correct.
>>>>>
>>>>> With data:
>>>>>
>>>>> hr:Long a hr:Employee.
>>>>> hr:Employee a rdfs:Class.
>>>>>
>>>>> If your focus node is hr:Long, zero or more paths will deliver values
>>>>> hr:Long, hr:Employee, rdfs:Class. One or more paths will deliver values
>>>>>  hr:Employee, rdfs:Class.
>>>>>
>>>>> You could try rdf:type/rdf:type as the path. This will get the type of
>>>>> a resource that is used as a type and ensure that it is rdfs:Class or
>>>>> rdf:Property.
>>>>>
>>>>> On Apr 21, 2020, at 11:39 AM, James Hudson <jameshudson3010@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Since people here have been so helpful in the past, I thought I would
>>>>> ask a few more questions.
>>>>>
>>>>> Background to this is my SO question at
>>>>> https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>>>
>>>>> The SO question has the data graph under consideration.
>>>>>
>>>>> In the book Validating RDF, it says:
>>>>>
>>>>> Node shapes declare constraints directly on a node. Property shapes
>>>>> declare constraints on the values associated with a node through a path.
>>>>>
>>>>>
>>>>> Based on this, I believe I want to use a Property Shape because I want
>>>>> to define a constraint on the value of the rdf:type path on a focus node.
>>>>> Is this correct?
>>>>>
>>>>> If I try the property shape:
>>>>>
>>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>>> @prefix sch:  <http://schema.org/> .
>>>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>>>> @prefix ex:   <http://example.org/> .
>>>>>
>>>>> ex:ClassOrProperty
>>>>>     a sh:PropertyShape ;
>>>>>     sh:target [
>>>>>         a sh:SPARQLTarget ;
>>>>>         sh:select   """
>>>>>                     SELECT ?this
>>>>>                     WHERE {
>>>>>                         ?this ?p ?o .
>>>>>                     }
>>>>>                     """ ;
>>>>>     ] ;
>>>>>
>>>>>
>>>>>     sh:path [sh:zeroOrMorePath rdf:type] ;
>>>>>     sh:in ( rdfs:Class rdf:Property ) ;
>>>>> .
>>>>>
>>>>>
>>>>> I get the unexpected validation error:
>>>>> (J)
>>>>>
>>>>> Constraint Violation in InConstraintComponent (
>>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>> Severity: sh:Violation
>>>>> Source Shape: ex:ClassOrProperty
>>>>> Focus Node: hr:Longer
>>>>> Value Node: hr:Employee
>>>>> Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>>
>>>>>
>>>>> The way I thought [sh:zeroOrMorePath rdf:type] ; would work is that
>>>>> it would consider the node hr:Longer and follow the rdf:type path through
>>>>> hr:Employee to where it terminates at rdfs:Class and then validate.
>>>>> However, it seems to stop one step away, sees that hr:Employee is not a
>>>>> rdfs:Class or rdf:Property and then generates a validation error.
>>>>>
>>>>> I get another unexpected validation error:
>>>>> (K)
>>>>>
>>>>> Constraint Violation in InConstraintComponent (
>>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>> Severity: sh:Violation
>>>>> Source Shape: ex:ClassOrProperty
>>>>> Focus Node: hr:Employee
>>>>> Value Node: hr:Employee
>>>>> Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>>
>>>>>
>>>>> I was thinking that the zero in sh:zeroOrMorePath would see hr:Employee
>>>>> a rdfs:Class ; and validate. Is it the case that the zero in sh:zeroOrMorePath
>>>>> causes a validation engine to compare a node against itself without
>>>>> following or looking for the path?
>>>>>
>>>>> I did try using sh:oneOrMorePath, but I received the validation error
>>>>> (J) again, but (K) did not show up. Is the reason why (K) did not show up
>>>>> because it was forced to see hr:Employee a rdfs:Class ; because of
>>>>> the one in sh:oneOrMorePath and could validate it?
>>>>>
>>>>> Perhaps a validation engine validates every node along the path and
>>>>> not just where the path terminates? If this is the case, is it possible to
>>>>> validate where the path terminates only?
>>>>>
>>>>> Needless to say, I am rather confused.
>>>>>
>>>>> Can anyone clear this up?
>>>>>
>>>>> Thank you,
>>>>> James
>>>>>

Received on Tuesday, 21 April 2020 21:00:27 UTC