Re: Understanding Node vs Property Shapes and Property Paths

> On Apr 21, 2020, at 3:37 PM, James Hudson <jameshudson3010@gmail.com> wrote:
> 
> Hello Irene,
> 
> What path?  sh:path [sh:zeroOrMorePath rdf:type] ;

A path is used to reach some values.

As I explained, this path, given your data and hr:Long as a starting point, reaches 3 values that are then validated: hr:Long, hr:Employee, rdfs:Class.

> 
> I believe what this path resolves to is either:
> 
> 1. nothing or the subject itself if a subject does not have a rdf:type, as with the case of hr:missing.
The subject itself - hr:Long
> 2. rdf:type/rdf:type* if the subject does have a rdf:type
hr:Employee because hr:Long rdf:type hr:Employee.
rdfs:Class because hr:Long rdf:type hr:Employee and hr:Employee rdf:type rdfs:Class.
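
To make the reached values concrete, here is a minimal Python sketch. It is not how any SHACL engine is implemented; it assumes only the two triples from earlier in this thread (hr:Long a hr:Employee, hr:Employee a rdfs:Class), and the names TYPE_EDGES, path_values and violations are made up for illustration.

```python
# Hypothetical stand-in for the data graph: rdf:type edges only.
TYPE_EDGES = {
    "hr:Long": ["hr:Employee"],
    "hr:Employee": ["rdfs:Class"],
}

ALLOWED = {"rdfs:Class", "rdf:Property"}  # the sh:in list


def path_values(focus, include_zero):
    """Nodes reached from focus by rdf:type* (include_zero=True) or rdf:type+."""
    reached = []
    seen = set()
    stack = list(TYPE_EDGES.get(focus, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        reached.append(node)
        stack.extend(TYPE_EDGES.get(node, []))
    # The "zero" step contributes the focus node itself as a value.
    if include_zero and focus not in seen:
        reached.insert(0, focus)
    return reached


def violations(focus, include_zero):
    """Every reached value is checked against sh:in, not only the last one."""
    return [v for v in path_values(focus, include_zero) if v not in ALLOWED]


zero_or_more = path_values("hr:Long", include_zero=True)
one_or_more = path_values("hr:Long", include_zero=False)
print(zero_or_more)
print(one_or_more)
print(violations("hr:Long", include_zero=True))
```

In this sketch the zero-or-more case fails on hr:Long and hr:Employee, because each value on the path is tested, while the one-or-more case starting at hr:Employee reaches only rdfs:Class and passes.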
> 
> The SPARQL query which does what I want is:
> 
> SELECT DISTINCT ?s
> WHERE {
>     {
>         ?s ?p ?o .
>         FILTER NOT EXISTS {
>             ?s rdf:type* ?c .
>              FILTER(?c IN (rdfs:Class, rdf:Property) && ?s NOT IN (rdfs:Class, rdf:Property) )
>         }
>     }
> }
> 
You can do all sorts of things with shapes, including complex expressions with sh:or, sh:not, etc.

> This query will return every subject whose rdf:type property path does not terminate in a rdfs:Class or rdf:Property. The subjects the SPARQL query will return are hr:missing, hr:typo, and hr:randomtype.
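
For comparison, the check that description amounts to can be sketched outside SHACL in plain Python. This is only a sketch under assumptions: the triples below are hypothetical (just the hr:missing comment and the hr:Longer / hr:Employee typing are mentioned in this thread; the full data graph is in the SO question), the names TRIPLES, type_closure and failing_subjects are invented, and it does not reproduce the ?s NOT IN clause of the query.

```python
# Hypothetical triples (subject, predicate, object); not the full data graph.
TRIPLES = [
    ("hr:Longer", "rdf:type", "hr:Employee"),
    ("hr:Employee", "rdf:type", "rdfs:Class"),
    ("hr:missing", "rdfs:comment", "some comment about missing"),
]
TERMINALS = {"rdfs:Class", "rdf:Property"}


def type_closure(node):
    """All nodes reachable from node via zero or more rdf:type edges."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for s, p, o in TRIPLES:
            if s == current and p == "rdf:type" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def failing_subjects():
    """Subjects whose rdf:type* closure never reaches a terminal."""
    subjects = sorted({s for s, _, _ in TRIPLES})
    return [s for s in subjects if not (type_closure(s) & TERMINALS)]


print(failing_subjects())
```

The difference from a property shape is that only membership of rdfs:Class or rdf:Property somewhere in the closure matters here; the intermediate nodes are not individually tested.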
> 
> 1. You require every resource that is a subject of any triples to have rdf:type that is either a class or a property. In which case hr:missing is correctly identified as missing a type
> 
> This is not what I want because hr:Longer would not validate. 
> 
> There is no triple hr:Longer a rdfs:Class. There is hr:Longer a hr:Employee and hr:Employee a rdfs:Class. Therefore, there is a rdf:type property path which terminates in either a rdfs:Class or rdf:Property and hr:Longer is valid.

You lost me here. Possibly, this requires more bandwidth than I have to discuss in e-mails.
> 
> 2. You require that every resource that is used as a value of rdf:type has a type that is either a class or a property. In which case, use sh:targetObjectsOf rdf:type
> 
> This is not what I want because hr:missing would not generate a validation error because there is no triple containing hr:missing as a subject with rdf:type as a predicate.
> 
> 3. You want to ensure that some resources have a type that is a class or a property. It is not clear to me which resources they are. If you know what your criteria is, what resources you want to exclude and include, then it can be defined.
> 
> This is not what I want because all (not some) subjects must have a rdf:type property path which terminates in either rdfs:Class or rdf:Property.
> 
> I apologize if I am being dense, but this is harder than I expected to wrap my head around.
> 
> Regards,
> James
> 
> 
> 
> On Tue, Apr 21, 2020 at 3:12 PM Irene Polikoff <irene@topquadrant.com> wrote:
> 
> 
>> On Apr 21, 2020, at 2:37 PM, James Hudson <jameshudson3010@gmail.com> wrote:
>> 
>> Hello Irene,
>> 
>> I neglected to add:
>> 
>> Yes, the target picks every resource that is a subject of a triple. This is what I want.
>> 
>> I also want to make sure that every subject has a property path that terminates in a rdf:type of either rdfs:Class or rdf:Property.
> 
> What path? Saying that a path terminates in something does not specify a path. 
> 
> 
> It is either 
> 1. You require every resource that is a subject of any triples to have rdf:type that is either a class or a property. In which case hr:missing is correctly identified as missing a type OR 
> 2. You require that every resource that is used as a value of rdf:type has a type that is either a class or a property. In which case, use sh:targetObjectsOf rdf:type OR
> 3. You want to ensure that some resources have a type that is a class or a property. It is not clear to me which resources they are. If you know what your criteria is, what resources you want to exclude and include, then it can be defined.
>> 
>> To account for subjects like hr:missing, I believe that sh:path [sh:zeroOrMorePath rdf:type] ; is what I should be using. This does result in a validation error for hr:missing. 
> 
> This means
> 
> ?s rdf:type* ?type
> 
> which when hr:Long is bound to ?s will return hr:Long, hr:Employee and rdfs:Class. All of these values will be validated against the constraint and hr:Long gives you an error because its type is not rdfs:Class. It is hr:Employee.
> 
> SHACL paths are the same as SPARQL paths (https://www.w3.org/TR/sparql11-property-paths/).
>> 
>> 
>> However, when using sh:path [sh:zeroOrMorePath rdf:type] ;,  hr:Employee produces a validation error because SHACL does not look at what the rdf:type of hr:Employee is. I believe this is because of the zero part of sh:zeroOrMorePath. When looking at hr:Employee, it only checks to see if hr:Employee is either a rdfs:Class or rdf:Property and, because it is not, it generates a validation error.
>> 
>> Using sh:path ( rdf:type rdf:type ) ; or sh:path ( rdf:type [sh:zeroOrMorePath rdf:type] ) or sh:path ( rdf:type [sh:oneOrMorePath rdf:type] ) does not result in a validation error for hr:missing.
>> 
>> Regards,
>> James
>> 
>> 
>> On Tue, Apr 21, 2020 at 2:11 PM Irene Polikoff <irene@topquadrant.com> wrote:
>> Well, your target picks every resource that is a subject of a triple. I thought you wanted to make sure that they all have types. Since hr:missing does not have a type, you get a violation. That seems correct to me.
>> 
>> If you simply wanted to say that any object in a triple with rdf:type predicate must itself have a type, then you do not need SPARQL based target. You could simply use sh:targetObjectsOf rdf:type.
>> 
>>> On Apr 21, 2020, at 1:39 PM, James Hudson <jameshudson3010@gmail.com> wrote:
>>> 
>>> Hello Irene,
>>> 
>>> Unfortunately, sh:path (rdf:type rdf:type); validates:
>>> 
>>> hr:missing rdfs:comment "some comment about missing" .
>>> 
>>> which does not have any value of rdf:type. This focus node should produce a validation error.
>>> 
>>> I also believe that I would actually want ( rdf:type [sh:oneOrMorePath rdf:type] ) ; as the chain could be longer than just two. However, this does not resolve the problems.
>>> 
>>> I tried:
>>> 
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix sch:  <http://schema.org/> .
>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>> @prefix ex:   <http://example.org/> .
>>> 
>>> ex:ClassOrProperty
>>>     a sh:PropertyShape ;
>>>     sh:target [
>>>         a sh:SPARQLTarget ;
>>>         sh:select   """
>>>                     SELECT ?this
>>>                     WHERE {
>>>                         ?this ?p ?o .
>>>                     }
>>>                     """ ;
>>>     ] ;
>>> 
>>> 
>>>     sh:path     ( rdf:type [sh:oneOrMorePath rdf:type] ) ;
>>>     sh:in       ( rdfs:Class rdf:Property ) ;
>>>     sh:maxCount 1 ;                 # path-maxCount
>>>     sh:minCount 1 ;                 # PropertyShape-path-minCount
>>> 
>>> .
>>> 
>>> I was hoping this would validate only where the property path terminates, and require at least one value from sh:in there, but it produced the unwanted validation error:
>>> 
>>> Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
>>>  Severity: sh:Violation
>>>  Source Shape: ex:ClassOrProperty
>>>  Focus Node: hr:Employee
>>>  Result Path: ( rdf:type rdf:type )
>>> 
>>> The only thing I need to be able to do is to validate where the property path terminates and that does not seem possible with SHACL. Based on that, I have to believe that my sh:path should be sh:path [sh:zeroOrMorePath rdf:type] ; to account for focus nodes which do not have a rdf:type defined. Unfortunately, SHACL requires that every node along a path be validated with the same test and cannot just validate where the property path terminates.
>>> 
>>> Regards,
>>> James
>>> 
>>> 
>>> On Tue, Apr 21, 2020 at 1:18 PM Irene Polikoff <irene@topquadrant.com> wrote:
>>> No, I meant sequence path without any zero or more or one or more. Simply rdf:type/rdf:type as opposed to rdf:type+/rdf:type which doesn’t make much sense.
>>> 
>>> sh:path (rdf:type rdf:type);
>>> 
>>> See https://www.w3.org/TR/shacl/#property-paths
>>> 
>>>> On Apr 21, 2020, at 12:56 PM, James Hudson <jameshudson3010@gmail.com> wrote:
>>>> 
>>>> Hello Irene,
>>>> 
>>>> Thank you for your quick reply.
>>>> 
>>>> If I try:
>>>> 
>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>> @prefix sch:  <http://schema.org/> .
>>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>>> @prefix ex:   <http://example.org/> .
>>>> 
>>>> ex:ClassOrProperty
>>>>     a sh:PropertyShape ;
>>>>     sh:target [
>>>>         a sh:SPARQLTarget ;
>>>>         sh:select   """
>>>>                     SELECT ?this
>>>>                     WHERE {
>>>>                         ?this ?p ?o .
>>>>                     }
>>>>                     """ ;
>>>>     ] ;
>>>> 
>>>> 
>>>>     sh:path     ( [sh:zeroOrMorePath rdf:type] rdf:type ) ;
>>>>     sh:in       ( rdfs:Class rdf:Property ) ;
>>>> .
>>>> 
>>>> which is what I think you mean by "rdf:type/rdf:type as the path", I still get the following unexpected validation error:
>>>> 
>>>> Constraint Violation in InConstraintComponent (http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>  Severity: sh:Violation
>>>>  Source Shape: ex:ClassOrProperty
>>>>  Focus Node: hr:Longer
>>>>  Value Node: hr:Employee
>>>>  Result Path: ( [ sh:zeroOrMorePath rdf:type ] rdf:type )
>>>> 
>>>> By unexpected, I mean I do not want it to be considered a validation error because the rdf:type property path terminates at rdfs:Class.
>>>> 
>>>> When you say "zero or more paths will deliver values hr:Long, hr:Employee, rdfs:Class," does that mean that the sh:in test will be performed on the value of hr:Long (fail), hr:Employee (fail), and rdfs:Class (pass)? Is it possible to have it validate only where the property path terminates?
>>>> 
>>>> Regards,
>>>> James
>>>> 
>>>> On Tue, Apr 21, 2020 at 12:12 PM Irene Polikoff <irene@topquadrant.com> wrote:
>>>> This looks correct.
>>>> 
>>>> With data:
>>>> 
>>>> hr:Long a hr:Employee.
>>>> hr:Employee a rdfs:Class.
>>>> 
>>>> If your focus node is hr:Long, zero or more paths will deliver values hr:Long, hr:Employee, rdfs:Class. One or more paths will deliver values hr:Employee, rdfs:Class.
>>>> 
>>>> You could try rdf:type/rdf:type as the path. This will get the type of a resource that is used as a type and ensure that it is rdfs:Class or rdf:Property.
>>>> 
>>>>> On Apr 21, 2020, at 11:39 AM, James Hudson <jameshudson3010@gmail.com> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> Since people here have been so helpful in the past, I thought I would ask a few more questions.
>>>>> 
>>>>> Background to this is my SO question at https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>>> 
>>>>> The SO question has the data graph under consideration.
>>>>> 
>>>>> In the book Validating RDF, it says:
>>>>> 
>>>>> Node shapes declare constraints directly on a node. Property shapes declare constraints on the values associated with a node through a path.
>>>>> 
>>>>> Based on this, I believe I want to use a Property Shape because I want to define a constraint on the value of the rdf:type path on a focus node. Is this correct?
>>>>> 
>>>>> If I try the property shape:
>>>>> 
>>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>>> @prefix sch:  <http://schema.org/> .
>>>>> @prefix sh:   <http://www.w3.org/ns/shacl#> .
>>>>> @prefix ex:   <http://example.org/> .
>>>>> 
>>>>> ex:ClassOrProperty
>>>>>     a sh:PropertyShape ;
>>>>>     sh:target [
>>>>>         a sh:SPARQLTarget ;
>>>>>         sh:select   """
>>>>>                     SELECT ?this
>>>>>                     WHERE {
>>>>>                         ?this ?p ?o .
>>>>>                     }
>>>>>                     """ ;
>>>>>     ] ;
>>>>> 
>>>>> 
>>>>>     sh:path [sh:zeroOrMorePath rdf:type] ;
>>>>>     sh:in ( rdfs:Class rdf:Property ) ;
>>>>> .
>>>>> 
>>>>> I get the unexpected validation error:
>>>>> (J)
>>>>> Constraint Violation in InConstraintComponent (http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>>  Severity: sh:Violation
>>>>>  Source Shape: ex:ClassOrProperty
>>>>>  Focus Node: hr:Longer
>>>>>  Value Node: hr:Employee
>>>>>  Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>> 
>>>>> The way I thought [sh:zeroOrMorePath rdf:type] ; would work is that it would consider the node hr:Longer and follow the rdf:type path through hr:Employee to where it terminates at rdfs:Class and then validate. However, it seems to stop one step away, sees that hr:Employee is not a rdfs:Class or rdf:Property and then generates a validation error.
>>>>> 
>>>>> I get another unexpected validation error:
>>>>> (K)
>>>>> Constraint Violation in InConstraintComponent (http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>>  Severity: sh:Violation
>>>>>  Source Shape: ex:ClassOrProperty
>>>>>  Focus Node: hr:Employee
>>>>>  Value Node: hr:Employee
>>>>>  Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>> 
>>>>> I was thinking that the zero in sh:zeroOrMorePath would see hr:Employee a rdfs:Class ; and validate. Is it the case that the zero in sh:zeroOrMorePath causes a validation engine to compare a node against itself without following or looking for the path?
>>>>> 
>>>>> I did try using sh:oneOrMorePath, but I received the validation error (J) again, but (K) did not show up. Is the reason why (K) did not show up because it was forced to see hr:Employee a rdfs:Class ; because of the one in sh:oneOrMorePath and could validate it?
>>>>> 
>>>>> Perhaps a validation engine validates every node along the path and not just where the path terminates? If this is the case, is it possible to validate where the path terminates only?
>>>>> 
>>>>> Needless to say, I am rather confused.
>>>>> 
>>>>> Can anyone clear this up?
>>>>> 
>>>>> Thank you,
>>>>> James
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 

Received on Tuesday, 21 April 2020 20:00:02 UTC