- From: James Hudson <jameshudson3010@gmail.com>
- Date: Tue, 21 Apr 2020 16:59:59 -0400
- To: Irene Polikoff <irene@topquadrant.com>
- Cc: Public Shacl W3C <public-shacl@w3.org>
- Message-ID: <CAEUVO9Ed9Va7zGN43fq2weKXJ9Nk-m5eFYWd33TwovsEyA9X9Q@mail.gmail.com>
Hello Irene,
On Tue, Apr 21, 2020 at 3:59 PM Irene Polikoff <irene@topquadrant.com>
wrote:
>
>
> On Apr 21, 2020, at 3:37 PM, James Hudson <jameshudson3010@gmail.com>
> wrote:
>
> The SPARQL query which does what I want is:
>
> SELECT DISTINCT ?s
> WHERE {
>   {
>     ?s ?p ?o .
>     FILTER NOT EXISTS {
>       ?s rdf:type* ?c .
>       FILTER(?c IN (rdfs:Class, rdf:Property) &&
>              ?s NOT IN (rdfs:Class, rdf:Property))
>     }
>   }
> }
>
>
> You can do all sort of things with shapes including complex expressions
> with sh:or, sh:not, etc.
>
My question is how...? That is what I am not yet seeing clearly with
SHACL, whereas with SPARQL I think I understand what it is doing and how.
> This query will return every subject whose rdf:type property path does not
> terminate in a rdfs:Class or rdf:Property. The subjects the SPARQL query
> will return are hr:missing, hr:typo, and hr:randomtype.
>
> 1. You require every resource that is a subject of any triples to have
> rdf:type that is either a class or a property. In which case hr:missing is
> correctly identified as missing a type
>
>
> This is not what I want because hr:Longer would not validate.
>
> There is no triple hr:Longer a rdfs:Class. There is hr:Longer a
> hr:Employee and hr:Employee a rdfs:Class. Therefore, there is a rdf:type
> property path which terminates in either a rdfs:Class or rdf:Property and
> hr:Longer is valid.
>
>
> You lost me here. Possibly, this requires more bandwidth than I have to
> discuss in e-mails
>
This is my train of thought. If I execute the following SPARQL query to
extract all of the triples defined in my data graph, I get the following
set of triples:
SELECT ?s ?p ?o
WHERE {
?s ?p ?o .
}
-----------------------------------------------------------------------------------------------------------------------------------------------
| s                                                          | p                  | o                                                         |
===============================================================================================================================================
| <http://learningsparql.com/ns/humanResources#freestanding> | sch:rangeIncludes  | sch:Text                                                  |
| <http://learningsparql.com/ns/humanResources#freestanding> | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#Another>      | rdf:type           | rdfs:Class                                                |
| <http://learningsparql.com/ns/humanResources#typo>         | rdfs:comment       | "some comment about typo"                                 |
| <http://learningsparql.com/ns/humanResources#typo>         | rdf:type           | rdfs:Classs                                               |
| <http://learningsparql.com/ns/humanResources#typo>         | rdfs:label         | "some label about typo"                                   |
| <http://learningsparql.com/ns/humanResources#nosuper>      | sch:rangeIncludes  | sch:Text                                                  |
| <http://learningsparql.com/ns/humanResources#nosuper>      | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#nosuper>      | sch:domainIncludes | <http://learningsparql.com/ns/humanResources#Uncreated>   |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdfs:comment       | "a good employee"                                         |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdf:type           | rdfs:Class                                                |
| <http://learningsparql.com/ns/humanResources#Employee>     | rdfs:label         | "model"                                                   |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdfs:comment       | "some comment about randomtype"                           |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdf:type           | <http://learningsparql.com/ns/humanResources#invalidtype> |
| <http://learningsparql.com/ns/humanResources#randomtype>   | rdfs:label         | "some label about randomtype"                             |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdfs:comment       | "a good employee"                                         |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdf:type           | <http://learningsparql.com/ns/humanResources#Employee>    |
| <http://learningsparql.com/ns/humanResources#Longer>       | rdfs:label         | "model"                                                   |
| <http://learningsparql.com/ns/humanResources#missing>      | rdfs:comment       | "some comment about missing"                              |
| <http://learningsparql.com/ns/humanResources#name>         | rdf:type           | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#name>         | sch:domainIncludes | <http://learningsparql.com/ns/humanResources#Employee>    |
-----------------------------------------------------------------------------------------------------------------------------------------------
I do not see the triple hr:Longer a rdfs:Class in that list.
I do see hr:Longer a hr:Employee and hr:Employee a rdfs:Class in the list.
Following the rdf:type property path, hr:Longer -> hr:Employee ->
rdfs:Class. The rdf:type property path starting with hr:Longer terminates
in rdfs:Class.
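To make the idea of "following the rdf:type property path" concrete, here is
a rough sketch in plain Python. It is only an illustration, not SHACL or
SPARQL: the dictionary RDF_TYPE and the function type_chain are names made up
for this example, the hr: prefix is abbreviated, and it assumes each resource
has at most one rdf:type (which holds for the triples listed above).

```python
# rdf:type triples copied from the result table (hr: prefix abbreviated).
# Assumption for this sketch: at most one rdf:type per resource.
RDF_TYPE = {
    "hr:Longer": "hr:Employee",
    "hr:Employee": "rdfs:Class",
    "hr:typo": "rdfs:Classs",
    "hr:randomtype": "hr:invalidtype",
}


def type_chain(node):
    """Follow rdf:type from `node` until a resource with no rdf:type is reached."""
    chain = [node]
    # Stop when the last node has no type, or when a node would repeat (cycle guard).
    while chain[-1] in RDF_TYPE and RDF_TYPE[chain[-1]] not in chain:
        chain.append(RDF_TYPE[chain[-1]])
    return chain


print(" -> ".join(type_chain("hr:Longer")))
print(" -> ".join(type_chain("hr:missing")))
```

The last element of the returned list is what I mean by "where the path
terminates". For hr:missing the chain is just the node itself, because it has
no rdf:type at all.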
I am not sure how I can explain what I am thinking any better, so I hope
that was good enough.
Now, following what you said, that sh:path [sh:zeroOrMorePath rdf:type]
emits three values, hr:Longer (the zero), hr:Employee (1), rdfs:Class (2),
what I need is a way for SHACL to consider only the last value found
(where the rdf:type property path terminates) and ignore the rest.
Put another way, perhaps (?), sh:path [sh:zeroOrMorePath rdf:type] results
in three triples to be evaluated:
(0) hr:Longer a hr:Longer (???)
(1) hr:Longer a hr:Employee
(2) hr:Longer a rdfs:Class
I need a way to tell SHACL to ignore everything except for the last one (2)
and to try to validate it.
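The difference between the two behaviours can be sketched in plain Python.
This is a hedged illustration of the logic only, not how any SHACL engine is
implemented. The names RDF_TYPE, ALLOWED, reachable, fails_every_value and
fails_terminal_only are invented for this example, and "terminal" is assumed
to mean a reachable resource that has no rdf:type of its own.

```python
ALLOWED = {"rdfs:Class", "rdf:Property"}

# rdf:type values for every subject in the data graph (hr: prefix abbreviated).
# hr:missing has no rdf:type triple, hence the empty list.
RDF_TYPE = {
    "hr:freestanding": ["rdf:Property"],
    "hr:Another": ["rdfs:Class"],
    "hr:typo": ["rdfs:Classs"],
    "hr:nosuper": ["rdf:Property"],
    "hr:Employee": ["rdfs:Class"],
    "hr:randomtype": ["hr:invalidtype"],
    "hr:Longer": ["hr:Employee"],
    "hr:missing": [],
    "hr:name": ["rdf:Property"],
}


def reachable(node):
    """All nodes reached from `node` by zero or more rdf:type steps."""
    seen, stack = [node], [node]
    while stack:
        for t in RDF_TYPE.get(stack.pop(), []):
            if t not in seen:
                seen.append(t)
                stack.append(t)
    return seen


def fails_every_value(node):
    """Mimics a membership test applied to every value of the zero-or-more path."""
    return any(v not in ALLOWED for v in reachable(node))


def fails_terminal_only(node):
    """Tests only the values where the rdf:type path ends (assumed definition)."""
    terminals = [v for v in reachable(node) if not RDF_TYPE.get(v)]
    return any(v not in ALLOWED for v in terminals)


every_value = sorted(n for n in RDF_TYPE if fails_every_value(n))
terminal_only = sorted(n for n in RDF_TYPE if fails_terminal_only(n))
print(every_value)
print(terminal_only)
```

With the first test every subject fails, because the focus node itself (the
"zero" step) is never rdfs:Class or rdf:Property. With the second test only
hr:missing, hr:randomtype and hr:typo fail, which is the result I am after.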
Regards,
James
> On Tue, Apr 21, 2020 at 3:12 PM Irene Polikoff <irene@topquadrant.com>
> wrote:
>
>>
>>
>> On Apr 21, 2020, at 2:37 PM, James Hudson <jameshudson3010@gmail.com>
>> wrote:
>>
>> Hello Irene,
>>
>> I neglected to add:
>>
>> Yes, the target picks every resource that is a subject of a triple. This
>> is what I want.
>>
>> I also want to make sure that every subject has a property path that
>> terminates in a rdf:type of either rdfs:Class or rdf:Property.
>>
>>
>> What path? Saying that a path terminates in something does not specify a
>> path.
>>
>>
>> It is either
>> 1. You require every resource that is a subject of any triples to have
>> rdf:type that is either a class or a property. In which case hr:missing is
>> correctly identified as missing a type OR
>> 2. You require that every resource that is used as a value of rdf:type
>> has a type that is either a class or a property. In which case, use
>> sh:targetObjectsOf rdf:type OR
>> 3. You want to ensure that some resources have a type that is a class or
>> a property. It is not clear to me which resources they are. If you know
>> what your criteria are, what resources you want to exclude and include, then
>> it can be defined.
>>
>>
>> To account for subjects like hr:missing, I believe that sh:path
>> [sh:zeroOrMorePath rdf:type] ; is what I should be using. This does
>> result in a validation error for hr:missing.
>>
>>
>> This means
>>
>> ?s rdf:type* ?type
>>
>> which when hr:Long is bound to ?s will return hr:Long, hr:Employee and
>> rdfs:Class. All of these values will be validated against the constraint
>> and hr:Long gives you an error because its type is not rdfs:Class. It is
>> hr:Employee.
>>
>> SHACL paths are the same as SPARQL paths
>> https://www.w3.org/TR/sparql11-property-paths/.
>>
>>
>> However, when using sh:path [sh:zeroOrMorePath rdf:type] ;, hr:Employee
>> produces a validation error because SHACL does not look at what the
>> rdf:type of hr:Employee is. I believe this is because of the zero part of
>> sh:zeroOrMorePath. When looking at hr:Employee, it only checks to see if
>> hr:Employee is either a rdfs:Class or rdf:Property and, because it is not,
>> it generates a validation error.
>>
>> Using sh:path ( rdf:type rdf:type ) ; or sh:path ( rdf:type
>> [sh:zeroOrMorePath rdf:type] ) or sh:path ( rdf:type [sh:oneOrMorePath
>> rdf:type] ) does not result in a validation error for hr:missing.
>>
>> Regards,
>> James
>>
>>
>> On Tue, Apr 21, 2020 at 2:11 PM Irene Polikoff <irene@topquadrant.com>
>> wrote:
>>
>>> Well, your target picks every resource that is a subject of a triple. I
>>> thought you wanted to make sure that they all have types. Since hr:missing
>>> does not have a type, you get a violation. That seems correct to me.
>>>
>>> If you simply wanted to say that any object in a triple with rdf:type
>>> predicate must itself have a type, then you do not need SPARQL based
>>> target. You could simply use sh:targetObjectsOf rdf:type.
>>>
>>> On Apr 21, 2020, at 1:39 PM, James Hudson <jameshudson3010@gmail.com>
>>> wrote:
>>>
>>> Hello Irene,
>>>
>>> Unfortunately, sh:path (rdf:type rdf:type); validates:
>>>
>>> hr:missing rdfs:comment "some comment about missing" .
>>>
>>> which does not have any value of rdf:type. This focus node should
>>> produce a validation error.
>>>
>>> I also believe that I would actually want ( rdf:type [sh:oneOrMorePath
>>> rdf:type] ) ; as the chain could be longer than just two. However, this
>>> does not resolve the problems.
>>>
>>> I tried:
>>>
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix sch: <http://schema.org/> .
>>> @prefix sh: <http://www.w3.org/ns/shacl#> .
>>> @prefix ex: <http://example.org/> .
>>>
>>> ex:ClassOrProperty
>>> a sh:PropertyShape ;
>>> sh:target [
>>> a sh:SPARQLTarget ;
>>> sh:select """
>>> SELECT ?this
>>> WHERE {
>>> ?this ?p ?o .
>>> }
>>> """ ;
>>> ] ;
>>>
>>>
>>> sh:path ( rdf:type [sh:oneOrMorePath rdf:type] ) ;
>>> sh:in ( rdfs:Class rdf:Property ) ;
>>> sh:maxCount 1 ; # path-maxCount
>>> sh:minCount 1 ; # PropertyShape-path-minCount
>>>
>>> .
>>>
>>>
>>> Hoping that I could say to validate where the property path terminates
>>> and that it has to contain at least one value found in sh:in, but this
>>> produced the unwanted validation error:
>>>
>>> Constraint Violation in MinCountConstraintComponent (
>>> http://www.w3.org/ns/shacl#MinCountConstraintComponent):
>>> Severity: sh:Violation
>>> Source Shape: ex:ClassOrProperty
>>> Focus Node: hr:Employee
>>> Result Path: ( rdf:type rdf:type )
>>>
>>>
>>> The only thing I need to be able to do is to validate where the property
>>> path terminates and that does not seem possible with SHACL. Based on that,
>>> I have to believe that my sh:path should be sh:path [sh:zeroOrMorePath
>>> rdf:type] ; to account for focus nodes which do not have a rdf:type
>>> defined. Unfortunately, SHACL requires that every node along a path be
>>> validated with the same test and cannot just validate where the property
>>> path terminates.
>>>
>>> Regards,
>>> James
>>>
>>>
>>> On Tue, Apr 21, 2020 at 1:18 PM Irene Polikoff <irene@topquadrant.com>
>>> wrote:
>>>
>>>> No, I meant sequence path without any zero or more or one or more.
>>>> Simply rdf:type/rdf:type as opposed to rdf:type+/rdf:type which doesn’t
>>>> make much sense.
>>>>
>>>> sh:path (rdf:type rdf:type);
>>>>
>>>> See https://www.w3.org/TR/shacl/#property-paths
>>>>
>>>> On Apr 21, 2020, at 12:56 PM, James Hudson <jameshudson3010@gmail.com>
>>>> wrote:
>>>>
>>>> Hello Irene,
>>>>
>>>> Thank you for your quick reply.
>>>>
>>>> If I try:
>>>>
>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>> @prefix sch: <http://schema.org/> .
>>>> @prefix sh: <http://www.w3.org/ns/shacl#> .
>>>> @prefix ex: <http://example.org/> .
>>>>
>>>> ex:ClassOrProperty
>>>> a sh:PropertyShape ;
>>>> sh:target [
>>>> a sh:SPARQLTarget ;
>>>> sh:select """
>>>> SELECT ?this
>>>> WHERE {
>>>> ?this ?p ?o .
>>>> }
>>>> """ ;
>>>> ] ;
>>>>
>>>>
>>>> sh:path ( [sh:zeroOrMorePath rdf:type] rdf:type ) ;
>>>> sh:in ( rdfs:Class rdf:Property ) ;
>>>> .
>>>>
>>>>
>>>> which is what I think you mean by "rdf:type/rdf:type as the path", I
>>>> still get the following unexpected validation error:
>>>>
>>>> Constraint Violation in InConstraintComponent (
>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>> Severity: sh:Violation
>>>> Source Shape: ex:ClassOrProperty
>>>> Focus Node: hr:Longer
>>>> Value Node: hr:Employee
>>>> Result Path: ( [ sh:zeroOrMorePath rdf:type ] rdf:type )
>>>>
>>>>
>>>> By unexpected, I mean I do not want it to be considered a validation
>>>> error because the rdf:type property path terminates at rdfs:Class.
>>>>
>>>> When you say "zero or more paths will deliver values hr:Long,
>>>> hr:Employee, rdfs:Class," does that mean that the sh:in test will be
>>>> performed on the value of hr:Long (fail), hr:Employee (fail), and
>>>> rdfs:Class (pass)? Is it possible to have it validate only where the
>>>> property path terminates?
>>>>
>>>> Regards,
>>>> James
>>>>
>>>> On Tue, Apr 21, 2020 at 12:12 PM Irene Polikoff <irene@topquadrant.com>
>>>> wrote:
>>>>
>>>>> This looks correct.
>>>>>
>>>>> With data:
>>>>>
>>>>> hr:Long a hr:Employee.
>>>>> hr:Employee a rdfs:Class.
>>>>>
>>>>> If your focus node is hr:Long, zero or more paths will deliver values
>>>>> hr:Long, hr:Employee, rdfs:Class. One or more paths will deliver values
>>>>> hr:Employee, rdfs:Class.
>>>>>
>>>>> You could try rdf:type/rdf:type as the path. This will get the type of
>>>>> a resource that is used as a type and ensure that it is rdfs:Class or
>>>>> rdf:Property.
>>>>>
>>>>> On Apr 21, 2020, at 11:39 AM, James Hudson <jameshudson3010@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Since people here have been so helpful in the past, I thought I would
>>>>> ask a few more questions.
>>>>>
>>>>> Background to this is my SO question at
>>>>> https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>>>
>>>>> The SO question has the data graph under consideration.
>>>>>
>>>>> In the book Validating RDF, it says:
>>>>>
>>>>> Node shapes declare constraints directly on a node. Property shapes
>>>>> declare constraints on the values associated with a node through a path.
>>>>>
>>>>>
>>>>> Based on this, I believe I want to use a Property Shape because I want
>>>>> to define a constraint on the value of the rdf:type path on a focus node.
>>>>> Is this correct?
>>>>>
>>>>> If I try the property shape:
>>>>>
>>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>>> @prefix sch: <http://schema.org/> .
>>>>> @prefix sh: <http://www.w3.org/ns/shacl#> .
>>>>> @prefix ex: <http://example.org/> .
>>>>>
>>>>> ex:ClassOrProperty
>>>>> a sh:PropertyShape ;
>>>>> sh:target [
>>>>> a sh:SPARQLTarget ;
>>>>> sh:select """
>>>>> SELECT ?this
>>>>> WHERE {
>>>>> ?this ?p ?o .
>>>>> }
>>>>> """ ;
>>>>> ] ;
>>>>>
>>>>>
>>>>> sh:path [sh:zeroOrMorePath rdf:type] ;
>>>>> sh:in ( rdfs:Class rdf:Property ) ;
>>>>> .
>>>>>
>>>>>
>>>>> I get the unexpected validation error:
>>>>> (J)
>>>>>
>>>>> Constraint Violation in InConstraintComponent (
>>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>> Severity: sh:Violation
>>>>> Source Shape: ex:ClassOrProperty
>>>>> Focus Node: hr:Longer
>>>>> Value Node: hr:Employee
>>>>> Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>>
>>>>>
>>>>> The way I thought [sh:zeroOrMorePath rdf:type] ; would work is that
>>>>> it would consider the node hr:Longer and follow the rdf:type path through
>>>>> hr:Employee to where it terminates at rdfs:Class and then validate.
>>>>> However, it seems to stop one step away, sees that hr:Employee is not a
>>>>> rdfs:Class or rdf:Property and then generates a validation error.
>>>>>
>>>>> I get another unexpected validation error:
>>>>> (K)
>>>>>
>>>>> Constraint Violation in InConstraintComponent (
>>>>> http://www.w3.org/ns/shacl#InConstraintComponent):
>>>>> Severity: sh:Violation
>>>>> Source Shape: ex:ClassOrProperty
>>>>> Focus Node: hr:Employee
>>>>> Value Node: hr:Employee
>>>>> Result Path: [ sh:zeroOrMorePath rdf:type ]
>>>>>
>>>>>
>>>>> I was thinking that the zero in sh:zeroOrMorePath would see hr:Employee
>>>>> a rdfs:Class ; and validate. Is it the case that the zero in sh:zeroOrMorePath
>>>>> causes a validation engine to compare a node against itself without
>>>>> following or looking for the path?
>>>>>
>>>>> I did try using sh:oneOrMorePath, but I received the validation error
>>>>> (J) again, but (K) did not show up. Is the reason why (K) did not show up
>>>>> because it was forced to see hr:Employee a rdfs:Class ; because of
>>>>> the one in sh:oneOrMorePath and could validate it?
>>>>>
>>>>> Perhaps a validation engine validates every node along the path and
>>>>> not just where the path terminates? If this is the case, is it possible to
>>>>> validate where the path terminates only?
>>>>>
>>>>> Needless to say, I am rather confused.
>>>>>
>>>>> Can anyone clear this up?
>>>>>
>>>>> Thank you,
>>>>> James
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
Received on Tuesday, 21 April 2020 21:00:27 UTC