Re: propose to make repeated-properties additive from Holger Knublauch on 2015-09-21 (public-data-shapes-wg@w3.org from September 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 21 Sep 2015 10:50:24 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <55FF5450.2080609@topquadrant.com>
On 9/21/2015 6:19, Karen Coyle wrote:
>
>
> On 9/20/15 1:39 AM, Holger Knublauch wrote:
>> On 9/19/15 5:02 PM, Karen Coyle wrote:
>>> 2) repeated properties
>>> This is a real and not uncommon example:
>>>
>>>   <bf_Person1>
>>>   bf:identifiedBy <http://id.loc.gov/authorities/names/n80103961#RWO> ;
>>>     #IRI from id.loc.gov, min 1, max 1
>>>   bf:identifiedBy <https://viaf.org/viaf/268367832/#Knape,_Joachim> .
>>>     #IRI from viaf.org, min 1, max unlimited
>>
>> I agree it makes sense to talk about the requirements before requesting
>> a change to the language. If I understand the intention correctly, then
>> the above could be expressed with an incremental addition to the core
>> library:
>>
>> ex:MyShape
>>      a sh:Shape ;
>>      sh:property [
>>          sh:predicate bf:identifiedBy ;
>>          sh:qualifiedMinCount 1 ;
>>          sh:qualifiedMaxCount 1 ;
>>          sh:qualifiedValueShape [
>>              sh:constraint [
>>                  a sh:URIPatternConstraint ;
>>                  sh:uriPattern "^http://id.loc.gov" ;
>>              ] ;
>>          ] ;
>>      ] ;
>>      sh:property [
>>          sh:predicate bf:identifiedBy ;
>>          sh:qualifiedMinCount 1 ;
>>          sh:qualifiedValueShape [
>>              sh:constraint [
>>                  a sh:URIPatternConstraint ;
>>                  sh:uriPattern "^http://viaf.org" ;
>>              ] ;
>>          ] ;
>>      ] .
>>
>> The new feature that would be needed would be sh:URIPatternConstraint -
>> the current sh:pattern only applies to property values "one hop away"
>> while here we would need something that talks about the IRI of the focus
>> node itself. We had a similar topic recently with regards to
>> sh:allowedValues. It may make sense to generalize the validation
>> function mechanism so that the same infrastructure can be reused, but in
>> the end this is about syntactic sugar only.
>
> In fact, the question was less about the URI pattern than about the 
> ability to have different values for the same property, and to 
> constrain them separately. So we should pick some other value 
> constraints to use that SHACL already handles -- perhaps that the 
> first instance is an IRI and the second is a literal.

Ok, this just highlights that we need more test cases. A property that 
can (meaningfully) take either literals or IRIs as values is not a use 
case that I have seen yet. I am aware that schema.org allows that, but 
schema.org can afford that liberty because they have quite a lot of 
post-processing machinery, e.g. to turn a country code string into a 
Country instance.

Anyway, let's look at what it would take to express such things. I
believe we need to keep general property constraints and qualified
property constraints separate, because most constraints are not
qualified but apply to all values. To express your use case above, we
would need to extend the sh:qualifiedValueShape mechanism so that it
also works with literals. So far my implicit assumption has been that
the focus node cannot be a literal, but this is probably an unnecessary
restriction. I have just opened a ticket for that micro decision.

Then we could more cleanly separate restrictions on property values
from restrictions on the focus node itself. Taking sh:nodeKind as an
example, we could describe your scenario using:

ex:MyShape
     a sh:Shape ;
     rdfs:comment "someProperty must have one IRI and one or more Literals" ;
     sh:property [
         sh:predicate ex:someProperty ;
         sh:qualifiedMinCount 1 ;
         sh:qualifiedMaxCount 1 ;
         sh:qualifiedValueShape [
             sh:constraint [
                 a sh:NodeConstraint ;
                 sh:nodeKind sh:IRI ;
             ]
         ]
     ] ;
     sh:property [
         sh:predicate ex:someProperty ;
         sh:qualifiedMinCount 1 ;
         sh:qualifiedValueShape [
             sh:constraint [
                 a sh:NodeConstraint ;
                 sh:nodeKind sh:Literal ;
                 sh:datatype xsd:string ;
             ]
         ]
     ] .

In the design above I am suggesting a template sh:NodeConstraint
that combines the various node-related constraint types:

- sh:allowedValues
- sh:class
- sh:datatype
- sh:directType
- sh:minLength
- sh:maxLength
- sh:nodeKind
- sh:maxExclusive etc.
- sh:pattern

All of these are backed by the same sh:ValidationFunctions as the 
property constraints, so it's just another syntax for the same thing. 
(But we would need to reopen the discussion on the naming of 
sh:valueClass vs. sh:class).
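
To illustrate the "same functions, different syntax" point, here is a
minimal Python sketch. It is not SHACL and not any engine's API: Node,
has_node_kind, has_datatype, conforms and all_values_conform are
hypothetical names made up for this illustration. The only idea is that
each check looks at a single node, so it can be applied either to every
value of a property or to one focus node.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an RDF term (not a real library type)."""
    kind: str                       # "IRI" or "Literal"
    value: str
    datatype: Optional[str] = None  # only meaningful for literals


Check = Callable[[Node], bool]


def has_node_kind(expected: str) -> Check:
    """Check in the spirit of sh:nodeKind, applied to a single node."""
    return lambda node: node.kind == expected


def has_datatype(expected: str) -> Check:
    """Check in the spirit of sh:datatype, applied to a single node."""
    return lambda node: node.kind == "Literal" and node.datatype == expected


def conforms(node: Node, checks: Iterable[Check]) -> bool:
    """Node-constraint style: run the checks against the node itself."""
    return all(check(node) for check in checks)


def all_values_conform(values: Iterable[Node], checks: Iterable[Check]) -> bool:
    """Property-constraint style: the same checks, run against every value."""
    checks = list(checks)
    return all(conforms(value, checks) for value in values)


# Usage: one list of checks, reused in both styles.
string_literal = [has_node_kind("Literal"), has_datatype(XSD_STRING)]
iri = Node("IRI", "http://example.org/a")
lit = Node("Literal", "abc", XSD_STRING)

node_result = conforms(lit, string_literal)
property_result = all_values_conform([lit, iri], string_literal)
```

Here node_result is true because the literal passes both checks, while
property_result is false because the IRI value does not.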

Does this sound like the right direction?

> Looking at the above, I think both would fail when the "other" value 
> is evaluated.
>
> Wouldn't this be a case that could use sh:filterShape? That is, there 
> would be a separate sh:filterShape for the two different cases. By 
> using the filter, only one value type would be included in the graph 
> being evaluated.

I don't see how a filterShape would help here.
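
On the worry that each constraint would fail when the "other" value is
evaluated: my reading of the qualified counts is that values which do
not match the qualified value shape are simply left out of the count,
so they do not cause a failure by themselves. A minimal Python sketch of
that counting follows. The helper names (qualified_count_ok, is_iri,
is_literal) are hypothetical, and the values are written as
N-Triples-style strings purely for illustration.

```python
from typing import Callable, Iterable, Optional


def qualified_count_ok(
    values: Iterable[str],
    value_shape: Callable[[str], bool],
    min_count: int = 0,
    max_count: Optional[int] = None,
) -> bool:
    """Count only the values that match value_shape, then check the bounds."""
    matching = sum(1 for value in values if value_shape(value))
    if matching < min_count:
        return False
    return max_count is None or matching <= max_count


def is_iri(term: str) -> bool:
    # Illustration only: IRIs are written as <...>.
    return term.startswith("<") and term.endswith(">")


def is_literal(term: str) -> bool:
    # Illustration only: literals are written starting with a double quote.
    return term.startswith('"')


# One IRI and two literals as values of the same property.
values = ["<http://example.org/a>", '"one"', '"two"']

# Exactly one IRI: the two literals are not counted.
iri_ok = qualified_count_ok(values, is_iri, min_count=1, max_count=1)

# At least one literal: the IRI is not counted.
literal_ok = qualified_count_ok(values, is_literal, min_count=1)
```

Both iri_ok and literal_ok come out true for this data. Adding a second
IRI would make only the first constraint fail.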

Holger
Received on Monday, 21 September 2015 00:50:59 UTC