Re: Splitting Part 1 and Part 2 in TOC from Holger Knublauch on 2015-04-07 (public-data-shapes-wg@w3.org from April 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Tue, 07 Apr 2015 15:04:17 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <55236551.7010905@topquadrant.com>

On 4/7/2015 13:05, Karen Coyle wrote:
>
>
> On 4/6/15 6:55 PM, Holger Knublauch wrote:
>
>>>
>>> Holger, I am still trying to understand what this other category is.
>>> AFAIK, SHACL defines constraints on nodes, properties (in a graph) and
>>> the value(s) of those properties. Properties themselves can be subject
>>> to constraints, mainly cardinality, but I don't see this clearly
>>> called out in the document. I would prefer:
>>>
>>> Constraints on nodes
>>>   - cardinality ("only one person node allowed")
>>>   - defined properties ("foaf:name")
>>
>> I don't think we have core language elements for those two built-in
>> right now (not sure what you mean with "defined properties").
>
> Those defined in the SHACL expression with sh:property.

Ok, got it, that would be something like

sh:MyShape
     sh:property [
         sh:predicate ex:myProperty ;
         rdfs:label "my property" ;
     ] .

with or without further restrictions. Just a way to state what is 
relevant for a shape. But then we need to start with that section, and 
not with sh:OrConstraint as you seem to prefer below.

>
>>
>>>
>>>   - Constraints on defined properties ("foaf:name")
>>>      - cardinality
>>>
>>>      - Constraints on property values (in this case, "foaf:name")
>>>        - type
>>>        - value list
>>>
>>> This is, in fact, how the examples read, although none show
>>> constraints on nodes.
>>
>> I cannot tell whether such a further categorization would really help
>> the logical flow. For example sh:hasValue sounds like it should be in
>> the "property values" category, but it is really only firing a violation
>> if the focus node does *not* have that value. So the division here is
>> IMHO not clear cut, and I'd vote for reducing the levels of nesting.
>
> First, the names are all up in the air, so we can call things whatever 
> we want.
>
> I do think that the nesting is logical from the point of view of 
> someone who intends to perform validation on their data. Again, I 
> point to the structure of the DCMI DSP,[1] which has descriptions 
> (nodes) and statements (properties). The DSP can have one or more 
> descriptions, the description one or more properties. (Note that SHACL 
> so far has nothing really equivalent to the Description Set Profile of 
> DCMI, but again I think that may be something solved through a user 
> interface.) The logical flow from the view of a data developer who 
> wants to use SHACL to validate data is:
>
> My data/metadata describes some "thing." First, I want to list the 
> properties that I am declaring will be used to describe that thing. 
> Let's say the thing is a book, and a book is described with:
>
> dct:title
> dct:creator
> dct:date
>
> Each of these can have specified cardinalities:
>
> dct:title min=1 max=1
> dct:creator min=0 max=unbounded
> dct:date min=0 max=1

I'd welcome diffs from anyone, or even completely rewritten sections. I 
find it difficult to imagine how this would fit together well. For 
example, did you notice that sh:hasValue also implies sh:minCount > 0? So 
where do we draw that line?
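
To make the sh:hasValue point concrete, here is a minimal Python sketch. 
It is not SHACL and not how any engine implements it. It assumes a focus 
node is modeled as a plain dict from property names to lists of values, 
and the function names and the ex: terms are made up for illustration. It 
only shows that a node passing a "has value" check necessarily passes a 
"min count of 1" check as well, which is why the two are hard to file 
under separate categories.

```python
def min_count_violation(node, prop, min_count):
    """Return True if the node has fewer than min_count values for prop."""
    return len(node.get(prop, [])) < min_count


def has_value_violation(node, prop, required):
    """Return True if the node does NOT have the required value for prop."""
    return required not in node.get(prop, [])


# Hypothetical focus nodes: property name -> list of values.
with_value = {"ex:status": ["ex:Active", "ex:Verified"]}
without_property = {}

# Passing the has-value check means at least one value exists,
# so a min-count of 1 is satisfied as a side effect.
assert not has_value_violation(with_value, "ex:status", "ex:Active")
assert not min_count_violation(with_value, "ex:status", 1)

# A node with no values at all fails both checks.
assert has_value_violation(without_property, "ex:status", "ex:Active")
assert min_count_violation(without_property, "ex:status", 1)
```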

>
> That constrains the properties themselves, the triples that have that 
> predicate. Once I've defined my properties, I want to say what type of 
> object is "valid" for each one.
>
> dct:title min=1 max=1
>    valueType:literal
> dct:creator min=0 max=unbounded
>    valueType:URI
>    [valid URI pattern: http://id.loc.gov/names/...]
> dct:date min=0 max=1
>    valueType:XMLSdate
>    [valid values in range 1300-[present year + 1]]

BTW above you seem to be mixing sh:datatype, sh:valueType and 
sh:nodeKind/Type all into a single property "valueType".
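
As a rough illustration of why these are separate checks, the book 
example quoted above can be sketched in Python. This is a sketch under 
stated assumptions, not SHACL: values are modeled as hypothetical 
(node_kind, datatype, lexical_form) tuples, "xsd:date" is my reading of 
"XMLSdate", and the rule table, the check() function and the sample data 
are all invented for illustration. The URI pattern, the date range and 
class-based value types are left out.

```python
# Hypothetical value model: (node_kind, datatype, lexical_form).
# node_kind is "literal" or "iri"; datatype is None for IRIs.
BOOK_RULES = {
    "dct:title":   {"min": 1, "max": 1,    "node_kind": "literal", "datatype": None},
    "dct:creator": {"min": 0, "max": None, "node_kind": "iri",     "datatype": None},
    "dct:date":    {"min": 0, "max": 1,    "node_kind": "literal", "datatype": "xsd:date"},
}


def check(node, rules):
    """Return a sorted list of (property, problem) pairs for one node."""
    problems = []
    for prop, rule in rules.items():
        values = node.get(prop, [])
        # Cardinality is a statement about the property as a whole.
        if len(values) < rule["min"]:
            problems.append((prop, "too few values"))
        if rule["max"] is not None and len(values) > rule["max"]:
            problems.append((prop, "too many values"))
        # Node kind and datatype are separate statements about each value.
        for kind, datatype, _lexical in values:
            if kind != rule["node_kind"]:
                problems.append((prop, "wrong node kind"))
            elif rule["datatype"] is not None and datatype != rule["datatype"]:
                problems.append((prop, "wrong datatype"))
    return sorted(problems)


book = {
    "dct:title": [("literal", "xsd:string", "Moby-Dick")],
    "dct:creator": [("literal", "xsd:string", "Herman Melville")],
    "dct:date": [("literal", "xsd:date", "1851-10-18"),
                 ("literal", "xsd:date", "1851-11-14")],
}
print(check(book, BOOK_RULES))
# → [('dct:creator', 'wrong node kind'), ('dct:date', 'too many values')]
```

Keeping "node_kind" and "datatype" as two keys is the point of the 
sketch: a single "valueType" slot could not express both at once.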

>
> I see this as a constraint on the object, not a constraint on the 
> predicate, but I'm not sure it matters how it is implemented in a 
> SHACL engine as long as it is clear to those using SHACL and gives the 
> result desired. Logically, one is constraining the object values in 
> the instance data triples. (And, yes, it gets complicated with 
> recursion, but I leave that issue to Peter.)
>
> Those, to me, are the steps that a metadata developer will take, from 
> the broadest listing of properties to the specific definition of 
> validatable object values. SHACL needs to support that workflow, IMO, 
> and the first part of the SHACL document should be logically organized 
> around that workflow. That doesn't mean that SHACL *language* itself 
> has to be nested -- but the documentation should make those points 
> clear to the reader.
>
>
>>
>>> Now, which of those does 4.0 refer to? And can we present SHACL with a
>>> logical structure of this nature?
>>
>> Section 4 is currently only hosting sh:OrConstraint, and that may need
>> to be in your "Constraints on nodes". I don't believe it would make
>> sense to have that constraint type listed first, as it's usually just a
>> combination of other constraints.
>
> No, I think it does make sense, because it is a constraint at the 
> highest level of the design workflow. Do I have a person shape or a 
> book shape? Do I have both? This is the big picture of the data that 
> will be validated, and it's one of the first decisions that a designer 
> needs to make. It may be that for a design of a SHACL engine it is 
> made up of other constraints, but as we've said, the majority of SHACL 
> users will be folks who are writing shapes to validate their data. So 
> the workflow logic of those users should guide the first part of the 
> document. And I think that workflow will go from the biggest picture 
> the designer has of their data (a macro-level entity-relation sort of 
> view), down to the details.

SHACL users can only insert sh:OrConstraint into another shape. Compared 
to the other features, I believe it will be rarely used. I don't 
understand why this would be part of the big picture - "Do I have a 
person shape or a book shape" is not answered by sh:OrConstraint, but 
rather by the instance pointing at its shape, via rdf:type or sh:nodeShape.

Holger
Received on Tuesday, 7 April 2015 05:05:39 UTC