Re: Splitting Part 1 and Part 2 in TOC from Karen Coyle on 2015-04-07 (public-data-shapes-wg@w3.org from April 2015)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Mon, 06 Apr 2015 20:05:54 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <55234992.3010601@kcoyle.net>

On 4/6/15 6:55 PM, Holger Knublauch wrote:

>>
>> Holger, I am still trying to understand what this other category is.
>> AFAIK, SHACL defines constraints on nodes, properties (in a graph) and
>> the value(s) of those properties. Properties themselves can be subject
>> to constraints, mainly cardinality, but I don't see this clearly
>> called out in the document. I would prefer:
>>
>> Constraints on nodes
>>   - cardinality ("only one person node allowed")
>>   - defined properties ("foaf:name")
>
> I don't think we have core language elements for those two built-in
> right now (not sure what you mean with "defined properties").

Those defined in the SHACL expression with sh:property.

>
>>
>>   - Constraints on defined properties ("foaf:name")
>>      - cardinality
>>
>>      - Constraints on property values (in this case, "foaf:name")
>>        - type
>>        - value list
>>
>> This is, in fact, how the examples read, although none show
>> constraints on nodes.
>
> I cannot tell whether such a further categorization would really help
> the logical flow. For example sh:hasValue sounds like it should be in
> the "property values" category, but it is really only firing a violation
> if the focus node does *not* have that value. So the division here is
> IMHO not clear cut, and I'd vote for reducing the levels of nesting.

First, the names are all up in the air, so we can call things whatever 
we want.

I do think that the nesting is logical from the point of view of someone 
who intends to perform validation on their data. Again, I point to the 
structure of the DCMI DSP,[1] which has descriptions (nodes) and 
statements (properties). A DSP can have one or more descriptions, and each 
description one or more properties. (Note that SHACL so far has nothing 
really equivalent to the Description Set Profile of DCMI, but again I 
think that may be something solved through a user interface.) The 
logical flow from the view of a data developer who wants to use SHACL to 
validate data is:

My data/metadata describes some "thing." First, I want to list the 
properties that I am declaring will be used to describe that thing. 
Let's say the thing is a book, and a book is described with:

dct:title
dct:creator
dct:date

Each of these can have specified cardinalities:

dct:title min=1 max=1
dct:creator min=0 max=unbounded
dct:date min=0 max=1

That constrains the properties themselves, that is, the triples that have 
that predicate. Once I've defined my properties, I want to say what type of 
object is "valid" for each one.

dct:title min=1 max=1
    valueType:literal
dct:creator min=0 max=unbounded
    valueType:URI
    [valid URI pattern: http://id.loc.gov/names/...]
dct:date min=0 max=1
    valueType:XMLSdate
    [valid values in range 1300-[present year + 1]]

I see this as a constraint on the object, not a constraint on the 
predicate, but I'm not sure it matters how it is implemented in a SHACL 
engine as long as it is clear to those using SHACL and gives the result 
desired. Logically, one is constraining the object values in the 
instance data triples. (And, yes, it gets complicated with recursion, 
but I leave that issue to Peter.)
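
To make the two steps concrete, here is a minimal sketch in Python
rather than in SHACL syntax. Everything in it is an assumption made for
illustration: the dictionary layout, the names BOOK_SHAPE and validate,
and the representation of a description as (predicate, object) pairs
are hypothetical and are not part of any SHACL draft. The URI pattern
is only a prefix check on the base shown above.

```python
import re
from datetime import date

# Hypothetical shape table: one entry per declared property.
# "max": None stands for "unbounded".
BOOK_SHAPE = {
    "dct:title": {"min": 1, "max": 1, "value_type": "literal"},
    "dct:creator": {
        "min": 0, "max": None, "value_type": "uri",
        # Prefix check only; the rest of the URI is not constrained here.
        "pattern": r"^http://id\.loc\.gov/names/",
    },
    "dct:date": {
        "min": 0, "max": 1, "value_type": "date",
        "year_range": (1300, date.today().year + 1),
    },
}


def validate(description, shape):
    """Return a list of violation messages; an empty list means it conforms.

    description: list of (predicate, (kind, value)) pairs, where kind is
    "literal", "uri" or "date" (an assumed, simplified data model).
    """
    violations = []
    for prop, rule in shape.items():
        values = [obj for pred, obj in description if pred == prop]

        # Step 1: cardinality, counted over the triples with this predicate.
        if len(values) < rule["min"]:
            violations.append(f"{prop}: fewer than {rule['min']} value(s)")
        if rule["max"] is not None and len(values) > rule["max"]:
            violations.append(f"{prop}: more than {rule['max']} value(s)")

        # Step 2: constraints on each object value.
        for kind, value in values:
            if kind != rule["value_type"]:
                violations.append(f"{prop}: expected {rule['value_type']}, got {kind}")
                continue
            if "pattern" in rule and not re.match(rule["pattern"], value):
                violations.append(f"{prop}: {value} does not match the URI pattern")
            if "year_range" in rule:
                lo, hi = rule["year_range"]
                if not lo <= value.year <= hi:
                    violations.append(f"{prop}: year {value.year} is out of range")
    return violations
```

Calling validate(description, BOOK_SHAPE) on a description with no
dct:title reports a cardinality violation before any value is looked
at, which mirrors the order of the steps described above.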

Those, to me, are the steps that a metadata developer will take, from 
the broadest listing of properties to the specific definition of 
validatable object values. SHACL needs to support that workflow, IMO, 
and the first part of the SHACL document should be logically organized 
around that workflow. That doesn't mean that SHACL *language* itself has 
to be nested -- but the documentation should make those points clear to 
the reader.


>
>> Now, which of those does 4.0 refer to? And can we present SHACL with a
>> logical structure of this nature?
>
> Section 4 is currently only hosting sh:OrConstraint, and that may need
> to be in your "Constraints on nodes". I don't believe it would make
> sense to have that constraint type listed first, as it's usually just a
> combination of other constraints.

No, I think it does make sense, because it is a constraint at the 
highest level of the design workflow. Do I have a person shape or a book 
shape? Do I have both? This is the big picture of the data that will be 
validated, and it's one of the first decisions that a designer needs to 
make. It may be that in the design of a SHACL engine it is made up of 
other constraints, but as we've said, the majority of SHACL users will 
be folks who are writing shapes to validate their data. So the workflow 
logic of those users should guide the first part of the document. And I 
think that workflow will go from the biggest picture the designer has of 
their data (a macro-level entity-relation sort of view), down to the 
details.
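
As a rough illustration of that top-level decision, here is a small
Python sketch that asks which of several shapes a description conforms
to. It is not meant to describe how sh:OrConstraint is defined in the
draft; the SHAPES table and the function names are hypothetical, and
shapes are reduced to property counts to keep it short.

```python
# Hypothetical shapes: property -> (min, max) counts; max None = unbounded.
SHAPES = {
    "person": {"foaf:name": (1, 1)},
    "book": {"dct:title": (1, 1), "dct:creator": (0, None)},
}


def conforms(description, shape):
    """True if every property count in the description fits the shape."""
    for prop, (lo, hi) in shape.items():
        count = sum(1 for pred, _ in description if pred == prop)
        if count < lo or (hi is not None and count > hi):
            return False
    return True


def matching_shapes(description, shapes):
    """Names of all shapes the description conforms to (possibly none)."""
    return [name for name, shape in shapes.items() if conforms(description, shape)]


def conforms_to_any(description, shapes):
    """The 'or' case: valid if at least one shape matches."""
    return bool(matching_shapes(description, shapes))
```

The designer's first question ("person, book, or both?") then becomes
the choice of which entries go into the table passed to
conforms_to_any.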

kc
[1] http://dublincore.org/documents/dc-dsp/

>
> Holger
>
>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Tuesday, 7 April 2015 03:06:25 UTC