Re: How would option b) on the last straw poll of 12 March work? from Karen Coyle on 2015-03-15 (public-data-shapes-wg@w3.org from March 2015)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Sun, 15 Mar 2015 09:11:22 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <5505AF2A.9070200@kcoyle.net>
On 3/15/15 7:59 AM, Eric Prud'hommeaux wrote:
> * Karen Coyle <kcoyle@kcoyle.net> [2015-03-15 07:09-0700]
>>
>>
>> On 3/14/15 5:34 PM, Eric Prud'hommeaux wrote:
>>> As a down-payment, I offer <http://w3c.github.io/data-shapes/semantics/>.
>>> I hope to produce a start on an axiomatic semantics and a SPARQL semantics
>>> tomorrow.
>>
>>
>> Eric, I think there are some basics that we need to settle on before
>> getting more deeply into the writing of documents. Reading the
>> introduction to this, I have the following questions/exceptions:
>>
>>> SHACL (Shapes Constraint Language) provides structural constraints
>> for RDF graphs.
>>
>> *structural only? I don't consider all of the validation rules
>> (which I prefer to "constraints") to be structural, e.g. the rules
>> governing values.
>
> The problem with "validation rules" is that it really is
> unbounded. When I see "validation rules" I think of lots of
> domain-specific rule sets like the SDTM validator which examines a
> cell in one table and runs a bevy of tests to see if it has corresponding
> entries in another cell, that those in turn have some other properties
> with appropriate values, etc.

Here's another confusion that I see in the plethora of documents that 
come out of this group, and that I think we should make a decision on: 
is SHACL a set of rules for making decisions? or is it a standard for 
the method of implementation of those rules? I don't see a clear 
distinction between the semantics of the rules and a description of how 
one implements the (as yet unstated) rules. This gets to the question of 
"abstract" vs. "??" (whatever else it is that isn't being done as an 
abstract specification), and it also speaks to "validation rules" (which 
are rules) vs. "constraints" (which I see as being actions taken based on 
rules). It seems to me that some folks are working on how 
constraints/validation will be done before we've decided *what* 
constraints/validation will be done. I think the "what" needs to come 
before the "how", which is why the abstract model has appeal. There 
might be another way to achieve that, but what I'm seeing now is a set 
of implementations that aren't necessarily based on the same goals. As 
Irene says, this isn't helpful, and in fact it's leading us off in too 
many different directions.


> It's hard to find a good description but
> "structural" appears to be the most appropriate so far.

To me, structural doesn't work for "ex:color can be either 'red', 'blue', 
or 'yellow'". Admittedly, the definition of object values is a big focus 
for me.
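
A minimal sketch of that value rule, in Python, with plain tuples standing 
in for triples. The names ALLOWED_COLORS and invalid_color_triples are 
hypothetical and are not SHACL vocabulary; the only point is that the check 
looks at the object value, not at how the graph is put together.

```python
# Hypothetical illustration: triples are plain (subject, predicate, object) tuples.
ALLOWED_COLORS = {"red", "blue", "yellow"}


def invalid_color_triples(triples):
    """Return the ex:color triples whose object is not an allowed value."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if p == "ex:color" and o not in ALLOWED_COLORS
    ]


data = [
    ("ex:car1", "ex:color", "red"),
    ("ex:car2", "ex:color", "green"),   # violates the value rule
    ("ex:car2", "ex:wheels", "4"),      # other predicates are ignored
]
print(invalid_color_triples(data))
```

Only the ex:car2 color triple is reported; nothing about the arrangement of 
nodes and arcs is examined.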

>
>
>>> SHACL constraints are grouped into conjunctions called "shapes",
>> which may also be referenced by constraints in other shapes.
>>
>> *Is "conjunction" the right word here? It doesn't match the
>> grammatical use of this term. Union?
>
> Struck " conjunctions called".
>
>
>>> These constraints restrict the predicates of triples connecting
>> nodes in the graph.
>>
>> *This really confuses me -- how do the constraints restrict the
>> predicates? I mean, we do have min/max that can be applied to
>> predicates, so it's the "these constraints" that doesn't work for me
>> here.
>
> now:
> [[
> These constraints restrict the triples connecting certain nodes in the
> graph.  SHACL can restrict the number of triples with a particular
> predicate and the permitted object datatype or object terms, require
> that the subject or object match some shape or lexical and datatype
> conditions.
> ]]

No idea what this means. First, "constraints" don't do any restricting, 
at least not the SHACL constraints. SHACL is a constraint language that 
defines rules, and some SHACL implementation will do the restricting. 
Again, this leaps from the language to the implementation, skipping over 
the fact that SHACL defines rules. I don't think this leap aids in 
understanding.

Let's look at how SPARQL explains itself:

"... the SPARQL 1.1 Query Language can be used to formulate queries..."

The SPARQL query language does not *query* -- it is a language used to 
formulate queries. Analogously, SHACL is a language used to formulate 
constraints.

Let's start from there and take the introduction:

***was***
SHACL (Shapes Constraint Language) provides structural constraints for 
RDF graphs. SHACL constraints are grouped into conjunctions called 
"shapes", which may also be referenced by constraints in other shapes. 
These constraints restrict the predicates of triples connecting nodes in 
the graph. SHACL can restrict the number of these triples and the 
permitted object datatype or object terms, require that the subject or 
object match some shape or lexical and datatype conditions.

***suggested***
SHACL (Shapes Constraint Language) can be used to formulate constraints 
over RDF graphs. This specification defines the syntax and semantics of 
the SHACL constraint language. SHACL can be used to express constraints 
to be applied to (?better word? imposed on? validated against?) a 
defined set of triples. The focus set of triples is called a "shape." 
The constraint language can define rules over the predicates of the 
specified shape and over object values of those predicates.
***endSuggested***

I don't consider my wording to be perfect by any means, but I think it 
conveys a substantial difference between the constraint language and the 
*action* of an implementation of the constraint language.
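
To make that distinction concrete, here is a rough sketch in Python. The 
PropertyConstraint record only states a rule (the language side); check is 
one possible implementation that acts on it. Both names, and the field 
names, are made up for illustration and are not taken from any SHACL draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyConstraint:
    """A rule stated as data: it describes, it does not act."""
    predicate: str
    min_count: int = 0
    max_count: Optional[int] = None  # None means no upper bound


def check(constraint, focus_node, triples):
    """One possible implementation: count matching triples on the focus node."""
    n = sum(
        1
        for (s, p, _) in triples
        if s == focus_node and p == constraint.predicate
    )
    if n < constraint.min_count:
        return False
    if constraint.max_count is not None and n > constraint.max_count:
        return False
    return True


# Assumed example rule: exactly one ex:color per focus node.
rule = PropertyConstraint(predicate="ex:color", min_count=1, max_count=1)
graph = [
    ("ex:car1", "ex:color", "red"),
    ("ex:car2", "ex:color", "red"),
    ("ex:car2", "ex:color", "blue"),
]
print(check(rule, "ex:car1", graph), check(rule, "ex:car2", graph))
```

The same rule object could be handed to a different checker (a SPARQL-based 
one, say) without changing what the rule says.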

kc

>
>
>>> SHACL can restrict the number of these triples
>>
>> *I don't recall (but may not have read carefully) any discussion of
>> restricting numbers of triples, unless you are referring to min/max?
>
> yes.
>
>
>>> and the permitted object datatype or object terms, require that
>> the subject or object match some shape or lexical and datatype
>> conditions.
>>
>> *these lexical and datatype conditions are what make the "structural
>> constraints" above untrue.
>
> Since datatypes are part of the RDF graph structure, I think that
> "structural" is still better than any proposed alternative.
>
>
>
>> Perhaps if we could develop a good definition of SHACL, other things
>> could flow from it. I think these are the key areas that we need to
>> define:
>>
>> - SHACL defines structures of RDF graphs in terms of focus nodes and
>> member predicates, and values for objects
>
> I suspect we don't want to talk about focus nodes in the abstract, but
> I did add "triples connecting certain nodes". Does that seem like a
> reasonable compromise between gentleness, brevity and accuracy?
>
>
>> - SHACL definitions can be used as constraints for validation of RDF graphs
>> - SHACL provides a closed-world semantics over RDF graphs
>> - ?? more?
>>
>> Next, I think the document needs to define a focus shape[1], and the
>> remainder of the constraints need to be described in relation to a
>> focus shape. For example:
>>
>>> 3.1 Property Constraint eval
>>
>> A property constraint has a predicate which identifies the triple's
>> predicate and may have a minimum cardinality and maximum
>> cardinality, to indicate how many triples with that predicate are
>> expected.
>>
>> *"to indicate how many triples... are expected..." -> within that
>> focus shape?
>
> Hmm, the shape is composed (eventually) of property
> constraints. Really we're trying to say that triples on the focus node
> are matched against algebraic combinations of property constraints in
> a shape. Does that seem like an improvement?
>
>
>> kc
>> [1] We seem to have an idea of where a focus shape starts, but not
>> where it ends. This may relate to Peter's questions about recursion,
>> but I'm not sure.
>>
>>
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> m: 1-510-435-8234
>> skype: kcoylenet/+1-510-984-3600
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Sunday, 15 March 2015 16:11:51 UTC