Re: preliminary analysis of shape expressions proposal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

> Dear Peter, all,
> 
> Thank you for your comments, here are my answers.
> 
> Le 01/05/2015 15:13, Peter F. Patel-Schneider a écrit :
>> 
>> I spent some time yesterday and today looking over the Core SHACL
>> Semantics document http://w3c.github.io/data-shapes/semantics/
>> 
>> This is a new formal basis for a shapes/constraint language.
>> 
>> The language (in Section 2) is quite limited.  It includes closed
>> shape expressions and open shape expressions.  It has disjunction and a
>> construct called grouping.  It has cardinality ranges.  It has values
>> and node kinds and datatyping and facets.  It has inverse properties
>> (which are not handled in the semantics).  That's it.
> 
> Indeed, that's it! Because these few constructs allow to express complex
> constraints on the neighbourhood of a node, we didn't consider it
> necessary to add anything else for now. By the way, the unique difficulty
> in integrating inverse constraints is that it makes the definitions more
> verbose, handling two distinct cases each time. We added explanations how
> it works, to be found in the actual version of the document.

A few more things have been added since my initial analysis, including
inverse triple constraints and a partial treatment of the association of
data with shapes, but there is still no negation, no conjunction, no
keys, no transitive traversal, and no extensibility.  That's quite a bit
of stuff that is not handled.

>> In particular there is no conjunction of shape expressions, and adding
>> this conjunction would break a property of the semantics.
> 
> Adding conjunction would not break anything in the semantics.  We
> already have conjunction of types in triple constraints. It is actually
> very easy to add general conjunction as construct of shape expressions.
> If a user case justifies the need of conjunction (i.e. cannot be handled
> with what is already in the language), then conjunction is a an evolution
> to be considered.

Adding a normal conjunction would break the property that each triple in
Neigh is introduced exactly once in a proof tree.

[Section eliminated. See previous message for my response.]

>> There are some minor problems with the proposal.
>> 
>> The language (Section 2) is very ambiguous.  The productions for 
>> disjunction and grouping need to be modified to require them to be 
>> disjunctions or groupings, respectively.
> 
> I do not understand this remark. Does this relate to the abstract syntax
> ?

Yes it does.  In the abstract syntax a simple shape expression can be
just a shape expression, or it can be a disjunctive shape, or it can be
a group shape, or a disjunctive shape made up of a group shape, or lots
of other possibilities.   There are probably no ill-effects of this
ambiguity.

>> The abstract syntax used in Section 5 is different from that in Section
>> 2.
>> 
>> 
>> There are some broken parts of the proposal.
>> 
>> There is a claim that disjunctive shape expressions are similar to
>> logical disjunction.  However disjunctive shape expressions here end up
>> being quite different from logical disjunction.  See below for more on
>> this.
>> 
>> 
>> There are some serious problems with the proposal
>> 
>> There is no connection between how shapes are associated with data
>> (Section 4) and the main part of the semantics (Section 5).
> 
> Fixed: 2 lines in the end of the document.

That covers only part of Section 4, which itself only covers two simple
ways to associate data with shapes.

>> The semantics in Section 5 does not cover the full language from
>> Section 2. It is missing inverse triple constraints.  Adding inverse
>> triple constraints to the semantics appears to be problematic.
>> 
>> The semantics depends on witness mappings.  The definition of witness 
>> mappings talks about triples appearing in the conclusion of
>> applications of the rule for the empty shape expression.  However,
>> applications of this rule do not include any triples at all.  I think
>> that there is a simple fix---using the rule for triple constraints
>> instead---but that remains to be verified.
> 
> I do not think there is a problem with the empty shape. Actually, I am 
> not sure I understand your remark. In any case, a non-empty set of 
> triples does not satisfy the empty rule.

You are misunderstanding the error that I pointed out.  The document
states:
  Given such proof tree, it can be shown that every triple (n, p, u) in
  Neigh appears in the conclusion of exactly one application of
  rule-empty.
However, rule-empty does not involve triples at all so its applications
will not have any triples in their conclusion.

>> Disjunctive shape expressions
>> 
>> Disjunctive shape expressions work differently from what one might
>> expect.
> 
> Unfortunately, users' expectations are subjective, and differ depending 
> on the user's experience. Our proposal is closer to the intuition of 
> users of structural schemas (like XMLSchema, DTD, ...). By giving precise
> semantics of a language, one allows to users to agree on a shared
> meaning, rather than relying on their personal expectations. Let's focus
> more on the capability of the language to meet the use cases, and less on
> the (necessary subjective) expectations.

Well I would say that disjunction is more closely associated with
standard propositional, predicate, and modal logic, and disjunctive
shapes here work differently than disjunction in these logics.  I agree
that there needs to be a precise semantics to support language features,
but I also believe that the semantics should generally correspond with
the expectations of users.   All I was doing was pointing out that there
is a potential mismatch between user expectations and the semantics of
disjunction here.

> Follows a more technical explanation trying to give a better intuition of
> our disjunction. Stating that disjunction is similar to logical 
> disjunction probably gives a wrong intuition. It would have been more 
> correct to say that disjunction is similar to disjunction in regular 
> languages, or to choice in XMLSchema which is a restriction of the 
> former. The difference with the choice in XMLSchema is that, for defining
> the semantics, we do not require something like the Unique Particle
> Attribution (UPA) of XMLSchema (which makes that our disjunction is not
> an exclusive choice, but real disjunction like in regular languages).
> Restrictions like UPA are useful for efficient validation, but not for
> defining a proper semantics.

I'm not in favour of calling alternation in regular expressions
disjunction, partly because it can lead to this mismatch of
expectations.


What I find most disturbing about | here is how it interacts with open
shapes.  I find the results very unintuitive.  I find the interaction
between open shapes and triple matching similarly unintuitive.

>> For example, there is no valid typing for the shape expression schema S
>> open { p , q } p E [1;1] | q E [1;1] E open { } empty on the graph a p
>> b . a q c . that includes S in the shapes of a.
>> 
>> 
>> Open shape expressions
>> 
>> Open shape expressions work differently from what one might expect.
>> 
>> For example, there is a valid typing for the shape expression schema S
>> open { p } p { b } [1;1] on the graph a p b . a p c . that includes S
>> in the shapes for a.  However, there is no such valid typing for either
>> the shape expression schema S open { p } p { b , c } [1;1] or the shape
>> expression schema S open { p } p E [1;1] E open { } empty
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJVSafxAAoJECjN6+QThfjzDNAIANW6ETWeEYoNciZSFJPE0B5B
GEs8SV6gI0hbd33yFINAGYVOfdeqH7HpjjwpkojWM4GGolCYwf51MU6ry6lnnMQt
lMTlPYK+qv0G5Wx+0QeeXfm/tR3D9slkEaDnYfwLy7YknXjCJKFKauWm5ccOhtRo
SgjfPuOzS25i/cQq+cdZ+OBPlLXiDL9ksW4LrshQtvj2k3z1s5u761KXT3ixnqY5
rLXFHq+emb5VLFhV7v8jXS9vlm4vyqI/+SFySMYKxZ3U6XO7OqcIMSzJ16O6LbGm
I6sPVHv7lKvkiB5Y9gY1w0w2c8jJi+2hAY/nsupF65YKXAXScPuGrMTwnRFJ2wQ=
=nhuP
-----END PGP SIGNATURE-----

Received on Wednesday, 6 May 2015 05:35:13 UTC