Re: preliminary analysis of shape expressions proposal

Hash: SHA256

> Dear Peter, all,
> Thank you for your comments, here are my answers.
> Le 01/05/2015 15:13, Peter F. Patel-Schneider a écrit :
>> I spent some time yesterday and today looking over the Core SHACL
>> Semantics document
>> This is a new formal basis for a shapes/constraint language.
>> The language (in Section 2) is quite limited.  It includes closed
>> shape expressions and open shape expressions.  It has disjunction and a
>> construct called grouping.  It has cardinality ranges.  It has values
>> and node kinds and datatyping and facets.  It has inverse properties
>> (which are not handled in the semantics).  That's it.
> Indeed, that's it! Because these few constructs allow to express complex
> constraints on the neighbourhood of a node, we didn't consider it
> necessary to add anything else for now. By the way, the unique difficulty
> in integrating inverse constraints is that it makes the definitions more
> verbose, handling two distinct cases each time. We added explanations how
> it works, to be found in the actual version of the document.

A few more things have been added since my initial analysis, including
inverse triple constraints and a partial treatment of the association of
data with shapes, but there is still no negation, no conjunction, no
keys, no transitive traversal, and no extensibility.  That's quite a bit
of stuff that is not handled.

>> In particular there is no conjunction of shape expressions, and adding
>> this conjunction would break a property of the semantics.
> Adding conjunction would not break anything in the semantics.  We
> already have conjunction of types in triple constraints. It is actually
> very easy to add general conjunction as construct of shape expressions.
> If a user case justifies the need of conjunction (i.e. cannot be handled
> with what is already in the language), then conjunction is a an evolution
> to be considered.

Adding a normal conjunction would break the property that each triple in
Neigh is introduced exactly once in a proof tree.

[Section eliminated. See previous message for my response.]

>> There are some minor problems with the proposal.
>> The language (Section 2) is very ambiguous.  The productions for 
>> disjunction and grouping need to be modified to require them to be 
>> disjunctions or groupings, respectively.
> I do not understand this remark. Does this relate to the abstract syntax
> ?

Yes it does.  In the abstract syntax a simple shape expression can be
just a shape expression, or it can be a disjunctive shape, or it can be
a group shape, or a disjunctive shape made up of a group shape, or lots
of other possibilities.   There are probably no ill-effects of this

>> The abstract syntax used in Section 5 is different from that in Section
>> 2.
>> There are some broken parts of the proposal.
>> There is a claim that disjunctive shape expressions are similar to
>> logical disjunction.  However disjunctive shape expressions here end up
>> being quite different from logical disjunction.  See below for more on
>> this.
>> There are some serious problems with the proposal
>> There is no connection between how shapes are associated with data
>> (Section 4) and the main part of the semantics (Section 5).
> Fixed: 2 lines in the end of the document.

That covers only part of Section 4, which itself only covers two simple
ways to associate data with shapes.

>> The semantics in Section 5 does not cover the full language from
>> Section 2. It is missing inverse triple constraints.  Adding inverse
>> triple constraints to the semantics appears to be problematic.
>> The semantics depends on witness mappings.  The definition of witness 
>> mappings talks about triples appearing in the conclusion of
>> applications of the rule for the empty shape expression.  However,
>> applications of this rule do not include any triples at all.  I think
>> that there is a simple fix---using the rule for triple constraints
>> instead---but that remains to be verified.
> I do not think there is a problem with the empty shape. Actually, I am 
> not sure I understand your remark. In any case, a non-empty set of 
> triples does not satisfy the empty rule.

You are misunderstanding the error that I pointed out.  The document
  Given such proof tree, it can be shown that every triple (n, p, u) in
  Neigh appears in the conclusion of exactly one application of
However, rule-empty does not involve triples at all so its applications
will not have any triples in their conclusion.

>> Disjunctive shape expressions
>> Disjunctive shape expressions work differently from what one might
>> expect.
> Unfortunately, users' expectations are subjective, and differ depending 
> on the user's experience. Our proposal is closer to the intuition of 
> users of structural schemas (like XMLSchema, DTD, ...). By giving precise
> semantics of a language, one allows to users to agree on a shared
> meaning, rather than relying on their personal expectations. Let's focus
> more on the capability of the language to meet the use cases, and less on
> the (necessary subjective) expectations.

Well I would say that disjunction is more closely associated with
standard propositional, predicate, and modal logic, and disjunctive
shapes here work differently than disjunction in these logics.  I agree
that there needs to be a precise semantics to support language features,
but I also believe that the semantics should generally correspond with
the expectations of users.   All I was doing was pointing out that there
is a potential mismatch between user expectations and the semantics of
disjunction here.

> Follows a more technical explanation trying to give a better intuition of
> our disjunction. Stating that disjunction is similar to logical 
> disjunction probably gives a wrong intuition. It would have been more 
> correct to say that disjunction is similar to disjunction in regular 
> languages, or to choice in XMLSchema which is a restriction of the 
> former. The difference with the choice in XMLSchema is that, for defining
> the semantics, we do not require something like the Unique Particle
> Attribution (UPA) of XMLSchema (which makes that our disjunction is not
> an exclusive choice, but real disjunction like in regular languages).
> Restrictions like UPA are useful for efficient validation, but not for
> defining a proper semantics.

I'm not in favour of calling alternation in regular expressions
disjunction, partly because it can lead to this mismatch of

What I find most disturbing about | here is how it interacts with open
shapes.  I find the results very unintuitive.  I find the interaction
between open shapes and triple matching similarly unintuitive.

>> For example, there is no valid typing for the shape expression schema S
>> open { p , q } p E [1;1] | q E [1;1] E open { } empty on the graph a p
>> b . a q c . that includes S in the shapes of a.
>> Open shape expressions
>> Open shape expressions work differently from what one might expect.
>> For example, there is a valid typing for the shape expression schema S
>> open { p } p { b } [1;1] on the graph a p b . a p c . that includes S
>> in the shapes for a.  However, there is no such valid typing for either
>> the shape expression schema S open { p } p { b , c } [1;1] or the shape
>> expression schema S open { p } p E [1;1] E open { } empty
Version: GnuPG v2


Received on Wednesday, 6 May 2015 05:35:13 UTC