Re: recursive shapes in shape expressions from Iovka Boneva on 2015-02-25 (public-data-shapes-wg@w3.org from February 2015)

From: Iovka Boneva <iovka.boneva@univ-lille1.fr>
Date: Wed, 25 Feb 2015 10:39:56 +0100
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <54ED986C.3040201@univ-lille1.fr>
Hi,

Thank you Peter for carefully considering our semantics.

Here are comment on your last questions, some of the questions were 
answered in my previous mail.


Le 24/02/2015 18:24, Peter F. Patel-Schneider a écrit :
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/23/2015 07:25 AM, Iovka Boneva wrote:
>> In our ICDT 2015 paper we have defined fully declarative semantics for
>> Shape Expressions, in which recursion is well defined, and does not
>> represent any particular difficulty regarding semantics.
>> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
>>
>> To say it briefly, in our semantics, a 'valid typing' consists in
>> associating a set of shapes with every node, s.t. the constraints of all
>> the shapes associated with all nodes are satisfied.
>>
>> There may exist several valid typings. It is unavoidable, since shapes
>> have been thought for validating starting from some selected root nodes,
>> and it is obvious that depending on the root nodes and their required
>> types, the final typings can differ.
> I guess that you mean that if the typing is forced for some node, then the
> typings of other nodes may be affected.
Indeed.
>> We however show that there exists a *unique maximal valid typing*, that
>> associates to every node as many shapes as possible. Uniqueness is very
>> interesting property, as it allows to define the /coverage/ of a shape
>> in a sound manner.
> I didn't see a definition of coverage in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> I'm not even sure what coverage would mean.
You are right, this definition is not in the paper. Unfortunately in a 
conference paper we do not have place to develop all the ideas, but 
these are thinks we thought of.

The word coverage was used during the F2F meeting last week. Coverage 
could be defined like this: Given a graph G, a set of root nodes with 
required shapes for these roots, and a shape schema S (set of shapes), 
the /coverage/ is the subgraph G' of G being visited while validating G 
with S starting with a typing that associates to the roots their 
required shapes. This is intuitively the part of the graph that could 
ever be visited by an application 'guided' by S, i.e. an application 
that only knows how to treat the shapes defined in S. Computing the 
coverage appears as a user requirement by Harold, and other also 
mentioned the possibility to actually compute and materialize G'.

On my opinion, it is necessary for G' to be well defined, and to be 
unique (for instance, not depending on some order of validation, etc.) 
This is what our semantics allows. Considering using deterministic 
shapes and the universal type (thus open shapes) as defined on p. 23, G' 
is the sub-graph of G induced by the nodes that have a type different 
from the universal type (after computing with Algorithm 1 p. 24). By 
Lemma 8.6, G' is unique. Note that determinism is required here 
(otherwise impossible to ensure uniqueness). Note also that the notion 
of determinism can be weakened, and we already have ideas for that, but 
not developed in the paper because of lack of space.

>
>> Note that there does not exist a *unique* minimal valid typing, but some
>> minimal valid typing can be obtained in some situations, that we
>> identify.
>> Here are the valid shapes for your examples, with our semantics.
>>
>> Le sam. 21 févr. 2015 23:27:52 CET, Peter F. Patel-Schneider a écrit :
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>
>>> Shape Expressions includes recursive shapes, such as <IssueShape> {
>>> :status (:Assigned :Unassigned), ... , :related @<IssueShape>* }
>>> Recursive shapes are a prominent feature of Shape Expressions, showing
>>> up in the introductory material and talks about Shape Expressions.
>>> However, there appears to be problems in the semantics of recursive
>>> shapes in Shape Expressions. If these problems cannot be fixed, then
>>> recursive shapes may have to be removed from Shape Expressions.
>>>
>>> Sometimes it is relatively obvious what nodes are have such recursive
>>> shapes. For example, in ex:i1 :related ex:i2 . ex:i1 :related ex:i3 .
>>> ex:i1 :status :Assigned . ex:i2 :status :Unassigned . ex:i3 :status
>>> :Unassigned . all of ex:i1, ex:i2, and ex:i3 should have shape
>>> <IssueShape>.
>>>
>>> Do ex:j1, ex:i2, and ex:i3 have shape <IssueShape> here in Shape
>>> Expressions? Show how this can be determined in each of the semantics
>>> for Shape Expressions.
>>>
>> There is a valid typing that associates {<IssueShape>} to ex:i1, ex:i2
>> and
> ex:i3.
>>> However, sometimes it is not obvious what nodes have a particular
>>> recursive shape. For example, in ex:j1 :related ex:j1 . ex:j1 :status
>>> :Assigned . it is reasonable to consider ex:j1 as having shape
>>> <IssueShape> but it is also reasonable to consider ex:j1 as not having
>>> shape <IssueShape>.
>>> Does ex:j1 have shape <IssueShape> here in Shape Expressions? Show how
>>> this can be determined in each of the semantics for Shape Expressions.
>> There is a valid typing that associates {<IssueShape>} with ex:j1.
>> Indeed, the constraints of <IssueShape> are satisfied in this node. As a
>> side remark, I do not see why it is reasonable to consider that ex:j1
>> should not have <IssueShape>
> Your semantics shows why this is reasonable.  In your semantics it is be
> permissable to have ex:j1 not typed with <IssueShape> if there is another
> type to which ex:j1 could belong.
I'm still not sure to understand, but let me comment, maybe I answer the 
question.

  With the unique maximal type, we associate to every node all the types 
that it can have. In this particular case, ex:j1 satisfies the 
constraints from <IssueShape>, so it will have that type in the maximal 
typing.
If we now consider starting with pre-typing of root nodes, then the aim 
is to ensure that the roots satisfy the required shapes, and in that 
case it might not be necessary to associate <IssueShape> with ex:j1.

Recall that typing is not unique, so indeed we can have valid typings in 
which ex:j1 has <IssueShape> and valid typings in which it has not. 
However, there is a unique maximal typing, and a unique minimal typing 
if a pre-typing is given. And that's really a nice property !

>>> There are ever trickier situations involving recursive shapes. For
>>> example, consider the two mutually recursive shapes <S> { ( ex:p @<S>
>>> | ex:p @<T> ) } <T> { ( ex:p @<T> | ex:p @<S> ) } (something is in <S>
>>> if it has an ex:p fillers and either all its ex:p fillers are in <S>
>>> or its ex:p fillers are in <T> but not both, and similarly for <T>). It
>>> is unclear as to which nodes should have shape <S> or <T> in the graph
>>>   ex:a ex:p ex:b . ex:b ex:p ex:a .
>>>
>> With our semantics and considering /closed/ shapes,
> As far as I can tell, the semantics in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> only handles closed shapes.  Is there a treatment of open shapes with this
> semantics?
See previous mail, p.23 of the long version http://arxiv.org/pdf/1404.1270
or also p. 12 of the short version 
http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
>
>> the shape
>>
>> <S> { ( ex:p @<S> | ex:p @<T> ) }
>>
>> intuitively says that a <S> typed node should have exactly one outgoing
>> ex:p, that leads either to <S> or to <T> node. This however does not
>> forbid the target node to have both <S> and <T>; we exclude any form of
>> negative constraints.
> This is the semantics in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> The other semantics for Shape Expressions appear to be different.
>
>> There are 9 valid typings, associating with ex:a and ex:b, in this order,
>> the sets of shapes: {<S>} {<S>} {<S>} {<T>} {<S>} {<S>,<T>} {<T>} {<S>}
>> {<T>} {<T>} etc. actually all combinations in the Cartesian product {
>> {<S>}, {<T>}, {<S>,<T>} } X { {<S>}, {<T>}, {<S>,<T>} } The unique
>> maximal typing is <S>,<T> for both ex:a and ex:b.
>>
>>
>> Considering /open/ shapes, the shape <S> { ( ex:p @<S> | ex:p @<T> ) }
>> intuitively says that a <S> typed node should have one outgoing ex:p that
>> leads to <S> or to <P>, and any number of any other outgoing edges
>> (/open/). The  the maximal typing is the same as above.
> Again, where is the treatment of open shapes with the semantics in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
Universal types, see previous mail.
>
>>> Which nodes have which shapes here? Show how this can be determined in
>>> each of the semantics for Shape Expressions. Are the answers
>>> independent of changes in the ordering of the disjuncts? Do the
>>> answers depend on any ordering of the sets involved? Does this match
>>> any intuitive notion of recursive shapes?
>> Yes, in our semantics, the maximal typing is independent on the ordering
>> of the disjuncts. Non maximal typings depend on the order of evaluation,
>> and the problem of non uniqueness occurs even without recursion.
>>
>> I do not know what is your intuition for recursive shapes, but our
>> intuition actually does not consider recursion at all. We considered
>> that a shape defines *local* constraints, and these constraints might
>> envolve that the neighbours of a node satisfy other shapes, that might
>> eventually be the same shape as the one being defined.  We then have
>> recursion while evaluating, but recursion does not influence the
>> declarative semantics.
> [...]
>
>> Thanks for reading this long explanation. Hope it helps. I would be glad
>>   to answer if you need to discuss on our semantics.
> Well, I think that I do understand the semantics in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> at least with respect to closed types.
>
> I'm not sure that the extension of the language in
> http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> to the language in the W3C Shape Expressions submission is easy.
I think it is completely doable, possibly with very reasonable 
compromises on both sides.

Iovka
Received on Wednesday, 25 February 2015 09:40:49 UTC