Re: recursive shapes in shape expressions from Iovka Boneva on 2015-02-23 (public-data-shapes-wg@w3.org from February 2015)

From: Iovka Boneva <iovka.boneva@univ-lille1.fr>
Date: Mon, 23 Feb 2015 16:25:07 +0100
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <54EB4653.7050406@univ-lille1.fr>
In our ICDT 2015 paper we have defined fully declarative semantics for 
Shape Expressions, in which recursion is well defined, and does not 
represent any particular difficulty regarding semantics.
http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf

To say it briefly, in our semantics, a 'valid typing' consists in 
associating a set of shapes with every node, s.t. the constraints of all 
the shapes associated with all nodes are satisfied.

There may exist several valid typings. It is unavoidable, since shapes 
have been thought for validating starting from some selected root nodes, 
and it is obvious that depending on the root nodes and their required 
types, the final typings can differ.

We however show that there exists a *unique maximal valid typing*, that 
associates to every node as many shapes as possible. Uniqueness is very 
interesting property, as it allows to define the /coverage/ of a shape 
in a sound manner. Note that there does not exist a *unique* minimal 
valid typing, but some minimal valid typing can be obtained in some 
situations, that we identify.

Here are the valid shapes for your examples, with our semantics.

Le sam. 21 févr. 2015 23:27:52 CET, Peter F. Patel-Schneider a écrit :
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Shape Expressions includes recursive shapes, such as
> <IssueShape> {
> :status (:Assigned :Unassigned),
> ... ,
> :related @<IssueShape>*
> }
> Recursive shapes are a prominent feature of Shape Expressions, showing 
> up in
> the introductory material and talks about Shape Expressions. However, 
> there
> appears to be problems in the semantics of recursive shapes in Shape
> Expressions. If these problems cannot be fixed, then recursive shapes may
> have to be removed from Shape Expressions.
>
> Sometimes it is relatively obvious what nodes are have such recursive
> shapes. For example, in
> ex:i1 :related ex:i2 .
> ex:i1 :related ex:i3 .
> ex:i1 :status :Assigned .
> ex:i2 :status :Unassigned .
> ex:i3 :status :Unassigned .
> all of ex:i1, ex:i2, and ex:i3 should have shape <IssueShape>.
>
> Do ex:j1, ex:i2, and ex:i3 have shape <IssueShape> here in Shape
> Expressions? Show how this can be determined in each of the semantics for
> Shape Expressions.
>

There is a valid typing that associates {<IssueShape>} to ex:i1, ex:i2 
and ex:i3.

> However, sometimes it is not obvious what nodes have a particular
> recursive shape. For example, in
> ex:j1 :related ex:j1 .
> ex:j1 :status :Assigned .
> it is reasonable to consider ex:j1 as having shape <IssueShape> but it is
> also reasonable to consider ex:j1 as not having shape <IssueShape>.

> Does ex:j1 have shape <IssueShape> here in Shape Expressions? Show how 
> this
> can be determined in each of the semantics for Shape Expressions.
>
There is a valid typing that associates {<IssueShape>} with ex:j1. 
Indeed, the constraints of <IssueShape> are satisfied in this node.
As a side remark, I do not see why it is reasonable to consider that 
ex:j1 should not have <IssueShape>


> There are ever trickier situations involving recursive shapes. For 
> example,
> consider the two mutually recursive shapes
> <S> { ( ex:p @<S> | ex:p @<T> ) }
> <T> { ( ex:p @<T> | ex:p @<S> ) }
> (something is in <S> if it has an ex:p fillers and either all its ex:p
> fillers are in <S> or its ex:p fillers are in <T> but not both, and
> similarly for <T>). It is unclear as to which nodes should have shape <S>
> or <T> in the graph
> ex:a ex:p ex:b .
> ex:b ex:p ex:a .
>

With our semantics and considering /closed/ shapes, the shape

<S> { ( ex:p @<S> | ex:p @<T> ) }

intuitively says that a <S> typed node should have exactly one outgoing 
ex:p, that leads either to <S> or to <T> node. This however does not 
forbid the target node to have both <S> and <T>; we exclude any form of 
negative constraints.

There are 9 valid typings, associating with ex:a and ex:b, in this 
order, the sets of shapes:
{<S>} {<S>}
{<S>} {<T>}
{<S>} {<S>,<T>}
{<T>} {<S>}
{<T>} {<T>}
etc.
actually all combinations in the Cartesian product { {<S>}, {<T>}, 
{<S>,<T>} } X { {<S>}, {<T>}, {<S>,<T>} }
The unique maximal typing is <S>,<T> for both ex:a and ex:b.


Considering /open/ shapes, the shape
<S> { ( ex:p @<S> | ex:p @<T> ) }
intuitively says that a <S> typed node should have one outgoing ex:p 
that leads to <S> or to <P>, and any number of any other outgoing edges 
(/open/). The  the maximal typing is the same as above.

> Which nodes have which shapes here? Show how this can be determined in 
> each
> of the semantics for Shape Expressions. Are the answers independent of
> changes in the ordering of the disjuncts? Do the answers depend on any
> ordering of the sets involved? Does this match any intuitive notion of
> recursive shapes?
>
Yes, in our semantics, the maximal typing is independent on the ordering 
of the disjuncts. Non maximal typings depend on the order of evaluation, 
and the problem of non uniqueness occurs even without recursion.

I do not know what is your intuition for recursive shapes, but our 
intuition actually does not consider recursion at all. We considered 
that a shape defines *local* constraints, and these constraints might 
envolve that the neighbours of a node satisfy other shapes, that might 
eventually be the same shape as the one being defined. We then have 
recursion while evaluating, but recursion does not influence the 
declarative semantics.

> Here are another couple of situations that should be considered as well.
>
> <A> { ( ex:p1 <B> | ex:p2 <B> ) }
> <B> { ex:q <B> }
> ex:a ex:p1 ex:b1 .
> ex:a ex:p2 ex:b2 .
> ex:b1 ex:q ex:b2 .
> ex:b2 ex:q ex:b1 .
>
With our semantics and /closed/ shapes, ex:a cannot be typed with <A>, 
because <A> requires one among ex:p1 and ex:p2 outgoing edge, but not 
both. With /open/ shapes, the maximal valid typing is
ex:a type {<A>}
ex:b1 and ex:b2 type {<B>}

> <A> { ( ex:p1 <B> | ex:p2 <B> ) }
> <B> { ( ex:q <B> | ex:r { ex:c } ) }
> ex:a ex:p1 ex:b1 .
> ex:a ex:p2 ex:b2 .
> ex:b1 ex:q ex:b2 .
> ex:b1 ex:r ex:c .
> ex:b2 ex:q ex:b1 .
> ex:b2 ex:r ex:c .
>

Again with /closed/ shapes, no valid typing. With /open/ shapes, the 
same typing as above.

Thanks for reading this long explanation. Hope it helps. I would be glad 
to answer if you need to discuss on our semantics.

Iovka





-- 
Iovka Boneva
Associate professor (MdC) Université de Lille
http://www.cristal.univ-lille.fr/~boneva/
+33 6 95 75 70 25
Please note that I read my mails twice a day at 9:00 and 13:00 (CET)
For urgent matters, please contact me by phone
Received on Monday, 23 February 2015 15:29:30 UTC