Re: differing accounts of Shape Expressions

Peter, do you have a suggested solution? - kc

On 2/25/15 10:04 AM, Peter F. Patel-Schneider wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> There are three quite different accounts of Shape Expressions available off
> of http://www.w3.org/2001/sw/wiki/ShEx
>
>
> There is a primer and denotational semantics which appear to be early
> versions of the W3C submission http://www.w3.org/Submission/2014/03/, with
> the Primer by Eric Prud'hommeaux at www.w3.org/Submission/shex-primer/ and
> the definition by Harold Solbrig and Eric Prud'hommeaux at
> www.w3.org/Submission/shex-defn/.  There is also
> https://github.com/w3c/ShEx/blob/master/ShExZ/ShExZ.pdf?raw=true which is
> close to the definition in the submission.  Let's call this SE1.
>
>
> There is "Validating RDF with Shape Expressions" by Iovka Boneva, Jose
> Emilio Labra Gayo, Samuel Hym, Eric G. Prud'hommeau, Harold Solbrig, Sławek
> Staworko at arxiv.org/abs/1404.1270.  A paper with the same treatment is
> "Complexity and Expressiveness of ShEx for RDF" by Sławek Staworko , Iovka
> Boneva, Jose E. Labra Gayo, Samuel Hym, Eric G. Prud’hommeaux, and Harold
> Solbrig at www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf
> Let's call these SE2.
>
> There is the ShEx/Operational Semantics at
> http://www.w3.org/2001/sw/wiki/ShEx/OperationalSemantics_Operational_semantics
> This appears similar in spirit to an axiomatic semantics for SHACL prepared
> by Jose Emilio Labra Gayo.  Let's call this SE3.
>
>
> SE1:
>
> The syntax of SE1 uses rules like:
> <IssueShape> {
>      ex:state (ex:unassigned ex:assigned),
>      ex:reportedBy @<UserShape>,
>      ex:reportedOn xsd:dateTime,
>      ( ex:reproducedBy @<EmployeeShape>,
>        ex:reproducedOn xsd:dateTime      )?,
>      ex:related @<IssueShape>*
> }
>
> SE1 works on RDF graphs.  SE1 has conjunction, counting, cyclicity,
> exclusive disjunction, and a number of atomic constructs that involve node
> labels.
>
> The semantics of SE1 is given in an extension of Z.  The semantics is based
> on judgements of how well a node in an RDF graph matches a shape in a set of
> SE1 rules.  The semantics of SE1 is inherently an open semantics, i.e.,
> adding irrelevant edges to a graph does not affect results.  A closed
> semantics for SE1 woulld require a complete rewrite, as there is no
> determination of relevancy in the semantics.  The semantics of SE1 requires
> negative judgements, to handle the exclusive disjunction.
>
> SE1 is problematic because it uses an ill-founded extension to Z.  The
> meaning of cyclic references in rules cannot be determined in many
> situations, even if the cycles are all positive.
>
>
> SE2:
>
> SE2 works on labelled-edge graphs, not RDF graphs.  The language of SE2 uses
> rules like:
>     t0 -> a::t1 || (b::t2)[2;4]
> SE1 has disjunction, unordered concatenation, and cyclicity.  The language
> of SE2 is missing many of the features of SE1.
>
> The semantics of SE2 is in part an algebraic semantics, building up bag
> languages.  The other part of the semantics of SE2 is based on assignments.
> The semantics of SE2 is a closed semantics, i.e., all outgoing edges from a
> node have to be matched.  A syntactic transform can be used to open up the
> semantics, adding in universal types to soak up extra edges.
>
> SE2 is problematic because its language is missing many of the features in
> SE1.
>
>
> SE3:
>
> The semantics in SE3 is an axiomatic semantics.  It works on a version of
> RDF graphs.  It appears to be non-additive, in that the matched portions of a
> graph are unioned together, but the account is missing many details.
> Apparently adding the missing details results in a semantics that is
> non-additive, i.e., conjoining something with itself requires two copies of
> whatever is matched by the conjunct.  The semantics also appears to be
> closed, i.e., every edge in the graph has to be used.
>
> SE3 is problematic because there are many missing portions of the
> axiomatization.
>
>
>
> Differences between SE1 and SE2:
>
> SE1 and SE2 are very different, even aside from the differences in surface
> syntax.
>
> SE1 works on RDF graphs; SE2 works on edge-labelled graphs, which don't have
> node labels.
>
> SE1 has conjunction; in
>    ex:a ex:p ex:v .
> with
>    <S0> { ex:p (ex:v) , ex:p (ex:v) }
> ex:a has type <S0>
> SE2 has unordered concatenation, which is not conjunction; in
>    { < a p b > }
> with
>     t0 -> p::e || p::e
>     e ->
> a cannot have type t0.
>
> The basic edge construct in SE1 applies to all the outgoing edges for a
> particular property; in
>    ex:a ex:p ex:v .
>    ex:a ex:p ex:w .
>    ex:a ex:p ex:x .
> with
>    <S0> { ex:p (ex:v ex:w)>{2,3} }
> ex:a does not have type <S0>.
> The basic edge construct in SE2 does not need to apply to all of the
> outgoing edges for a particular property; in
>    { < a p b >, <b q x>, <a p c>, <c r y> }
> with
>    t0 -> (p::t1)+ || (p::t2)+
>    t1 -> q::e
>    t2 -> r::e
>    e ->
> a has type t0.
>
> SE1 can require that nodes have multiple types; in
>    ex:a ex:p ex:c .
>    ex:c ex:q ex:d .
>    ex:c ex:r ex:e .
> with
>    <S0> { ex:p <S1>, ex:p <S2> }
>    <S1> { ex:q ( ex:d ) }
>    <S2> { ex:r ( ex:e ) }
> ex:a having type <S0> requires that ex:c has both type <S1> and <S2>.
>
>
>
>
> Where does SE3 fit:
>
> SE3 appears to be an axiomatic semantics for the language of SE1 but it is
> hard to figure out just what is going on.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iQEcBAEBAgAGBQJU7g60AAoJECjN6+QThfjzxBAH/1/CpTLE/ZCIbs/J+pdf4DA6
> +9tjYRjK+IXtJzSSu8gBilzpaCxlUo7/bP1iwsIxCUQ5EZ+V3vMDYgegivmPfzeb
> +rKx0S3dv30iBKpgmDk0yagErL+/vz//h6y51aG0Uwi/S0TYyB+BuwRuiad5+HXk
> 96x9E+L6d1tRKSYnzOeP7fiKqz1BaZ8yASiyYzPWcc+7rN457eofACXhu3anQeuz
> 4lHMILUCLw3IFfhJSFFNYYFgeam8jTSl5XRbkLMCMi4Ergle5sd1VKNCY0/muHCg
> zQMBXseISexrz83XDHQhty7Ju7x7UY5p8sTqrYkRLBDw76CFy6ocEsSSqMeuq0Q=
> =gafT
> -----END PGP SIGNATURE-----
>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600

Received on Wednesday, 25 February 2015 18:44:20 UTC