Inheritance and inclusion of types from Sławek Staworko on 2014-02-10 (public-rdf-shapes@w3.org from February 2014)

From: Sławek Staworko <slawomir.staworko@inria.fr>
Date: Mon, 10 Feb 2014 18:39:12 +0100
To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
Message-Id: <EB9781FB-22AF-4CBF-9983-BFA633570501@inria.fr>
Hi guys,

With Samuel (and Iovka), we’ve been working on some clear and intuitive notions for type inheritance and inclusion. 

First of all, we believe that since we want to use OO mechanisms (inheritance), it is important to respect certain facts that we came to expect on a very intuitive level about polymorphic types. For instance, if type <C> extends type <A>, then any object that is of type <C> should also be of type <A>.

1) We propose two kinds of types, open and closed. Closed types define precisely the contents of a node, while open types allow additional contents. To make a type open, you add “..” (two dots) at the end of the definition, an intuitive notation borrowed from OCaml’s row types. For example:

<A> {
   :a xsd:string
   ..
}

<B> {
   :a xsd:string
}

<A> is an open type: any node that has a property “a”, and possibly other properties (other than “a”), is of type <A>. On the other hand, <B> is a closed type: only a node having exactly one property “a” and no other property will match its definition.
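
To make the difference concrete, here is a rough Python sketch (not part of the proposal itself). It assumes a node is a dict from property name to a list of values, ignores datatypes such as xsd:string, and requires each listed property exactly once. The names `matches`, `A`, `B`, `n1` and `n2` are purely illustrative.

```python
# Hypothetical model: a node maps each property name to a list of values.
# A type is a set of properties that must each occur exactly once,
# plus a flag saying whether the type is open ("..") or closed.

def matches(node, required, is_open):
    """Return True if `node` satisfies the type described by `required`."""
    # Every listed property must be present with exactly one value.
    for prop in required:
        if len(node.get(prop, [])) != 1:
            return False
    # A closed type forbids any property that is not listed.
    extra = set(node) - set(required)
    return is_open or not extra

A = ({"a"}, True)    # <A> { :a xsd:string .. }
B = ({"a"}, False)   # <B> { :a xsd:string }

n1 = {"a": ["x"]}
n2 = {"a": ["x"], "c": ["y"]}

print(matches(n1, *A), matches(n2, *A))  # both nodes are tested against the open type
print(matches(n1, *B), matches(n2, *B))  # and against the closed type
```

In this model, n2 is accepted by <A> but rejected by <B>, because of its extra property “c”.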

2) There are two mechanisms for using one (or more) types as the basis of another type: inheritance and inclusion. 

2.a) Inheritance. Only open types can be used as the base for inheritance. The newly defined type may be closed or open.
<C> & <A> {
   :c xsd:string
}

<C> is a closed type that matches any node with two properties “a” and “c” only. 
Note: The openness requirement ensures that any node having type <C> is also of type <A>. On the other hand, extending a closed type would lead to a type system that does not follow our intuitions: a node with properties “a” and “c” would match an extension of <B> but not <B> itself.

2.b) Inclusion. Only closed types can be used as the base for inclusion. The newly defined type may be closed or open. 

<D> {
  &<B>,
  :d xsd:string
}

<D> is a closed type that matches any node having exactly the properties “a” and “d”.

Note that inclusion of an open type is essentially equivalent to inheritance:
<E> {
   &<A>,
   :e xsd:string
}

is the same as
 
<E> & <A> {
   :e xsd:string
}

Therefore, it does not need to be prohibited but simply sniffed at… and a special sophisticated cleanup tool, ShLintEx®, will replace such confused definitions with correct ones ;)

2.c) Single-occurrence Shape Expressions. To keep validation with Shape Expressions unambiguous (and to keep its complexity manageable), we insist on allowing single-occurrence expressions only, i.e., expressions in which no property name is used more than once. For instance,
<F> {
  :a xsd:string*,
  :b xsd:string
}

<G> {
   :a xsd:string,
   :a xsd:int
}

<F> is single-occurrence but <G> is not because the property “a” is used twice. 
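
A check for this property is easy to sketch. The Python below is only an illustration, assuming a Shape Expression is flattened into a list of (property, datatype, cardinality) triples. The function names and that representation are our own invention, not part of any Shape Expressions tool.

```python
from collections import Counter

def repeated_properties(shape):
    """Return the sorted property names that occur more than once in `shape`."""
    counts = Counter(prop for prop, _datatype, _cardinality in shape)
    return sorted(p for p, n in counts.items() if n > 1)

def is_single_occurrence(shape):
    """A shape is single-occurrence if no property name is repeated."""
    return not repeated_properties(shape)

# <F> and <G> from above, as hypothetical triple lists.
F = [(":a", "xsd:string", "*"), (":b", "xsd:string", "")]
G = [(":a", "xsd:string", ""), (":a", "xsd:int", "")]

print(is_single_occurrence(F), is_single_occurrence(G))
print(repeated_properties(G))
```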

While in a number of cases it may still be possible to validate efficiently with Shape Expressions that are not single-occurrence, once we consider actions, the inherent ambiguity becomes troublesome. For instance,

<H> {
    :a @<A> @lang{boom($node)},
    :a @<A> @lang{bam($node)}
}

When validating a node with two properties “a” against the type <H>, it is not clear which action should be executed on which node.

The single-occurrence requirement has a number of ramifications for types that are constructed with inheritance and inclusion: the newly constructed types cannot employ properties used on their path in the inheritance-inclusion graph. The main reason is unambiguity of parsing. Additionally, such repetition could be confused for an attempt at specialisation:

<I> {
   :a xsd:string?
   ..
}

<J> & <I> {
   :a xsd:string
}

What is the meaning of type <J>? Is exactly one property “a” allowed, or rather one or two properties “a”?
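
A tool could reject definitions such as <J> by collecting the properties declared along the inheritance-inclusion path and looking for reuse. The following Python is a minimal sketch under our own assumptions: `own` maps each type name to the properties it declares directly, `bases` maps a type to the types it inherits from or includes, and both dicts as well as the function names are hypothetical.

```python
def inherited_properties(name, bases, own, seen=None):
    """Collect the properties declared by every ancestor of `name`."""
    seen = set() if seen is None else seen
    props = set()
    for base in bases.get(name, []):
        if base in seen:  # do not visit a shared ancestor twice
            continue
        seen.add(base)
        props |= own[base] | inherited_properties(base, bases, own, seen)
    return props

def clashes(name, bases, own):
    """Return the properties that `name` declares again after an ancestor did."""
    return sorted(own[name] & inherited_properties(name, bases, own))

# The types <A>, <C>, <I> and <J> from this message.
own = {"A": {":a"}, "C": {":c"}, "I": {":a"}, "J": {":a"}}
bases = {"C": ["A"], "J": ["I"]}

print(clashes("C", bases, own))  # <C> adds a new property only
print(clashes("J", bases, own))  # <J> declares ":a" again
```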

In a view consistent with the single-occurrence requirement, the “..” can be seen as a wildcard that matches all property names except for those used in the Shape Expression. Consequently, if we go back to the definition

<A> {
   :a xsd:string
   ..
}

then a node having two properties “a” does not satisfy the type <A>. 

3) Unambiguous Shape Expressions. The single-occurrence requirement seems, however, too strict. For instance, the expression

<K> {
  ((:a xsd:string,
    :b xsd:string*,
    :c xsd:string) @lang{boom($nodes)} 
  |
   (:a xsd:string+,
    :d xsd:string) @lang{bam($nodes)}
  )
}

is not single-occurrence, but disambiguating it is not problematic: the presence (and absence) of properties “c” and “d” makes it possible to choose the right branch of execution. We are currently working on formalising a simple requirement that implies unambiguity and generalises single-occurrence.
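
The choice of branch for <K> can be sketched in a few lines of Python. This only illustrates the disambiguation step, assuming a node is a dict from property name to a list of values. It does not validate cardinalities or datatypes, and `choose_branch` is a made-up name, not the general requirement we are formalising.

```python
def choose_branch(node):
    """Pick the branch of <K> for `node`, or None if neither branch can apply."""
    has_c = ":c" in node
    has_d = ":d" in node
    if has_c and not has_d:
        return "boom"  # first branch: :a, :b*, :c
    if has_d and not has_c:
        return "bam"   # second branch: :a+, :d
    return None        # both or neither present: no branch matches

print(choose_branch({":a": ["x"], ":b": ["y"], ":c": ["z"]}))
print(choose_branch({":a": ["x", "y"], ":d": ["z"]}))
```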

Please let us know what you think about it.

Best,

Sławek and Sam (and Iovka)
Received on Tuesday, 11 February 2014 08:06:05 UTC