CONSTRAINTS: A proposal for RDF Data Shapes

[Just to make it clear that CONSTRAINTS is a proposal, this is an extended
version of the initial message with an explicit subject.]


CONSTRAINTS (Class-based Or Neutral SpecificaTion of Rdf And lINked daTa
Shapes) defines shapes and controls how shapes (local and global) are
evaluated.  In CONSTRAINTS, a node in an RDF graph (by default the default
graph) in an RDF dataset satisfies or violates a local shape.  In
CONSTRAINTS, an RDF graph in an RDF dataset satisfies or violates a global
shape.  These are the only two possibilities that can result from such
evaluations---either the shape is satisfied or it is violated.  It is not
possible for a particular shape to be neither satisfied nor violated or both
satisfied and violated.

NOTE:  A violation may return other information to be used in reporting the
error to downstream components or to users.  Mechanisms for this reporting
are not yet part of this proposal.

A CONSTRAINTS document (which would normally be in the form of an RDF graph)
contains global shapes and local shapes as well as control information.  In
an RDF encoding of CONSTRAINTS, global shapes are written as
  <globalshape> rdf:type constraints:GlobalShape .
Control information takes the form of
1/ node-based CONSTRAINTS links that associate IRIs with local shapes, in an
RDF encoding as
  <IRI> constraints:nodeShape <localshape> .
2/ class-based CONSTRAINTS links that associate non-datatype RDFS classes
with local shapes, in an RDF encoding as
  <IRI> constraints:classShape <localshape> .
and
3/ inter-shape CONSTRAINTS links that associate two shapes---a scope local
shape and a validation local shape, in an RDF encoding as
  <localshape> constraints:shape <localshape> .

A CONSTRAINTS document is satisfied by an RDF graph in an RDF dataset
precisely when
1/ all the global shapes in the document are satisfied by the RDF graph,
2/ for all node-based CONSTRAINTS links the IRI is a node in the
   graph and that node satisfies the local shape,
3/ for all class-based CONSTRAINTS links all nodes that are RDFS instances
   of the class (technically, all nodes for which their membership in the
   class is an RDFS entailment of the graph) also satisfy the local
   shape over the graph, and
4/ for all inter-shape links all nodes that satisfy the scope shape over the
   graph also satisfy the validation shape over the graph.
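
The four conditions above can be sketched in a few lines of Python.  This
is only an illustration under assumptions that are not part of the
proposal: a graph is a set of (subject, predicate, object) string triples,
shapes are plain Python callables, the document is a dict with the
hypothetical keys "global", "node", "class" and "inter", and RDFS instance
entailment is approximated by rdf:type plus the rdfs:subClassOf closure.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def nodes(graph):
    """All subjects and objects of the graph."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}

def instances(cls, graph):
    """Simplified stand-in for RDFS instance entailment (assumption):
    explicit rdf:type links plus the rdfs:subClassOf closure."""
    classes = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUBCLASS and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return {s for s, p, o in graph if p == RDF_TYPE and o in classes}

def satisfies(document, graph):
    """True when all four conditions hold for the graph."""
    all_nodes = nodes(graph)
    return (
        # 1/ every global shape is satisfied by the graph
        all(shape(graph) for shape in document["global"])
        # 2/ each linked IRI is a node and satisfies its local shape
        and all(iri in all_nodes and shape(iri, graph)
                for iri, shape in document["node"])
        # 3/ every instance of a linked class satisfies the local shape
        and all(shape(n, graph)
                for cls, shape in document["class"]
                for n in instances(cls, graph))
        # 4/ every node satisfying the scope satisfies the validation shape
        and all(validation(n, graph)
                for scope, validation in document["inter"]
                for n in all_nodes if scope(n, graph))
    )

def has_name(node, graph):
    """Hypothetical local shape: the node has at least one ex:name."""
    return any(s == node and p == "ex:name" for s, p, o in graph)

data = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
}
doc = {
    "global": [],
    "node": [("ex:alice", has_name)],
    "class": [("ex:Person", has_name)],
    "inter": [],
}
```

With this data, satisfies(doc, data) is false because ex:bob is an instance
of ex:Person without an ex:name; adding such a triple makes it true.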

NOTE: If there is a way of constructing a shape that is satisfied precisely
by a given node then node-based control can be transformed into
inter-shape control.  If there is a way of constructing a shape that is
satisfied precisely by the nodes that are instances of a class then
class-based control can be transformed into inter-shape control.
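
A rough Python sketch of this transformation follows.  It assumes graphs
are sets of string triples and shapes are callables; only_node, typed_as
and the other helper names are hypothetical, and typed_as checks explicit
rdf:type links only rather than full RDFS entailment.

```python
RDF_TYPE = "rdf:type"

def only_node(iri):
    """Scope shape satisfied precisely by the node `iri`."""
    return lambda node, graph: node == iri

def typed_as(cls):
    """Scope shape satisfied by nodes with an explicit rdf:type `cls`
    (assumption: no subclass reasoning in this sketch)."""
    return lambda node, graph: (node, RDF_TYPE, cls) in graph

def node_link_as_inter(iri, shape):
    """Rewrite a node-based link as a (scope, validation) pair."""
    return (only_node(iri), shape)

def class_link_as_inter(cls, shape):
    """Rewrite a class-based link as a (scope, validation) pair."""
    return (typed_as(cls), shape)

def inter_links_hold(links, graph):
    """Every node satisfying a scope also satisfies its validation shape."""
    nodes = {s for s, _, _ in graph} | {o for _, _, o in graph}
    return all(validation(n, graph)
               for scope, validation in links
               for n in nodes if scope(n, graph))

def has_name(node, graph):
    """Hypothetical local shape: the node has at least one ex:name."""
    return any(s == node and p == "ex:name" for s, p, o in graph)

graph = {("ex:alice", RDF_TYPE, "ex:Person"), ("ex:alice", "ex:name", "Alice")}
links = [node_link_as_inter("ex:alice", has_name),
         class_link_as_inter("ex:Person", has_name)]
```

One subtlety this makes visible: a node-based link also requires the IRI
to be a node in the graph, whereas the inter-shape form is vacuously
satisfied when no node matches the scope, so that check would have to be
kept separately.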

There are no other organizational facilities in CONSTRAINTS, except perhaps
saying how to construct the RDF dataset over which shape evaluation is
performed, e.g., to have one RDF graph providing data and another an RDFS
ontology combined into the RDF graph over which shapes are evaluated.  All
ontological information in CONSTRAINTS is provided by the RDF graph and is
interpreted as in RDFS.

The interface provided by CONSTRAINTS is simple.  There is one main call
  validate(constraints,data)
where data is an RDF dataset (or RDF graph which is interpreted as the
default graph of an RDF dataset with no named graphs) and constraints is a
CONSTRAINTS document.  The call returns true if the constraints are
satisfied by the data, and false otherwise, along with any other information
provided by violations.
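
As a sketch of what this interface might look like, the Python below
accepts either a dataset or a bare graph and returns a truth value together
with violation messages.  Everything here is an assumption made for
illustration: the Dataset and Result types, the modelling of a CONSTRAINTS
document as a list of check functions, and the message format.  Evaluation
is against the default graph only.

```python
from typing import NamedTuple

class Dataset(NamedTuple):
    default: frozenset   # the default graph, a set of triples
    named: dict          # graph name -> set of triples

class Result(NamedTuple):
    ok: bool
    violations: list

def as_dataset(data):
    """Interpret a bare graph as the default graph of a dataset
    with no named graphs."""
    if isinstance(data, Dataset):
        return data
    return Dataset(default=frozenset(data), named={})

def validate(constraints, data):
    """Run every check over the default graph and collect the messages."""
    dataset = as_dataset(data)
    violations = [msg for check in constraints
                  for msg in check(dataset.default)]
    return Result(ok=not violations, violations=violations)

def needs_name(graph):
    """Hypothetical check: every subject must have an ex:name."""
    subjects = {s for s, p, o in graph}
    named = {s for s, p, o in graph if p == "ex:name"}
    return [f"{s} has no ex:name" for s in sorted(subjects - named)]

result = validate([needs_name], {("ex:bob", "rdf:type", "ex:Person")})
```

Here result.ok is false and result.violations holds the single message
"ex:bob has no ex:name".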

It could be useful to also have calls that mirror the control information in
a CONSTRAINTS document.  This would add calls like
  validateGlobal(globalshape,data)
  validateNode(node,localshape,data)
  validateClass(class,localshape,data)
  validateIntershape(shape,shape,data)
with the obvious meanings.

NOTE: Although RDFS is used as the semantic basis of class-based CONSTRAINTS
links, evaluating shapes over RDF graphs that don't use RDFS vocabulary
doesn't need to use any aspect of RDFS.  In this case RDFS class membership
reduces to explicit rdf:type links, with the sole exceptions that all
properties are implicitly members of rdf:Property and all literal values are
implicitly members of the datatypes to which they belong.
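
For graphs without RDFS vocabulary, class membership as described in this
note could be computed roughly as follows.  The Literal type and the
classes_of function are hypothetical, IRIs are plain strings, and literals
are assumed to be well-typed.

```python
from typing import NamedTuple

RDF_TYPE = "rdf:type"

class Literal(NamedTuple):
    lexical: str
    datatype: str

def classes_of(term, graph):
    """Classes a term belongs to when the graph uses no RDFS vocabulary."""
    if isinstance(term, Literal):
        # a literal value is implicitly a member of its datatype
        return {term.datatype}
    # explicit rdf:type links
    classes = {o for s, p, o in graph if s == term and p == RDF_TYPE}
    # anything used as a predicate is implicitly an rdf:Property
    if any(p == term for s, p, o in graph):
        classes.add("rdf:Property")
    return classes

graph = {
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:age", Literal("42", "xsd:integer")),
}
```

In this graph ex:alice belongs only to ex:Person, ex:age only to
rdf:Property, and the literal only to xsd:integer.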


Details of the Shapes in CONSTRAINTS

What remains to be determined is just what facilities are provided by
shapes, including whether and how shapes can be related to other shapes.

One option is that global shapes are SPARQL queries, local shapes are SPARQL
queries with a special variable, and that the core operation is running the
query on the RDFS consequences of the RDF dataset.  This is what is done in
SPIN, although SPIN does not have node-based control or inter-shape control.

Another option is that local shapes are ShExC shape expressions,
interpreted as in ShExC over the RDFS consequences of an RDF graph.

A third option is that local shapes are OWL 2 class expressions, global
shapes are OWL 2 axioms, and that the core operation is evaluating OWL
axioms on the Herbrand model of the RDFS consequences of the RDF graph.

Combinations or variations of these options are also possible.

NOTE: CONSTRAINTS can work in a situation where shapes are unnamed and
cannot refer to other shapes.  CONSTRAINTS uses classes to organize shapes.
Instead of having a shape directly include another shape, the shape is
attached to a class and the other shape is attached to a superclass.  Instead
of having a shape use another shape in a property constraint, the constraint
uses a class that has the other shape attached to it.
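
The following Python sketch illustrates this use of classes in place of
shape names.  It is illustrative only: the ex: vocabulary, the shape
constructors and the class_shapes list are all made up, graphs are sets of
string triples, and RDFS instance checking is reduced to rdf:type plus
rdfs:subClassOf.

```python
RDF_TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

def superclasses(cls, graph):
    """cls plus every class reachable through rdfs:subClassOf."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

def is_instance(node, cls, graph):
    return any(s == node and p == RDF_TYPE and cls in superclasses(o, graph)
               for s, p, o in graph)

def has_property(prop):
    """Unnamed shape: the node has at least one value for prop."""
    return lambda node, graph: any(s == node and p == prop for s, p, o in graph)

def values_in_class(prop, cls):
    """Property constraint that names a class rather than another shape."""
    return lambda node, graph: all(is_instance(o, cls, graph)
                                   for s, p, o in graph
                                   if s == node and p == prop)

# Shape "inclusion": ex:Student is a subclass of ex:Person, so students
# must also satisfy the shape attached to ex:Person.
class_shapes = [
    ("ex:Person", has_property("ex:name")),
    ("ex:Student", has_property("ex:studentId")),
    ("ex:Student", values_in_class("ex:advisor", "ex:Professor")),
    ("ex:Professor", has_property("ex:office")),
]

def check(graph):
    """Every instance of each class satisfies the shapes attached to it."""
    nodes = {s for s, _, _ in graph} | {o for _, _, o in graph}
    return all(shape(n, graph)
               for cls, shape in class_shapes
               for n in nodes if is_instance(n, cls, graph))

data = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:studentId", "s1"),
    ("ex:alice", "ex:advisor", "ex:carol"),
    ("ex:carol", RDF_TYPE, "ex:Professor"),
    ("ex:carol", "ex:office", "B12"),
}
```

Here check(data) holds.  Removing ex:carol's ex:office makes it fail
through the shape attached to ex:Professor, without the ex:Student shapes
ever referring to that shape directly.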


What is not Present (yet)

CONSTRAINTS does not directly give names to shapes.  It does not provide
any direct means of relating shapes to other shapes.  It does not provide
any direct means of performing recursive shape recognition.  (If the shapes
are ShExC shapes then all this would be provided by ShExC.)

This proposal does not have any notion of shapes covering a graph.

This proposal does not directly incorporate any notion of decorations
attached to shapes or parts of shapes.

Received on Wednesday, 18 February 2015 18:29:48 UTC