Comparison with object-oriented validation, such as Java's Bean Validation

Hi Data Shapes community,

I'm very glad you're working on RDF validation standards, which would 
really benefit the Semantic Web.  I wish I had gotten involved in the 
conversation before now.

I read the February 2nd SHACL draft and compared it with Java's Bean 
Validation standards [1], which is apparently the most popular approach 
to validating Java data.  I've been thinking about aspects of Bean 
Validation that could be incorporated into SHACL, and aspects of SHACL 
that could be incorporated into Bean Validation.  The Bean Validation 
developers posted the first draft of a 2.0.0.Alpha1 revision of their 
specification on February 2nd, and they are inviting public comment.

I know it's hard to incorporate new ideas into SHACL at this point.  
Bean Validation has a mechanism for users to specify the order in which 
constraint checks are executed, such as to run inexpensive checks before 
expensive checks.  You would probably consider it inessential to add 
such a mechanism to the SHACL standards, but I thought I would mention 
it.  Maybe it would be useful for SHACL implementations.
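
To illustrate the idea of ordering checks by cost, here is a minimal sketch in plain Java.  It does not use the Bean Validation API (which, as far as I know, expresses ordering declaratively through group sequences such as @GroupSequence); the names OrderedCheck, OrderedValidator, and the integer cost rank are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper: a named check with a relative cost rank (lower runs first).
final class OrderedCheck<T> {
    final String name;
    final int cost;
    final Predicate<T> test;

    OrderedCheck(String name, int cost, Predicate<T> test) {
        this.name = name;
        this.cost = cost;
        this.test = test;
    }
}

final class OrderedValidator {
    // Runs the checks in ascending cost order and stops at the first failure.
    // Returns the name of the failed check, or null if every check passes.
    static <T> String firstFailure(List<OrderedCheck<T>> checks, T value) {
        List<OrderedCheck<T>> sorted = new ArrayList<>(checks);
        sorted.sort(Comparator.comparingInt(c -> c.cost));
        for (OrderedCheck<T> check : sorted) {
            if (!check.test.test(value)) {
                return check.name;
            }
        }
        return null;
    }
}
```

Because evaluation stops at the first failure, an expensive check (a regular expression, a remote lookup) is skipped whenever a cheaper one has already rejected the value.  A SHACL implementation could presumably apply the same strategy internally without any change to the standard.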

The most interesting part of the comparison to me, though, is that SHACL 
can be used more or less directly to validate object-oriented data.  SHACL of 
course works as it is with JSON-LD, but I have in mind to use it with 
all sorts of object-oriented data.  SHACL has some features that would 
be valuable additions to Bean Validation, either as part of the Bean 
Validation standard, or as a separate piece of software.  I expect the 
Bean Validation developers would welcome proposals for incorporating 
aspects of SHACL.

I'm accustomed to working with an interpretation of object-oriented 
classes as unary predicates and as OWL classes, of object-oriented 
attributes as binary predicates and as OWL properties, and so of a set 
of object-oriented data as a set of RDF triples.  Object-oriented data 
form a graph; objects have attributes with values of literals and other 
objects: nodes, edges, and other nodes.  I have a mechanism for 
annotating Java classes and attributes with IRIs, metadata, and 
semantics so that they become OWL classes and properties, similarly to 
how JSON-LD marks up JSON data.
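
The mapping from annotated objects to triples can be sketched with plain Java reflection.  This is only an illustration under stated assumptions, not the actual mechanism in my software: the @Iri annotation, the AnnotatedPerson class, and the TripleSketch helper are hypothetical names, values are rendered as unescaped plain literals, and nested objects and collections are not handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotation that attaches an IRI to a class or a field.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface Iri {
    String value();
}

@Iri("http://example.com/ns#Person")
class AnnotatedPerson {
    @Iri("http://example.com/ns#fullName")
    String fullName;

    AnnotatedPerson(String fullName) {
        this.fullName = fullName;
    }
}

final class TripleSketch {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Emits one rdf:type triple for the annotated class, then one triple
    // for each annotated field whose value is not null.
    static List<String> toTriples(String subjectIri, Object obj) {
        List<String> triples = new ArrayList<>();
        Iri classIri = obj.getClass().getAnnotation(Iri.class);
        if (classIri != null) {
            triples.add("<" + subjectIri + "> <" + RDF_TYPE + "> <" + classIri.value() + "> .");
        }
        for (Field field : obj.getClass().getDeclaredFields()) {
            Iri fieldIri = field.getAnnotation(Iri.class);
            if (fieldIri == null) {
                continue;  // unannotated fields are not part of the graph
            }
            try {
                field.setAccessible(true);
                Object value = field.get(obj);
                if (value != null) {
                    triples.add("<" + subjectIri + "> <" + fieldIri.value() + "> \"" + value + "\" .");
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return triples;
    }
}
```

Calling TripleSketch.toTriples with a subject IRI and an AnnotatedPerson yields the class membership (the unary predicate) followed by the attribute value (the binary predicate), which is the graph a SHACL engine would then validate.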

I like the whole idea in SHACL of writing shapes to specify the 
structure of the data graph, which is something Bean Validation could 
include but presently does not.  Bean Validation users writing custom 
constraints mainly have to write the logic of the constraints as Java 
code in an isValid() method.  We might say that SHACL provides a more 
declarative approach to writing constraints.

Consider a Person Java class with attributes for some of the RDF 
properties in the SHACL document -- fullName, firstName, lastName, ssn, 
parent, etc. -- with ex:XoneConstraintExampleShape taken directly from 
the document:


@hasShape(
"@prefix ex: <http://example.com/ns#> . " +
"@prefix sh: <http://www.w3.org/ns/shacl#> . " +
"ex:XoneConstraintExampleShape " +
"    a sh:NodeShape ; " +
"    sh:targetClass ex:Person ; " +
"    sh:xone ( " +
"    [ " +
"        sh:property [ " +
"            sh:path ex:fullName ; " +
"            sh:minCount 1 ; " +
"        ] " +
"    ] " +
"    [ " +
"        sh:property [ " +
"            sh:path ex:firstName ; " +
"            sh:minCount 1 ; " +
"        ] ; " +
"        sh:property [ " +
"            sh:path ex:lastName ; " +
"            sh:minCount 1 ; " +
"        ] " +
"    ] " +
") .")
@hasIRI("http://example.com/ns#Person")
public class Person
{
     @hasIRI("http://example.com/ns#fullName")
     private String fullName;

     @hasIRI("http://example.com/ns#firstName")
     private String firstName;

     @hasIRI("http://example.com/ns#lastName")
     private String lastName;

     @pattern("^\\d{3}-\\d{2}-\\d{4}$")  // sh:pattern
     @hasIRI("http://example.com/ns#ssn")
     private String ssn;

     @maxCount(2)  // sh:maxCount
     @hasIRI("http://example.com/ns#parent")
     private Collection<Person> parent;

     // ...

}


The SHACL constraints would work with the Java data here exactly as they 
do with RDF.  Some of the SHACL constructs would make sense to place as 
annotations directly on attributes, as sh:pattern and sh:maxCount here 
(though Bean Validation already has @Pattern and @Size for those two 
purposes).  Bean Validation doesn't have anything like sh:xone, which I 
show here because I think it would be very useful in Java.  SPARQL 
property paths would be useful to inspect the attributes of nested 
objects. Shapes could also validate arguments to methods.
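
For comparison, here is roughly what the sh:xone rule above amounts to when written by hand in Java, which is the kind of logic a Bean Validation user would otherwise put in an isValid() method.  This is a sketch under assumptions: XoneSketch, Name, and exactlyOne are hypothetical names, and I treat sh:minCount 1 on a single-valued String attribute as "the field is not null".

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class XoneSketch {
    // Hypothetical minimal stand-in for the name attributes of Person.
    static final class Name {
        final String fullName;
        final String firstName;
        final String lastName;

        Name(String fullName, String firstName, String lastName) {
            this.fullName = fullName;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // sh:xone semantics: the value conforms to exactly one of the alternatives.
    static <T> boolean exactlyOne(T value, List<Predicate<T>> alternatives) {
        long matches = alternatives.stream().filter(p -> p.test(value)).count();
        return matches == 1;
    }

    // Either a full name, or both a first and a last name, but not both forms.
    static boolean hasValidName(Name n) {
        Predicate<Name> hasFullName = x -> x.fullName != null;
        Predicate<Name> hasFirstAndLast = x -> x.firstName != null && x.lastName != null;
        return exactlyOne(n, Arrays.asList(hasFullName, hasFirstAndLast));
    }
}
```

The shape expresses the same rule as data rather than code, so it can be shared between the Java application and any RDF tooling that consumes the same graph.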

I wrote some software for incorporating OWL reasoning and SPARQL 
into Java [2].  My suggestion for using SHACL to validate Java data is a 
direct extension of that approach.  I have a bit of code working to 
validate Java data with the TopBraid SHACL API; I could post the code on 
GitHub if you're interested.

I wanted to post here while you are still open to comments because I 
thought you might not have considered this approach of applying SHACL more 
broadly to object-oriented data.  I also thought there's a chance you're 
not aware that Bean Validation is currently in an open development 
phase, which might somehow be useful to you, as it's closely related to 
SHACL.

Regards,
Tim Armstrong

[1] http://beanvalidation.org/
[2] http://semanticoop.sourceforge.net/

Received on Thursday, 16 March 2017 15:45:05 UTC