IBM Position on OSLC Shapes, SPARQL, SPIN, and ShEx from Arthur Ryman on 2014-07-25 (public-rdf-shapes@w3.org from July 2014)

From: Arthur Ryman <ryman@ca.ibm.com>
Date: Fri, 25 Jul 2014 15:46:09 -0400
To: public-rdf-shapes@w3.org
Message-ID: <OFE182D2A2.0BE32614-ON85257D20.00692713-85257D20.006C9B51@ca.ibm.com>

I'd like to clarify IBM's position on these topics.

1. There is a well-motivated need for a way to describe RDF resources. 
This includes a description of the expected triples in a resource (aka 
graph) and the constraints they satisfy. The W3C semantics for RDFS and 
OWL were about inferring triples, not checking constraints. Clark & Parsia 
had previously proposed the ICV semantics for OWL, but this was not 
submitted to W3C. There was a gap in the standards space, so at OSLC we 
developed the Shape spec. We used the term "Shape" because "Schema" was 
already used by RDFS for inferring triples. We thought "Shape" was a good 
term because RDF resources can be visualized as graphs.

2. We believe that there should be a high-level vocabulary for describing 
the most common constraints. This includes things like occurrence 
(cardinality), domains, ranges, etc. This vocabulary should be easy to 
understand by typical developers. It should also be easy to consume by 
tools. Since any consuming tool would be dealing with RDF resources, the 
simplest syntax for the constraints is some RDF syntax, e.g. Turtle. We 
did not want to create a new syntax since that would require the 
corresponding development of parsers, and that would slow down adoption.

3. We believe that the semantics of the high-level constraint vocabulary 
should be precisely defined in order to promote good interoperability. The 
simplest way to define semantics is to express the meaning of the terms by 
translating them into some pre-existing language that has well-defined 
semantics. The obvious choice is SPARQL since it is a W3C standard, has 
many implementations, and defines ASK queries. The choice of SPARQL to 
define semantics does not imply SPARQL must be used for implementations. 
In fact, most of the systems that implement OSLC specifications are not 
natively built on RDF or SPARQL. 
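
As a rough illustration of point 3, the sketch below shows how one 
high-level occurrence constraint might be given meaning by translation into 
a SPARQL ASK query. It is not the translation defined by the OSLC Shape 
spec. The function name, the bounds table, the example IRIs, and the shape 
of the generated query are all assumptions made for this example. The 
query is only built as a string here, not run against a store.

```python
# Illustrative only: maps OSLC-style occurrence values to (min, max) bounds.
# The table and the query template are assumptions, not taken from a spec.
OCCURRENCE_BOUNDS = {
    "Exactly-one": (1, 1),
    "Zero-or-one": (0, 1),
    "Zero-or-many": (0, None),
    "One-or-many": (1, None),
}


def occurrence_violation_query(resource_iri, property_iri, occurs):
    """Return a SPARQL ASK query string that would be true when the
    resource has too few or too many values for the property."""
    min_count, max_count = OCCURRENCE_BOUNDS[occurs]

    conditions = []
    if min_count > 0:
        conditions.append(f"?n < {min_count}")
    if max_count is not None:
        conditions.append(f"?n > {max_count}")

    if not conditions:
        # No lower or upper bound, so the constraint cannot be violated.
        return "ASK { FILTER(false) }"

    # Count the values of the property, then test the count against the bounds.
    return (
        "ASK {\n"
        "  { SELECT (COUNT(?value) AS ?n) WHERE {\n"
        f"      OPTIONAL {{ <{resource_iri}> <{property_iri}> ?value }}\n"
        "  } }\n"
        f"  FILTER({' || '.join(conditions)})\n"
        "}"
    )


# Hypothetical resource and property, used only to show the generated text.
example_query = occurrence_violation_query(
    "http://example.org/bug/1",
    "http://purl.org/dc/terms/title",
    "Exactly-one",
)
print(example_query)
```

A system that is not built on SPARQL could implement the same check by 
counting property values directly. Under this approach the query would 
serve only as the definition of what a correct result is.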

4. We believe that there should be an extension mechanism to define more 
complex constraints. SPARQL is again a good choice, for the same reasons 
given in #3.

5. Rather than invent a new mechanism for associating SPARQL constraints 
with RDF resources, we believe that SPIN provides a good starting point 
since it has been submitted to W3C, has more than one implementation, and 
there is a good body of implementation experience with it.

6. ShEx is an interesting approach and we are happy to see it developed 
further, perhaps as a transformation engine, but its use for constraint 
checking is not compelling at this stage since we already have more mature 
alternatives. Our starting position should be SPARQL and SPIN. If these 
fail to satisfy the use cases, then ShEx should be re-examined.

That being said, we regard OSLC Shapes and SPIN as being inputs to the 
standardization process and expect them to be modified in the normal 
course of standards development.

Regards, 
___________________________________________________________________________
Arthur Ryman, PhD

Chief Data Officer, Rational
Chief Architect, Portfolio & Strategy Management
Distinguished Engineer | Master Inventor | Academy of Technology

Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

Received on Friday, 25 July 2014 19:46:54 UTC