- From: Holger Knublauch <holger@topquadrant.com>
- Date: Tue, 27 Jan 2015 10:51:41 +1000
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
The WG still struggles to find common ground on something that is entirely a war about terminology. This should be easy to fix. Let's try by coming back to the specific language syntax. I really don't want to talk about philosophical differences now.

There is a fairly obvious mapping between the proposed terms:

Shape = Class := Collection of nodes with shared characteristics

ShEx people may prefer ldom:Shape, but the syntax could also use rdfs:Class

Proposals to associate nodes with a shape are either ldom:instanceShape or rdf:type

Proposals to define reuse/inheritance of shapes are either ldom:classShape or rdfs:subClassOf

I looked at the LDOM engine, and it could theoretically support all options. This would require duplicating every piece of logic that currently uses rdf:type to also use ldom:instanceShape, and every piece of logic based on rdfs:subClassOf to also walk up the ldom:classShape hierarchy. Then this duplication propagates into every SPARQL query. A completely unnecessary nightmare only because of different wording. It's technically 100% the same thing. Every function or feature that takes a shape as a parameter can also take an rdfs:Class. Nothing is lost by using rdfs:Class directly, especially because it already has inheritance solved.

Given this duplication, there is a strong risk that the W3C consensus process forces us to create yet another bloated standard that is the union of many similar ideas because nobody is ready to give up their individual preference. But the result would be a standard that nobody understands or implements because it is too complicated.

OTOH, the concepts used by LDOM (rdfs:Class, rdf:type, rdfs:subClassOf) all align perfectly with current practice and are easy for most mainstream developers to understand. We can easily attract a large crowd of JSON developers, among others, and increase the size of the linked data community by several orders of magnitude.
These people don't care about the theoretical distinction between Shapes and Classes that not even we seem to fully understand.

Looking at ShEx, I believe one of the driving forces was the ShExC compact syntax, and I can certainly see that some people may find such a thing useful. ShExC is a language of its own. It can have many different engine implementations; some already exist and did not require LDOM. However, ShExC can also be defined as a mapping into the RDF syntax. ShEx already has an RDF syntax. All we need to do is make sure that it can also be mapped into LDOM RDF triples. We can define an LDOM profile [1] for ShExC that includes ldom:minCount/maxCount/valueType, OrConstraint etc. and then publish a W3C document that formalizes the mapping into that profile. The ShExC developers then have their standard that they can continue to do research on. The flexibility of the LDOM templating mechanism even means that they can add new language elements or syntax extensions without requiring changes to the core language.

The key is that ShExC is its own text syntax anyway, so who cares whether it gets mapped to ldom:Shape or rdfs:Class for execution? There is no difference for the user, only that we can keep LDOM simple. People who write papers or books about this language can still say "To declare a Shape you write [ShExC snippet] or [ex:Person a rdfs:Class]". It's an entirely syntactic detail whether these are called Shape or Class.

The other issue brought forward by Eric is that there are many ways to trigger the constraint checking. LDOM suggests using rdf:type to point at the shapes that a given node should be evaluated against *by default*. But that is easy to override. The API will have an entry point that checks whether a given Node fulfills a given Shape (currently implemented as ldom:violatesConstraints, maybe renamed to ldom:hasShape). Any application can call this function with their own protocol.
It could be part of an HTTP header or whatever, or an application could dynamically add rdf:type triples to drive the execution engine. There are many ways in which this can work, and many of those ways are outside the scope of this WG and better left to Linked Data Platform or Hydra, or even local to the ShExC specification.

Can we please work together to create the right layering of this technology stack so that everyone gets what they need, instead of a messed-up hybrid solution where everything is just thrown together?

Thanks,
Holger

[1] https://w3c.github.io/data-shapes/data-shapes-primer/#profiles
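
The duplication argument above (every rdf:type lookup also needing ldom:instanceShape, every rdfs:subClassOf walk also needing ldom:classShape) can be illustrated with a minimal sketch. This is not the LDOM engine's code. It assumes triples are plain Python tuples, the function names `objects` and `applicable_shapes` are hypothetical, and all `ex:` resources are invented for illustration.

```python
# Predicate names taken from the message; everything else is a hypothetical illustration.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
LDOM_INSTANCE_SHAPE = "ldom:instanceShape"
LDOM_CLASS_SHAPE = "ldom:classShape"


def objects(graph, subject, predicates):
    """Return all objects linked from subject via any of the given predicates."""
    return {o for (s, p, o) in graph if s == subject and p in predicates}


def applicable_shapes(graph, node, link_preds, inherit_preds):
    """Collect the shapes/classes linked from node, then walk up the hierarchy."""
    pending = list(objects(graph, node, link_preds))
    seen = set()
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(objects(graph, current, inherit_preds))
    return seen


# Hypothetical data: alice uses the class vocabulary, bob the shape vocabulary.
graph = {
    ("ex:alice", RDF_TYPE, "ex:Employee"),
    ("ex:Employee", RDFS_SUBCLASS_OF, "ex:Person"),
    ("ex:bob", LDOM_INSTANCE_SHAPE, "ex:EmployeeShape"),
    ("ex:EmployeeShape", LDOM_CLASS_SHAPE, "ex:PersonShape"),
}

# Class-only engine: one predicate per role.
class_only = applicable_shapes(graph, "ex:alice", {RDF_TYPE}, {RDFS_SUBCLASS_OF})

# Engine supporting both vocabularies: every lookup must carry both predicates.
both = applicable_shapes(
    graph,
    "ex:bob",
    {RDF_TYPE, LDOM_INSTANCE_SHAPE},
    {RDFS_SUBCLASS_OF, LDOM_CLASS_SHAPE},
)
```

In this sketch the two vocabularies are handled by identical traversal logic, and the only cost of supporting both is that each call site has to name two predicates instead of one. A class-only call on `ex:bob` finds nothing, which is the kind of gap that would have to be closed in every lookup and every query.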
Received on Tuesday, 27 January 2015 00:52:14 UTC