Re: modelling using SHACL - a bad idea - ISSUE-23 from Holger Knublauch on 2015-08-04 (public-data-shapes-wg@w3.org from August 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Wed, 5 Aug 2015 09:54:00 +1000
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <55C15098.9090307@topquadrant.com>

May I remind the working group that we had made a formal resolution [1] 
in February to "explore ways to combine shapes and classes such as 
punning", following very extensive and controversial "shapes-vs-classes" 
discussions. I assume that the term "exploring" was meant in good 
faith, and that WG members are now making serious attempts to find a 
compromise here. Some of you may also remember that TopQuadrant's 
original LDOM proposal, based on the established design of SPIN, was 
centered around Classes, and Shapes were some kind of "abstract" 
classes. I still believe this design would have worked at least as well 
as the design around sh:Shape that we agreed on as a first step.

A lot of follow-up decisions were made on the assumption that the WG is 
following up on the promise behind this resolution. If the WG now 
decides to not support some form of punning then TopQuadrant will be 
forced to reopen various other design decisions and raise strong formal 
objections against the specification. My sh:ShapeClass proposal was 
already a compromise and I am not going to back away any further on this 
topic. I would even go further and say that if the WG does not 
officially support sh:ShapeClass, then TopQuadrant's tools will still 
support it and establish a de-facto parallel standard.

Having said this I hope that the majority of WG members recognizes that 
the presence of punning and sh:ShapeClass is completely harmless to 
those who do not want to use these features. Users who prefer to stay 
with sh:Shapes only are free to do so. But others who prefer to link 
their shapes with classes should also be allowed to do that. Live and 
let live.

Let me re-cap the situation. We all agree that the following is a 
supported design pattern:

ex:MyClass
     a rdfs:Class .

ex:MyShape
     a sh:Shape ;
     sh:scopeClass ex:MyClass .

All that the current draft is proposing on top of this is that MyClass 
and MyShape may share the same URI ("punning"):

ex:MyClassAndShape
     a rdfs:Class ;
     a sh:Shape ;
     sh:scopeClass ex:MyClassAndShape .

and furthermore, *as syntactic sugar*, the class sh:ShapeClass can be 
used as a combination of rdfs:Class and sh:Shape with an implicit 
sh:scopeClass triple in between, allowing the following compact form:

ex:MyClassAndShape
     a sh:ShapeClass .
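
Spelled out, and assuming the expansion works as described above, this 
compact form would presumably stand for the following triples, with the 
shape scoped to the class itself. The ex: namespace is a placeholder 
chosen for illustration:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix ex:   <http://example.org/> .

ex:MyClassAndShape
     a rdfs:Class ;
     a sh:Shape ;
     sh:scopeClass ex:MyClassAndShape .   # implied by sh:ShapeClass, not asserted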

Judging from past experience and feedback that I have received on the 
SHACL spec, I believe the use of sh:ShapeClass will be very common and 
convenient.


Peter,

in the early months of the WG you stated that you would prefer to use 
OWL for constraint definitions and to not allow any competitor to OWL 
and RDFS on the "modeling" turf. I believe however that the lines 
between modeling and constraint definitions are sufficiently blurry so 
that an intermingling between these worlds is unavoidable, and even 
desirable. When you are describing the shape of a data structure you are 
essentially doing what the majority of OWL users have 
abused OWL for in the past: you define a group of instances with similar 
characteristics. Countless tools have used OWL and RDFS classes for 
exactly the same roles that we have been tasked with in the Charter: to 
drive user interfaces, web service I/O, validate data etc. The whole 
point of the Data Shapes WG was to fill a perceived gap left by OWL and 
RDFS.

Your points below essentially try to monopolize the notion of RDFS 
classes with inferencing, but in our business experience inferencing is 
completely overrated and just one aspect among many others.

On 8/5/2015 6:26, Peter F. Patel-Schneider wrote:
> Several classes in
> https://github.com/w3c/data-shapes/blob/ISSUE-51/shacl/shacl.shacl.ttl
> show the problems with conflating classes and shapes.  The problematic
> classes include sh:Shape, sh:ShapeClass, and particularly rdfs:Class.
>
> It looks as if
> https://github.com/w3c/data-shapes/blob/ISSUE-51/shacl/shacl.shacl.ttl
> defines an RDFS taxonomy and these nodes are regular RDFS classes in that
> taxonomy.  However, this appearance is deceiving.  Only a small portion of
> the RDFS meaning of classes ends up attaching to these node.

I completely disagree. By far the most important aspect of classes - 
grouping of instances - is used by the draft in the same way as in RDFS/OWL.

>    The full
> meaning of RDFS:subClassOf is not available for these nodes.  Subproperties
> of rdfs:subClass will not have any RDFS effect on these nodes.

We can continue to try and find esoteric corner cases such as 
"sub-properties of rdfs:subClassOf" (I have never ever seen this in my 
10+ years of working with RDFS), or we can work towards 
a compromise. See the introductory paragraphs above.

BTW this is no problem at all. There are only a couple of places where 
SHACL relies on rdfs:subClassOf. We could in principle adjust those so 
that they also work regardless of inferencing. However, my preference 
would be to leave the job of inferring the additional rdfs:subClassOf 
triples to inference engines, because these are entirely theoretical 
corner cases. If you disagree, you should raise a formal ISSUE about the 
handling of sub-properties of rdfs:subClassOf and we'll see what others 
in the group think.
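
To make the corner case concrete, here is a sketch using invented names 
(ex:narrowerClassOf, ex:Cat and ex:Animal are hypothetical):

ex:narrowerClassOf
     rdfs:subPropertyOf rdfs:subClassOf .

ex:Cat
     ex:narrowerClassOf ex:Animal .

# Only under RDFS entailment (rule rdfs7) does the following triple hold:
# ex:Cat rdfs:subClassOf ex:Animal .

An engine that only looks at asserted rdfs:subClassOf triples would not 
treat ex:Cat as a subclass of ex:Animal unless an inference engine has 
added that triple first.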

>    Providing
> domain and range information for properties related to these nodes will not
> have any RDFS effect.

Could you clarify this? If someone wants RDFS semantics, then they need 
to activate RDFS inferencing. If someone wants the SHACL namespace 
itself to be interpreted by RDFS tools then they can add rdfs:domain and 
rdfs:range statements themselves. They are just URIs in the end.
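
For illustration only, such user-supplied statements might look like the 
following. These triples are an assumption and are not taken from any 
SHACL draft:

sh:scopeClass
     rdfs:domain sh:Shape ;
     rdfs:range rdfs:Class .

With RDFS inferencing activated, any subject of sh:scopeClass would then 
be inferred to be a sh:Shape and any object an rdfs:Class.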

>
> Some of the effects of this difference can be seen in
> https://github.com/w3c/data-shapes/blob/ISSUE-51/shacl/shacl.shacl.ttl where
> several classes, including rdfs:Class, are explicitly stated as being
> subclasses of rdfs:Resource, which would be unnecessary under RDFS.

Yes, these triples are unnecessary and SHACL does not depend on them. I 
have added them out of a decade of frustration with the RDFS 
specification and countless ontologies that do not assert these triples. 
Tools like TopBraid are forced into running expensive inferences for use 
cases as trivial as displaying a class hierarchy. I wanted to make sure 
that such tools have it easier without relying on inferencing. If you 
believe these "unnecessary" triples should be deleted, please file an ISSUE.

>
> Because of these differences from RDFS, I believe that SHACL should not be
> used for modelling taxonomies and thus that sh:shapeClass should not be part
> of SHACL.

I have no idea what differences you are talking about. Do you have any 
practical example of where the presence of sh:ShapeClass would be 
problematic?

I have extensively used ShapeClass for a few months now and love it. We 
have used SPIN constraints attached to classes for many years and love 
it. This is just a very pragmatic feature.

>
>
> Aside from the above, there are other modelling issues that should not
> be part of SHACL.  SHACL should not be used to set up object-oriented
> notions, such as those coming from sh:abstract, sh:final, and sh:private,
> particularly as there will be no support for enforcing their meaning.

I have no strong opinion about sh:private and sh:final - these would be 
there purely to help editing tools discourage certain operations by 
the user. But I have just raised ISSUE-78 to have sh:abstract in the 
language. Whether we want it or not, many people will come from an 
object-oriented background, and we should embrace, not alienate them.

Holger

[1] http://www.w3.org/2015/02/18-shapes-minutes.html#resolution01
Received on Tuesday, 4 August 2015 23:54:37 UTC