Re: ISSUE-23: Where should we query for class and subClassOf? from Holger Knublauch on 2016-01-20 (public-data-shapes-wg@w3.org from January 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Wed, 20 Jan 2016 11:09:54 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <569EDE62.10400@topquadrant.com>

(I predict that the separation between shapes and data graph will become 
a FAQ topic for SHACL. It might have been better to leave this topic out 
of version 1 of the standard, as many users will trip over it. Such 
things may hurt the standard's adoption because they complicate 
everything. In the end, the main benefit of having a shapes graph is 
optimizing performance (so that the data graph is not polluted), yet 
this may be considered premature optimization, as engines can probably 
take care of this themselves.)

Anyway, your specific suggestion MAY work for me, as the shapes graph is 
a conceptual/logical entity only and engines may inject any number of 
triples into the shapes graph prior to validation. This doesn't make 
life easier though, so I am not sure.

Two issues come to mind:

1) If we assume the rdfs:subClassOf triples reside in the shapes graph 
only, then the user/engine needs to account for extensibility: 
the data graph may include extensions of the core ontology that a shapes 
graph was developed against. For example, someone may create a subclass 
ex2:Cat while the shape only knows about ex1:Animal. To prepare for 
validation, some agent would need to make sure that ex2:Cat 
rdfs:subClassOf ex1:Animal is visible to the shapes graph.
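
To illustrate issue 1, here is a minimal Python sketch. It models graphs 
as plain sets of (subject, predicate, object) tuples rather than using an 
RDF library. The helper names and the ex:Tom instance are hypothetical 
and not part of SHACL. It only shows how an instance of ex2:Cat falls 
outside the scope of ex1:Animal until some agent makes the subclass 
triple visible to the shapes graph.

```python
# Graphs are modelled as sets of (subject, predicate, object) tuples.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def superclasses(cls, graph):
    """Return cls plus every class reachable via rdfs:subClassOf in graph."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

def instances_in_scope(scope_class, data_graph, class_graph):
    """Nodes typed in data_graph as scope_class or one of its subclasses,
    where subclass triples are looked up in class_graph only."""
    return {s for s, p, o in data_graph
            if p == TYPE and scope_class in superclasses(o, class_graph)}

def copy_subclass_triples(data_graph, shapes_graph):
    """Hypothetical preparation step: expose the data graph's
    rdfs:subClassOf triples to the shapes graph."""
    return shapes_graph | {t for t in data_graph if t[1] == SUBCLASS}

# The shapes graph only knows about ex1:Animal.
shapes_graph = {("ex1:Animal", TYPE, "rdfs:Class"),
                ("ex1:Animal", TYPE, "sh:Shape")}
# The data graph extends the ontology with ex2:Cat.
data_graph = {("ex2:Cat", SUBCLASS, "ex1:Animal"),
              ("ex:Tom", TYPE, "ex2:Cat")}

# Without preparation, ex:Tom is not recognised as an ex1:Animal.
missed = instances_in_scope("ex1:Animal", data_graph, shapes_graph)
# After copying the subclass triples, ex:Tom is in scope.
prepared = copy_subclass_triples(data_graph, shapes_graph)
found = instances_in_scope("ex1:Animal", data_graph, prepared)
print(missed, found)
```

Whether such a copy step belongs to the engine or to the calling 
application is exactly the open question here.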

2) sh:class currently looks at the data graph. If we change the behavior 
of sh:scopeClass then we arguably would also need to change sh:class to 
walk the shapes graph.
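
The mirror case for issue 2 can be sketched the same way. This is a 
simplified Python model, not the spec's definition of sh:class: the 
function names and the ex:Felix instance are hypothetical, and it assumes 
that rdf:type triples of value nodes are always read from the data graph 
while the graph used for rdfs:subClassOf is a parameter.

```python
# Graphs are modelled as sets of (subject, predicate, object) tuples.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def is_subclass_of(cls, target, class_graph):
    """True if cls equals target or reaches it via rdfs:subClassOf
    triples found in class_graph."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        if current == target:
            return True
        for s, p, o in class_graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                todo.append(o)
    return False

def satisfies_sh_class(value, cls, data_graph, class_graph):
    """Simplified sh:class check. Assumption: rdf:type triples come from
    the data graph; class_graph selects where subclass triples are walked."""
    return any(s == value and p == TYPE and is_subclass_of(o, cls, class_graph)
               for s, p, o in data_graph)

# Here the subclass triple lives in the shapes graph only.
shapes_graph = {("ex2:Cat", SUBCLASS, "ex1:Animal")}
data_graph = {("ex:Felix", TYPE, "ex2:Cat")}

# Walking the data graph: the subclass link is not seen.
current_behavior = satisfies_sh_class("ex:Felix", "ex1:Animal",
                                      data_graph, data_graph)
# Walking the shapes graph: the subclass link is seen.
changed_behavior = satisfies_sh_class("ex:Felix", "ex1:Animal",
                                      data_graph, shapes_graph)
print(current_behavior, changed_behavior)
```

If sh:scopeClass and sh:class walked different graphs, the same node 
could be in scope for a shape yet fail a sh:class constraint on the same 
class, or the other way around.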

Holger

On 20/01/2016 4:38 AM, Arthur Ryman wrote:
> While reading the spec I noticed the following statement:
>
> "To determine class membership, the rdf:type and rdfs:subClassOf
> triples are queried in the data graph."
>
> However, querying the data graph for class and subclass information is
> inconsistent with the example SPARQL for determining which shapes are
> classes:
>
> "As syntactic sugar for the scenario above, SHACL includes a rule that
> if a class is also a shape (in the shapes graph), then the
> sh:scopeClass triple pointing at itself can be omitted. This rule is
> illustrated by the following SPARQL CONSTRUCT query, which may be
> executed over the shapes graph prior to validation, to produce the
> implicit sh:scopeClass triples."
>
> CONSTRUCT {
>     ?class sh:scopeClass ?class .
> }
> WHERE {
>     ?class rdfs:subClassOf*/rdf:type rdfs:Class .
>     ?class rdfs:subClassOf*/rdf:type sh:Shape .
> }
>
> I propose that in both cases we query the shapes graph, NOT the data graph.
>
> Recall that we expect the application to provide a shapes graph and a
> data graph as input to the SHACL validator. Therefore, the application
> can always copy any rdfs:Class and rdfs:subClassOf triples into the
> shapes graph. Although RDF does not require class definitions to be
> separated from data instances, in practice these are often separated.
> Both shapes and classes are more properly regarded as metadata than
> data.
>
> -- Arthur
>

Received on Wednesday, 20 January 2016 01:10:34 UTC