Re: New Terminology Section from Martynas Jusevičius on 2016-05-10 (public-data-shapes-wg@w3.org from May 2016)

From: Martynas Jusevičius <martynas@graphity.org>
Date: Tue, 10 May 2016 19:43:45 +0200
To: Karen Coyle <kcoyle@kcoyle.net>
Cc: public-data-shapes-wg@w3.org
Message-ID: <CAE35VmykMsSDDi0wpe8suRgB9HQNFVc6PW9Jg5_YY_93YVDzMg@mail.gmail.com>
Hey Karen,

I think these 2 approaches can coexist, if you use OO-like inheritance
in annotation properties, which do not influence RDF semantics. We took
this approach in one of our base vocabularies:
https://github.com/Graphity/graphity-processor/blob/master/src/main/resources/org/graphity/processor/gp.ttl

Annotation properties are inherited using Jena rules, similar to those
I showed in my previous email.
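
As a rough illustration of what such a rule does, here is a minimal
sketch in plain Python rather than Jena. The tuple-based triple
representation, the prefixed-name strings and the function name
inherit_annotation are assumptions made for this example only; they are
not part of Graphity, SPIN or Jena. Unlike a rule engine, which checks
noValue each time a rule fires, the sketch takes a snapshot per pass,
so the two can differ when a superclass carries several values.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples,
# not Jena or rdflib objects. All names below are illustrative.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
SPIN_CONSTRAINT = "spin:constraint"


def inherit_annotation(triples, prop):
    """Copy `prop` values from a class to each subclass that has none.

    Follows the rule pattern: (?class rdf:type rdfs:Class),
    (?class prop ?o), (?subClass rdfs:subClassOf ?class),
    noValue(?subClass prop) -> (?subClass prop ?o).
    Passes repeat until no new triple is produced.
    """
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot taken once per pass.
        classes = {s for s, p, o in graph if p == RDF_TYPE and o == RDFS_CLASS}
        has_value = {s for s, p, o in graph if p == prop}
        inferred = set()
        for sub, p, sup in graph:
            # Skip anything that is not an eligible subclass link.
            if p != RDFS_SUBCLASS_OF or sup not in classes or sub in has_value:
                continue
            for s, p2, o in graph:
                if s == sup and p2 == prop:
                    inferred.add((sub, prop, o))
        if not inferred <= graph:
            graph |= inferred
            changed = True
    return graph
```

Calling inherit_annotation(triples, SPIN_CONSTRAINT) returns a new set
and leaves the input untouched. A subclass that already declares its
own value keeps it and inherits nothing, which is the OO-style
"override" behaviour rather than RDFS entailment.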

On Tue, May 10, 2016 at 7:35 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>
>
> On 5/10/16 12:18 AM, Martynas Jusevičius wrote:
>>
>> Since SPIN takes an object-oriented view on inheritance, my guess is
>> that SHACL does the same.
>
>
> I have also come to that conclusion, but unfortunately this is contrary to
> the fact that SHACL is defined in RDF, which does not have the concept of
> inheritance that exists in OO. This is one of the main issues I have with
> the way that classes are used in SHACL. The difference is nicely summed up
> here [1] by Gregg Kellogg. It appears to me that the constraint components
> (section 3) follow the OO but not the RDF definition of class because the
> classes there determine the properties that are valid for the class
> definition (whereas RDF would determine the class membership from the
> properties).
>
> kc
> [1]
> http://ruby-rdf.github.io/presentations/HydraConnect2015/assets/player/KeynoteDHTMLPlayer.html#66
>
>
>
>> I had suggested some time ago that it can be
>> defined using simple rules (Jena rules in this case):
>>
>> [constructors: (?class rdf:type rdfs:Class), (?class
>> <http://spinrdf.org/spin#constructor> ?o), (?subClass rdfs:subClassOf
>> ?class), noValue(?subClass <http://spinrdf.org/spin#constructor>) ->
>> (?subClass <http://spinrdf.org/spin#constructor> ?o) ]
>> [constraints: (?class rdf:type rdfs:Class), (?class
>> <http://spinrdf.org/spin#constraint> ?o), (?subClass rdfs:subClassOf
>> ?class), noValue(?subClass <http://spinrdf.org/spin#constraint>) ->
>> (?subClass <http://spinrdf.org/spin#constraint> ?o) ]
>>
>>
>> https://groups.google.com/forum/m/#!msg/topbraid-users/vKkcn_5Esek/4vXFHq6MBQAJ
>>
>> On Tue, 10 May 2016 at 08:43, Holger Knublauch
>> <holger@topquadrant.com> wrote:
>>
>>     On 10/05/2016 14:31, Tom Johnson wrote:
>>>
>>>
>>>
>>>     On Mon, May 9, 2016 at 8:18 PM, Holger Knublauch
>>>     <holger@topquadrant.com> wrote:
>>>
>>>         On 10/05/2016 12:30, Tom Johnson wrote:
>>>>
>>>>
>>>>
>>>>         On Mon, May 9, 2016 at 5:29 PM, Holger Knublauch
>>>>         <holger@topquadrant.com> wrote:
>>>>
>>>>
>>>>
>>>>             On 10/05/2016 10:11, Tom Johnson wrote:
>>>>>
>>>>>             Irene, you say:
>>>>>
>>>>>             >"Doing more" doesn't create a problem, but, on the other
>>>>>             hand, it is not required.
>>>>>
>>>>>             I'm really uncertain about this. Couldn't inferring
>>>>>             further class relations (e.g., by using the entailment
>>>>>             mechanism included in the spec) cause different results
>>>>>             for basically every operation in SHACL?
>>>>
>>>>
>>>>             Can you think of a specific example? sh:entailment would
>>>>             potentially produce additional triples. But this is the
>>>>             user's choice, and then the user may expect to see
>>>>             additional validation results...
>>>>
>>>>
>>>>         We seem to be in agreement that inferring additional triples
>>>>         will change results. Examples seem obvious; adding a
>>>>         `subClassOf` statement whose subject is any class referenced
>>>>         in a shape will do the trick, but that's far from the only
>>>>         example.
>>>>
>>>>         This seems like a problem to me because I don't see that it's
>>>>         clear where triples like `subClassOf` must appear (data
>>>>         graph? shapes graph? any graph?) for a resource to count as a
>>>>         shape, or to match various constraint components.
>>>
>>>
>>>         To have an effect on sh:scopeClass and sh:class, the
>>>         subClassOf triples must be in the data graph.
>>>
>>>     Is this stated somewhere in the current spec? I haven't been able
>>>     to find it, if so.
>>
>>
>>     For sh:scopeClass, Section 2.1.2:
>>
>>     "Note that, according to the SHACL instance definition, all
>>     the rdfs:subClassOf declarations must exist in the data graph."
>>
>>     For sh:class the same rules apply as for every other constraint
>>     component - it looks for triples in the data graph. We could
>>     theoretically repeat this everywhere, e.g. for sh:minCount, but at
>>     some stage this should be clear. However, given that multiple people
>>     have run into this question recently, I have just added a
>>     clarification to sh:class:
>>
>>
>> https://github.com/w3c/data-shapes/commit/4c0b8f1cbc8faa09624d1a35fc0a8ef564af09b7
>>
>>
>>>
>>>     Also, the question applies equally to cases where the intent is
>>>     presumably that (only?) the data graph counts. For instance: which
>>>     resources count as sh:Shapes?
>>
>>
>>     This would have to be in Section 4, but this is currently under
>>     revision and may be merged with section 2 shortly, so I'll not touch
>>     it right now. But the intent is that any Shape definition triples
>>     such as ex:MyShape rdf:type sh:Shape are only relevant if they are
>>     in the shapes graph.
>>
>>
>>>>         Note that adding a `subClassOf` triple to a shapes graph to
>>>>         affect validation could be considered a feature; I'm unsure
>>>>         whether that feature is supported.
>>>
>>>
>>>         Currently the spec only looks at the data graph.
>>>
>>>>
>>>>         Additionally, `sh:entailment` seems generally
>>>>         under/un-defined. Can inference affect data graphs only? or
>>>>         also shapes graphs? Which triples can be considered by a
>>>>         reasoner and how are inferred triples used by the SHACL
>>>>         semantics?
>>>
>>>
>>>         I have just clarified this to the sh:entailment section:
>>>
>>>
>>> https://github.com/w3c/data-shapes/commit/71a9eeaff0317de0cdca6b36500412dabc922f78
>>>
>>>         I am unsure how many people will actually use sh:entailment,
>>>         so any feedback/requirement may help us add missing details.
>>>         It is very brief right now, indeed.
>>>
>>>
>>>     I think some clear definition is called for; otherwise, I would
>>>     simply remove the feature; is there a functional difference
>>>     between entailment (in this case) and providing a mechanism for
>>>     the user/engine to add arbitrary triples to the data or shapes
>>>     graph during pre-processing? This could be a simpler way to think
>>>     of the problem.
>>
>>
>>     Regardless of whether sh:entailment exists, any implementer or
>>     engine already has any freedom to modify the graphs prior to sending
>>     them to the SHACL engine. This is outside of the SHACL language. The
>>     rest needs to be decided by the WG, for which I cannot speak here.
>>
>>
>>     Holger
>>
>>
>>
>>>
>>>     - Tom
>>>
>>>
>>>         Holger
>>>
>>>
>>>>
>>>>         Some of my other concerns about the specifics of `class` and
>>>>         `instance` definitions seem to be in the process of being
>>>>         fixed up; from a quick reading of the latest editor's draft,
>>>>         this is looking promising.
>>>>
>>>>         - Tom
>>>>
>>>>             Thanks, i
>>>>             Holger
>>>>
>>>>
>>>>
>>>>>
>>>>>             In lieu of a repeat of previous conversations, I'll just
>>>>>             say: For me, as an implementer in waiting, this is a
>>>>>             huge problem. On last reading, very little seemed
>>>>>             unambiguously defined.
>>>>>
>>>>>             - Tom
>>>>>
>>>>>             On Mon, May 9, 2016 at 12:14 PM, Irene Polikoff
>>>>>             <irene@topquadrant.com> wrote:
>>>>>
>>>>>                 Karen,
>>>>>
>>>>>                 As I understand it, RDFS inferencing is one way to
>>>>>                 address this. However,
>>>>>                 RDFS inferencing would do more than what is
>>>>>                 specified here. "Doing more"
>>>>>                 doesn't create a problem, but, on the other hand, it
>>>>>                 is not required.
>>>>>
>>>>>                 Another way to address this is to run a query as
>>>>>                 follows:
>>>>>
>>>>>                 SELECT ?resource
>>>>>                 WHERE {
>>>>>
>>>>>                 ?class rdfs:subClassOf* example:Class1 .
>>>>>                 ?resource a ?class .
>>>>>
>>>>>                 }
>>>>>
>>>>>                 Running this query would not change any graphs. As
>>>>>                 an aside, RDFS
>>>>>                 inferencing is also often done without modifying any
>>>>>                 graphs. Inferences
>>>>>                 are calculated on the fly when users/systems query
>>>>>                 data without any
>>>>>                 materialization of inferred triples. At least, this
>>>>>                 is how triple stores
>>>>>                 that support RDFS inferencing typically work.
>>>>>
>>>>>                 Does your concern have to do with where the
>>>>>                 rdfs:subClassOf triples come
>>>>>                 from - would they exist in the data graph, would
>>>>>                 they exist in the shapes
>>>>>                 graph? They could be in either. If no subclass
>>>>>                 triples are there, then the
>>>>>                 first triple match simply binds ?class to
>>>>>                 example:Class1 and the query
>>>>>                 result is the same as if we were only looking for
>>>>>                 nodes that are connected
>>>>>                 to example:Class1 via rdf:type link.
>>>>>
>>>>>                 It doesn't seem to be a role of SHACL to mandate
>>>>>                 where these triples
>>>>>                 should be located. If they are available in either
>>>>>                 of the graphs, a SHACL
>>>>>                 engine should take them into account. If they are
>>>>>                 not available, then it
>>>>>                 doesn't take them into account.
>>>>>
>>>>>                 In our experience, users typically put the subclass
>>>>>                 triples into the
>>>>>                 shapes graph. At the same time, they need
>>>>>                 flexibility to do whatever fits
>>>>>                 their architecture and processes.
>>>>>
>>>>>
>>>>>                 Irene Polikoff
>>>>>
>>>>>
>>>>>                 On 5/9/16, 1:47 PM, "Karen Coyle"
>>>>>                 <kcoyle@kcoyle.net> wrote:
>>>>>
>>>>>                 >Type
>>>>>                 >The types of a node are its values of rdf:type as
>>>>>                 well as the
>>>>>                 >superclasses of these values.
>>>>>                 >
>>>>>                 >This conflates two different relationships: the
>>>>>                 relationship of a
>>>>>                 >subject to a class (as defined in RDF/RDFS),
>>>>>                 defining the subject as an
>>>>>                 >instance of the class; and the sub-/super-class
>>>>>                 relationships between
>>>>>                 >classes. I don't see how this can be achieved
>>>>>                 without inferencing.
>>>>>                 >
>>>>>                 >If we assume some pre-processing of the data graph
>>>>>                 to include the
>>>>>                 >superclasses, then type is precisely as it is
>>>>>                 defined in RDF - there are
>>>>>                 >just more type statements in the graph.
>>>>>                 >
>>>>>                 >As stated, this is quite an expansion of the
>>>>>                 meaning of type. In
>>>>>                 >addition, it appears to require modifications to
>>>>>                 the data graph to
>>>>>                 >include the super classes of each class (presumably
>>>>>                 up to and including
>>>>>                 >rdfs:Resource).
>>>>>                 >
>>>>>                 >I think it would be best if SHACL defined the shape
>>>>>                 and data graphs as
>>>>>                 >immutable, thus expecting that all operations read
>>>>>                 but do not modify the
>>>>>                 >graphs. I thought we had come to that conclusion.
>>>>>                 >
>>>>>                 >kc
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             --
>>>>>             -Tom Johnson
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         --
>>>>         -Tom Johnson
>>>
>>>
>>>
>>>
>>>
>>>     --
>>>     -Tom Johnson
>>
>>
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>
Received on Tuesday, 10 May 2016 17:44:14 UTC