Re: New Terminology Section from Karen Coyle on 2016-05-10 (public-data-shapes-wg@w3.org from May 2016)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Tue, 10 May 2016 11:28:39 -0700
To: Martynas Jusevičius <martynas@graphity.org>
Cc: public-data-shapes-wg@w3.org
Message-ID: <57322857.4060704@kcoyle.net>
If that is the approach (and I haven't looked to see if the properties 
involved here are defined as annotation properties, but I can say that 
OWL has not entered into the discussion so far), then that must be made 
clear in the document and in the vocabulary. We also cannot assume Jena 
functionality as a part of the standard.
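
To illustrate that the inheritance rules quoted further down do not need Jena, here is a minimal sketch in plain Python. It assumes a graph is just a set of (subject, predicate, object) tuples and reads the rules as "a subclass with no value of its own picks up the values of its superclass". The function name, the prefixed-name strings and the ex: resources are hypothetical, and copying all superclass values in one step is a simplification of how a rule engine would fire.

```python
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
SUBCLASS_OF = "rdfs:subClassOf"
SPIN_CONSTRAINT = "spin:constraint"  # stands in for <http://spinrdf.org/spin#constraint>


def inherit_annotation(graph, prop):
    """Return the triples inferred by the inheritance rule; `graph` is not modified."""
    known = set(graph)  # working copy, so the caller's graph stays read-only
    inferred = set()
    changed = True
    while changed:  # repeat until no new triples appear (fixpoint)
        changed = False
        for (sub, p, sup) in list(known):
            if p != SUBCLASS_OF:
                continue
            # (?class rdf:type rdfs:Class)
            if (sup, RDF_TYPE, RDFS_CLASS) not in known:
                continue
            # noValue(?subClass prop): skip subclasses that already have a value
            if any(s == sub and q == prop for (s, q, _) in known):
                continue
            values = {o for (s, q, o) in known if s == sup and q == prop}
            if values:
                new = {(sub, prop, o) for o in values}
                known |= new
                inferred |= new
                changed = True
    return inferred


# Hypothetical example data: B and C inherit from A, D keeps its own constraint.
example = frozenset({
    ("ex:A", RDF_TYPE, RDFS_CLASS),
    ("ex:B", RDF_TYPE, RDFS_CLASS),
    ("ex:A", SPIN_CONSTRAINT, "ex:c1"),
    ("ex:B", SUBCLASS_OF, "ex:A"),
    ("ex:C", SUBCLASS_OF, "ex:B"),
    ("ex:D", SUBCLASS_OF, "ex:A"),
    ("ex:D", SPIN_CONSTRAINT, "ex:c2"),
})
inferred = inherit_annotation(example, SPIN_CONSTRAINT)
```

The function returns the inferred triples separately instead of adding them to the input, so whether they are ever merged into a data graph or shapes graph remains a decision for the application, not for the rule itself.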

My gut feeling is that we are wavering between a standard, which can be 
realized in any number of applications with varying additional 
functionality, and the description of an actual application. We need to 
tease those apart. (Quickly, I might add.)

kc

On 5/10/16 10:43 AM, Martynas Jusevičius wrote:
> Hey Karen,
>
> I think these 2 approaches can coexist, if you use OO-like inheritance
> in annotation properties, which do not influence RDF semantics. We took
> this approach in one of our base vocabularies:
> https://github.com/Graphity/graphity-processor/blob/master/src/main/resources/org/graphity/processor/gp.ttl
>
> Annotation properties are inherited using Jena rules, similar to those
> I showed in my previous email.
>
> On Tue, May 10, 2016 at 7:35 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>
>> On 5/10/16 12:18 AM, Martynas Jusevičius wrote:
>>>
>>> Since SPIN takes an object-oriented view on inheritance, my guess is
>>> that SHACL does the same.
>>
>>
>> I have also come to that conclusion, but unfortunately this is contrary to
>> the fact that SHACL is defined in RDF, which does not have the concept of
>> inheritance that exists in OO. This is one of the main issues I have with
>> the way that classes are used in SHACL. The difference is nicely summed up
>> here [1] by Gregg Kellogg. It appears to me that the constraint components
>> (section 3) follow the OO but not the RDF definition of class because the
>> classes there determine the properties that are valid for the class
>> definition (whereas RDF would determine the class membership from the
>> properties).
>>
>> kc
>> [1]
>> http://ruby-rdf.github.io/presentations/HydraConnect2015/assets/player/KeynoteDHTMLPlayer.html#66
>>
>>
>>
>>> I had suggested some time ago that it can be defined using simple rules
>>> (Jena rules in this case):
>>>
>>> [constructors: (?class rdf:type rdfs:Class), (?class
>>> <http://spinrdf.org/spin#constructor> ?o), (?subClass rdfs:subClassOf
>>> ?class), noValue(?subClass <http://spinrdf.org/spin#constructor>) ->
>>> (?subClass <http://spinrdf.org/spin#constructor> ?o) ]
>>> [constraints: (?class rdf:type rdfs:Class), (?class
>>> <http://spinrdf.org/spin#constraint> ?o), (?subClass rdfs:subClassOf
>>> ?class), noValue(?subClass <http://spinrdf.org/spin#constraint>) ->
>>> (?subClass <http://spinrdf.org/spin#constraint> ?o) ]
>>>
>>>
>>> https://groups.google.com/forum/m/#!msg/topbraid-users/vKkcn_5Esek/4vXFHq6MBQAJ
>>>
>>>
>>> On Tue, 10 May 2016 at 08:43, Holger Knublauch <holger@topquadrant.com>
>>> wrote:
>>>
>>>      On 10/05/2016 14:31, Tom Johnson wrote:
>>>>
>>>>
>>>>
>>>>      On Mon, May 9, 2016 at 8:18 PM, Holger Knublauch
>>>>      <holger@topquadrant.com> wrote:
>>>>
>>>>          On 10/05/2016 12:30, Tom Johnson wrote:
>>>>>
>>>>>
>>>>>
>>>>>          On Mon, May 9, 2016 at 5:29 PM, Holger Knublauch
>>>>>          <holger@topquadrant.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>              On 10/05/2016 10:11, Tom Johnson wrote:
>>>>>>
>>>>>>              Irene, you say:
>>>>>>
>>>>>>              >"Doing more" doesn't create a problem, but, on the other
>>>>>>              hand, it is not required.
>>>>>>
>>>>>>              I'm really uncertain about this. Couldn't inferring
>>>>>>              further class relations (e.g., by using the entailment
>>>>>>              mechanism included in the spec) cause different results
>>>>>>              for basically every operation in SHACL?
>>>>>
>>>>>
>>>>>              Can you think of a specific example? sh:entailment would
>>>>>              potentially produce additional triples. But this is the
>>>>>              user's choice, and then the user may expect to see
>>>>>              additional validation results...
>>>>>
>>>>>
>>>>>          We seem to be in agreement that inferring additional triples
>>>>>          will change results. Examples seem obvious; adding a
>>>>>          `subClassOf` statement whose subject is any class referenced
>>>>>          in a shape will do the trick, but that's far from the only
>>>>>          example.
>>>>>
>>>>>          This seems like a problem to me because I don't see that it's
>>>>>          clear where triples like `subClassOf` must appear (data
>>>>>          graph? shapes graph? any graph?) for a resource to count as a
>>>>>          shape, or to match various constraint components.
>>>>
>>>>
>>>>          To have an effect on sh:scopeClass and sh:class, the
>>>>          subClassOf triples must be in the data graph.
>>>>
>>>>      Is this stated somewhere in the current spec? I haven't been able
>>>>      to find it, if so.
>>>
>>>
>>>      For sh:scopeClass, Section 2.1.2:
>>>
>>>      "Note that, according to the SHACL instance definition, all the
>>>      rdfs:subClassOf declarations must exist in the data graph."
>>>
>>>      For sh:class the same rules apply as for every other constraint
>>>      component - it looks for triples in the data graph. We could
>>>      theoretically repeat this everywhere, e.g. for sh:minCount, but at
>>>      some stage this should be clear. However, given that multiple people
>>>      have run into this question recently, I have just added a
>>>      clarification to sh:class:
>>>
>>>
>>> https://github.com/w3c/data-shapes/commit/4c0b8f1cbc8faa09624d1a35fc0a8ef564af09b7
>>>
>>>
>>>>
>>>>      Also, the question applies equally to cases where the intent is
>>>>      presumably that (only?) the data graph counts. For instance: which
>>>>      resources count as sh:Shapes?
>>>
>>>
>>>      This would have to be in Section 4, but this is currently under
>>>      revision and may be merged with section 2 shortly, so I'll not touch
>>>      it right now. But the intent is that any Shape definition triples
>>>      such as ex:MyShape rdf:type sh:Shape are only relevant if they are
>>>      in the shapes graph.
>>>
>>>
>>>>>          Note that adding a `subClassOf` triple to a shapes graph to
>>>>>          affect validation could be considered a feature; I'm unsure
>>>>>          whether that feature is supported.
>>>>
>>>>
>>>>          Currently the spec only looks at the data graph.
>>>>
>>>>>
>>>>>          Additionally, `sh:entailment` seems generally
>>>>>          under/un-defined. Can inference affect data graphs only, or
>>>>>          also shapes graphs? Which triples can be considered by a
>>>>>          reasoner and how are inferred triples used by the SHACL
>>>>>          semantics?
>>>>
>>>>
>>>>          I have just clarified this to the sh:entailment section:
>>>>
>>>>
>>>> https://github.com/w3c/data-shapes/commit/71a9eeaff0317de0cdca6b36500412dabc922f78
>>>>
>>>>          I am unsure how many people will actually use sh:entailment,
>>>>          so any feedback/requirement may help us add missing details.
>>>>          It is very brief right now, indeed.
>>>>
>>>>
>>>>      I think some clear definition is called for; otherwise, I would
>>>>      simply remove the feature; is there a functional difference
>>>>      between entailment (in this case) and providing a mechanism for
>>>>      the user/engine to add arbitrary triples to the data or shapes
>>>>      graph during pre-processing? This could be a simpler way to think
>>>>      of the problem.
>>>
>>>
>>>      Regardless of whether sh:entailment exists, any implementer or
>>>      engine already has any freedom to modify the graphs prior to sending
>>>      them to the SHACL engine. This is outside of the SHACL language. The
>>>      rest needs to be decided by the WG, for which I cannot speak here.
>>>
>>>
>>>      Holger
>>>
>>>
>>>
>>>>
>>>>      - Tom
>>>>
>>>>
>>>>          Holger
>>>>
>>>>
>>>>>
>>>>>          Some of my other concerns about the specifics of `class` and
>>>>>          `instance` definitions seem to be in the process of being
>>>>>          fixed up; from a quick reading of the latest editor's draft,
>>>>>          this is looking promising.
>>>>>
>>>>>          - Tom
>>>>>
>>>>>              Thanks,
>>>>>              Holger
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>              In lieu of a repeat of previous conversations, I'll just
>>>>>>              say: For me, as an implementer in waiting, this is a
>>>>>>              huge problem. On last reading, very little seemed
>>>>>>              unambiguously defined.
>>>>>>
>>>>>>              - Tom
>>>>>>
>>>>>>              On Mon, May 9, 2016 at 12:14 PM, Irene Polikoff
>>>>>>              <irene@topquadrant.com> wrote:
>>>>>>
>>>>>>                  Karen,
>>>>>>
>>>>>>                  As I understand it, RDFS inferencing is one way to
>>>>>>                  address this. However,
>>>>>>                  RDFS inferencing would do more than what is
>>>>>>                  specified here. "Doing more"
>>>>>>                  doesn't create a problem, but, on the other hand, it
>>>>>>                  is not required.
>>>>>>
>>>>>>                  Another way to address this is to run a query as
>>>>>>                  follows:
>>>>>>
>>>>>>                  SELECT ?resource
>>>>>>                  WHERE {
>>>>>>
>>>>>>                  ?class rdfs:subClassOf* example:Class1 .
>>>>>>                  ?resource a ?class .
>>>>>>
>>>>>>                  }
>>>>>>
>>>>>>                  Running this query would not change any graphs. As
>>>>>>                  an aside, RDFS
>>>>>>                  inferencing is also often done without modifying any
>>>>>>                  graphs. Inferences
>>>>>>                  are calculated on the fly when users/systems query
>>>>>>                  data without any
>>>>>>                  materialization of inferred triples. At least, this
>>>>>>                  is how triple stores
>>>>>>                  that support RDFS inferencing typically work.
>>>>>>
>>>>>>                  Does your concern have to do with where the
>>>>>>                  rdfs:subClassOf triples come
>>>>>>                  from - would they exist in the data graph, would
>>>>>>                  they exist in the shapes
>>>>>>                  graph? They could be in either. If no subclass
>>>>>>                  triples are there, then the
>>>>>>                  first triple match simply binds ?class to
>>>>>>                  example:Class1 and the query
>>>>>>                  result is the same as if we were only looking for
>>>>>>                  nodes that are connected
>>>>>>                  to example:Class1 via rdf:type link.
>>>>>>
>>>>>>                  It doesn't seem to be a role of SHACL to mandate
>>>>>>                  where these triples
>>>>>>                  should be located. If they are available in either
>>>>>>                  of the graphs, a SHACL
>>>>>>                  engine should take them into account. If they are
>>>>>>                  not available, then it
>>>>>>                  doesn't take them into account.
>>>>>>
>>>>>>                  In our experience, users typically put the subclass
>>>>>>                  triples into the
>>>>>>                  shapes graph. At the same time, they need
>>>>>>                  flexibility to do whatever fits
>>>>>>                  their architecture and processes.
>>>>>>
>>>>>>
>>>>>>                  Irene Polikoff
>>>>>>
>>>>>>
>>>>>>                  On 5/9/16, 1:47 PM, "Karen Coyle"
>>>>>>                  <kcoyle@kcoyle.net> wrote:
>>>>>>
>>>>>>                  >Type
>>>>>>                  >The types of a node are its values of rdf:type as
>>>>>>                  well as the
>>>>>>                  >superclasses of these values.
>>>>>>                  >
>>>>>>                  >This conflates two different relationships: the
>>>>>>                  relationship of a
>>>>>>                  >subject to a class (as defined in RDF/RDFS),
>>>>>>                  defining the subject as an
>>>>>>                  >instance of the class; and the sub-/super-class
>>>>>>                  relationships between
>>>>>>                  >classes. I don't see how this can be achieved
>>>>>>                  without inferencing.
>>>>>>                  >
>>>>>>                  >If we assume some pre-processing of the data graph
>>>>>>                  to include the
>>>>>>                  >superclasses, then type is precisely as it is
>>>>>>                  defined in RDF - there are
>>>>>>                  >just more type statements in the graph.
>>>>>>                  >
>>>>>>                  >As stated, this is quite an expansion of the
>>>>>>                  meaning of type. In
>>>>>>                  >addition, it appears to require modifications to
>>>>>>                  the data graph to
>>>>>>                  >include the super classes of each class (presumably
>>>>>>                  up to and including
>>>>>>                  >rdfs:Resource).
>>>>>>                  >
>>>>>>                  >I think it would be best if SHACL defined the shape
>>>>>>                  and data graphs as
>>>>>>                  >immutable, thus expecting that all operations read
>>>>>>                  but do not modify the
>>>>>>                  >graphs. I thought we had come to that conclusion.
>>>>>>                  >
>>>>>>                  >kc
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>              --
>>>>>>              -Tom Johnson
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>          --
>>>>>          -Tom Johnson
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>      --
>>>>      -Tom Johnson
>>>
>>>
>>
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> m: 1-510-435-8234
>> skype: kcoylenet/+1-510-984-3600
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Tuesday, 10 May 2016 18:31:22 UTC