Re: New Terminology Section from Tom Johnson on 2016-05-10 (public-data-shapes-wg@w3.org from May 2016)

From: Tom Johnson <johnson.tom@gmail.com>
Date: Tue, 10 May 2016 11:43:04 -0700
To: Karen Coyle <kcoyle@kcoyle.net>
Cc: Martynas Jusevičius <martynas@graphity.org>, RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <CAJeHiNHSfKGfs2wjZdgcBq3=jwNWSDtwGXe8_38tLgbPmacEow@mail.gmail.com>
> My gut feeling is that we are wavering between a standard, which can be
> realized in any number of applications with varying additional
> functionality, and the description of an actual application. We need to
> tease those apart. (Quickly, I might add.)

+1

On Tue, May 10, 2016 at 11:28 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:

> If that is the approach (and I haven't looked to see if the properties
> involved here are defined as annotation properties, but I can say that OWL
> has not entered into the discussion so far), then that must be made clear
> in the document and in the vocabulary. We also cannot assume Jena
> functionality as a part of the standard.
>
> My gut feeling is that we are wavering between a standard, which can be
> realized in any number of applications with varying additional
> functionality, and the description of an actual application. We need to
> tease those apart. (Quickly, I might add.)
>
> kc
>
>
> On 5/10/16 10:43 AM, Martynas Jusevičius wrote:
>
>> Hey Karen,
>>
>> I think these 2 approaches can coexist, if you use OO-like inheritance
>> in annotation properties, which do not influence RDF semantics. We took
>> this approach in one of our base vocabularies:
>>
>> https://github.com/Graphity/graphity-processor/blob/master/src/main/resources/org/graphity/processor/gp.ttl
>>
>> Annotation properties are inherited using Jena rules, similar to those
>> I showed in my previous email.
>>
>> On Tue, May 10, 2016 at 7:35 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>>
>>>
>>> On 5/10/16 12:18 AM, Martynas Jusevičius wrote:
>>>
>>>>
>>>> Since SPIN takes an object-oriented view on inheritance, my guess is
>>>> that SHACL does the same.
>>>>
>>>
>>>
>>> I have also come to that conclusion, but unfortunately this is contrary to
>>> the fact that SHACL is defined in RDF, which does not have the concept of
>>> inheritance that exists in OO. This is one of the main issues I have with
>>> the way that classes are used in SHACL. The difference is nicely summed up
>>> here [1] by Gregg Kellogg. It appears to me that the constraint components
>>> (section 3) follow the OO but not the RDF definition of class because the
>>> classes there determine the properties that are valid for the class
>>> definition (whereas RDF would determine the class membership from the
>>> properties).
>>>
>>> kc
>>> [1]
>>>
>>> http://ruby-rdf.github.io/presentations/HydraConnect2015/assets/player/KeynoteDHTMLPlayer.html#66
>>>
>>>
>>>
>>>> I had suggested some time ago that it can be
>>>> defined using simple rules (Jena rules in this case):
>>>>
>>>> [constructors: (?class rdf:type rdfs:Class), (?class
>>>> <http://spinrdf.org/spin#constructor> ?o), (?subClass rdfs:subClassOf
>>>> ?class), noValue(?subClass <http://spinrdf.org/spin#constructor>) ->
>>>> (?subClass <http://spinrdf.org/spin#constructor> ?o) ]
>>>> [constraints: (?class rdf:type rdfs:Class), (?class
>>>> <http://spinrdf.org/spin#constraint> ?o), (?subClass rdfs:subClassOf
>>>> ?class), noValue(?subClass <http://spinrdf.org/spin#constraint>) ->
>>>> (?subClass <http://spinrdf.org/spin#constraint> ?o) ]
>>>>
>>>>
>>>>
>>>> https://groups.google.com/forum/m/#!msg/topbraid-users/vKkcn_5Esek/4vXFHq6MBQAJ
>>>>
>>>>
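A minimal Python sketch of what the "constraints" rule quoted above appears to do, offered as an illustration and not as Jena or SPIN code. Assumptions: triples are plain tuples, the function name inherit_constraints and the ex: names are hypothetical, the (?class rdf:type rdfs:Class) guard is omitted, and all of a superclass's constraints are copied in one step, which a real Jena rule engine may handle differently.

```python
SUBCLASS_OF = "rdfs:subClassOf"
CONSTRAINT = "spin:constraint"


def inherit_constraints(triples):
    """Return a new triple set in which classes without their own
    constraint receive the constraints of a direct superclass."""
    result = set(triples)
    changed = True
    while changed:  # repeat until no rule application adds anything
        changed = False
        for sub, pred, sup in list(result):
            if pred != SUBCLASS_OF:
                continue
            # Mirrors noValue(?subClass spin:constraint): skip classes
            # that already have any constraint value.
            if any(s == sub and p == CONSTRAINT for s, p, _ in result):
                continue
            inherited = {
                (sub, CONSTRAINT, o)
                for s, p, o in result
                if s == sup and p == CONSTRAINT
            }
            if inherited:
                result |= inherited
                changed = True
    return result


# Hypothetical example hierarchy: C < B < A, and D < A with its own constraint.
example = {
    ("ex:B", SUBCLASS_OF, "ex:A"),
    ("ex:C", SUBCLASS_OF, "ex:B"),
    ("ex:D", SUBCLASS_OF, "ex:A"),
    ("ex:A", CONSTRAINT, "ex:c1"),
    ("ex:D", CONSTRAINT, "ex:c2"),
}
expanded = inherit_constraints(example)
```

In this sketch ex:B and ex:C pick up ex:c1, while ex:D keeps only ex:c2, which is the override-style (OO-like) behaviour under discussion. The input set is left untouched; the inherited triples exist only in the returned copy.
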
>>>> On Tue, 10 May 2016 at 08:43, Holger Knublauch
>>>> <holger@topquadrant.com> wrote:
>>>>
>>>>      On 10/05/2016 14:31, Tom Johnson wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>>      On Mon, May 9, 2016 at 8:18 PM, Holger Knublauch
>>>>>      <holger@topquadrant.com> wrote:
>>>>>
>>>>>          On 10/05/2016 12:30, Tom Johnson wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>          On Mon, May 9, 2016 at 5:29 PM, Holger Knublauch
>>>>>>          <holger@topquadrant.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>              On 10/05/2016 10:11, Tom Johnson wrote:
>>>>>>
>>>>>>>
>>>>>>>              Irene, you say:
>>>>>>>
>>>>>>>              > "Doing more" doesn't create a problem, but, on the other
>>>>>>>              > hand, it is not required.
>>>>>>>
>>>>>>>              I'm really uncertain about this. Couldn't inferring
>>>>>>>              further class relations (e.g., by using the entailment
>>>>>>>              mechanism included in the spec) cause different results
>>>>>>>              for basically every operation in SHACL?
>>>>>>>
>>>>>>
>>>>>>
>>>>>>              Can you think of a specific example? sh:entailment would
>>>>>>              potentially produce additional triples. But this is the
>>>>>>              user's choice, and then the user may expect to see
>>>>>>              additional validation results...
>>>>>>
>>>>>>
>>>>>>          We seem to be in agreement that inferring additional triples
>>>>>>          will change results. Examples seem obvious; adding a
>>>>>>          `subClassOf` statement whose subject is any class referenced
>>>>>>          in a shape will do the trick, but that's far from the only
>>>>>>          example.
>>>>>>
>>>>>>          This seems like a problem to me because I don't see that it's
>>>>>>          clear where triples like `subClassOf` must appear (data
>>>>>>          graph? shapes graph? any graph?) for a resource to count as a
>>>>>>          shape, or to match various constraint components.
>>>>>>
>>>>>
>>>>>
>>>>>          To have an effect on sh:scopeClass and sh:class, the
>>>>>          subClassOf triples must be in the data graph.
>>>>>
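To make the point above concrete, here is a small Python sketch of a class check that reads only the data graph. It is an illustration under stated assumptions, not the SHACL algorithm: triples are plain tuples, and the names is_instance, superclasses_or_self and the ex: resources are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclasses_or_self(graph, cls):
    """Return cls plus everything reachable from it via rdfs:subClassOf."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def is_instance(data_graph, node, cls):
    """True if node has an rdf:type whose superclass closure contains cls.
    Both the type triples and the subClassOf triples come from data_graph."""
    return any(
        cls in superclasses_or_self(data_graph, o)
        for s, p, o in data_graph
        if s == node and p == RDF_TYPE
    )


data_graph = {("ex:alice", RDF_TYPE, "ex:Student")}
shapes_graph = {("ex:Student", SUBCLASS_OF, "ex:Person")}

# The subclass triple sits only in the shapes graph, so it is not seen.
in_shapes_only = is_instance(data_graph, "ex:alice", "ex:Person")

# Once the same triple is also part of the data graph, the check succeeds.
in_data = is_instance(data_graph | shapes_graph, "ex:alice", "ex:Person")
```

Here in_shapes_only is False and in_data is True: the same subClassOf statement changes the outcome depending on which graph holds it.
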
>>>>>      Is this stated somewhere in the current spec? I haven't been able
>>>>>      to find it, if so.
>>>>>
>>>>
>>>>
>>>>      For sh:scopeClass, Section 2.1.2:
>>>>
>>>>      "Note that, according to the SHACL instance definition, all the
>>>>      rdfs:subClassOf declarations must exist in the data graph."
>>>>
>>>>      For sh:class the same rules apply as for every other constraint
>>>>      component - it looks for triples in the data graph. We could
>>>>      theoretically repeat this everywhere, e.g. for sh:minCount, but at
>>>>      some stage this should be clear. However, given that multiple people
>>>>      have run into this question recently, I have just added a
>>>>      clarification to sh:class:
>>>>
>>>>
>>>>
>>>> https://github.com/w3c/data-shapes/commit/4c0b8f1cbc8faa09624d1a35fc0a8ef564af09b7
>>>>
>>>>
>>>>
>>>>>      Also, the question applies equally to cases where the intent is
>>>>>      presumably that (only?) the data graph counts. For instance: which
>>>>>      resources count as sh:Shapes?
>>>>>
>>>>
>>>>
>>>>      This would have to be in Section 4, but this is currently under
>>>>      revision and may be merged with section 2 shortly, so I'll not touch
>>>>      it right now. But the intent is that any Shape definition triples
>>>>      such as ex:MyShape rdf:type sh:Shape are only relevant if they are
>>>>      in the shapes graph.
>>>>
>>>>
>>>>>>          Note that adding a `subClassOf` triple to a shapes graph to
>>>>>>          affect validation could be considered a feature; I'm unsure
>>>>>>          whether that feature is supported.
>>>>>>
>>>>>
>>>>>
>>>>>          Currently the spec only looks at the data graph.
>>>>>
>>>>>
>>>>>>          Additionally, `sh:entailment` seems generally
>>>>>>          under/un-defined. Can inference affect data graphs only? Or
>>>>>>          also shapes graphs? Which triples can be considered by a
>>>>>>          reasoner and how are inferred triples used by the SHACL
>>>>>>          semantics?
>>>>>>
>>>>>
>>>>>
>>>>>          I have just clarified this in the sh:entailment section:
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/w3c/data-shapes/commit/71a9eeaff0317de0cdca6b36500412dabc922f78
>>>>>
>>>>>          I am unsure how many people will actually use sh:entailment,
>>>>>          so any feedback/requirement may help us add missing details.
>>>>>          It is very brief right now, indeed.
>>>>>
>>>>>
>>>>>      I think some clear definition is called for; otherwise, I would
>>>>>      simply remove the feature; is there a functional difference
>>>>>      between entailment (in this case) and providing a mechanism for
>>>>>      the user/engine to add arbitrary triples to the data or shapes
>>>>>      graph during pre-processing? This could be a simpler way to think
>>>>>      of the problem.
>>>>>
>>>>
>>>>
>>>>      Regardless of whether sh:entailment exists, any implementer or
>>>>      engine already has the freedom to modify the graphs prior to sending
>>>>      them to the SHACL engine. This is outside of the SHACL language. The
>>>>      rest needs to be decided by the WG, for which I cannot speak here.
>>>>
>>>>
>>>>      Holger
>>>>
>>>>
>>>>
>>>>
>>>>>      - Tom
>>>>>
>>>>>
>>>>>          Holger
>>>>>
>>>>>
>>>>>
>>>>>>          Some of my other concerns about the specifics of `class` and
>>>>>>          `instance` definitions seem to be in the process of being
>>>>>>          fixed up; from a quick reading of the latest editor's draft,
>>>>>>          this is looking promising.
>>>>>>
>>>>>>          - Tom
>>>>>>
>>>>>>              Thanks,
>>>>>>              Holger
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>              In lieu of a repeat of previous conversations, I'll just
>>>>>>>              say: For me, as an implementer in waiting, this is a
>>>>>>>              huge problem. On last reading, very little seemed
>>>>>>>              unambiguously defined.
>>>>>>>
>>>>>>>              - Tom
>>>>>>>
>>>>>>>              On Mon, May 9, 2016 at 12:14 PM, Irene Polikoff
>>>>>>>              <irene@topquadrant.com> wrote:
>>>>>>>
>>>>>>>                  Karen,
>>>>>>>
>>>>>>>                  As I understand it, RDFS inferencing is one way to
>>>>>>>                  address this. However,
>>>>>>>                  RDFS inferencing would do more than what is
>>>>>>>                  specified here. "Doing more"
>>>>>>>                  doesn't create a problem, but, on the other hand, it
>>>>>>>                  is not required.
>>>>>>>
>>>>>>>                  Another way to address this is to run a query as
>>>>>>>                  follows:
>>>>>>>
>>>>>>>                  SELECT ?resource
>>>>>>>                  WHERE {
>>>>>>>
>>>>>>>                  ?class rdfs:subClassOf* example:Class1 .
>>>>>>>                  ?resource a ?class .
>>>>>>>
>>>>>>>                  }
>>>>>>>
>>>>>>>                  Running this query would not change any graphs. As
>>>>>>>                  an aside, RDFS
>>>>>>>                  inferencing is also often done without modifying any
>>>>>>>                  graphs. Inferences
>>>>>>>                  are calculated on the fly when users/systems query
>>>>>>>                  data without any
>>>>>>>                  materialization of inferred triples. At least, this
>>>>>>>                  is how triple stores
>>>>>>>                  that support RDFS inferencing typically work.
>>>>>>>
>>>>>>>                  Does your concern have to do with where the
>>>>>>>                  rdfs:subClassOf triples come
>>>>>>>                  from - would they exist in the data graph, would
>>>>>>>                  they exist in the shapes
>>>>>>>                  graph? They could be in either. If no subclass
>>>>>>>                  triples are there, then the
>>>>>>>                  first triple match simply binds ?class to
>>>>>>>                  example:Class1 and the query
>>>>>>>                  result is the same as if we were only looking for
>>>>>>>                  nodes that are connected
>>>>>>>                  to example:Class1 via rdf:type link.
>>>>>>>
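The behaviour of the query described above can be sketched in plain Python. This is an illustration only, assuming triples are stored as tuples; the function names and the ex: resources are hypothetical and no SPARQL engine is involved.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def subclasses_or_self(triples, cls):
    """Classes that reach cls via zero or more rdfs:subClassOf links.
    Zero links means cls itself is always included, like the '*' path."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(triples, cls):
    """Resources typed with cls or with any of its (transitive) subclasses."""
    classes = subclasses_or_self(triples, cls)
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


# Hypothetical graph with one subclass link.
graph = {
    ("ex:Class2", SUBCLASS_OF, "ex:Class1"),
    ("ex:a", RDF_TYPE, "ex:Class1"),
    ("ex:b", RDF_TYPE, "ex:Class2"),
}
# The same graph without any subclass triples.
flat_graph = {t for t in graph if t[1] != SUBCLASS_OF}

with_subclasses = instances_of(graph, "ex:Class1")
without_subclasses = instances_of(flat_graph, "ex:Class1")
```

With the subclass triple present, both ex:a and ex:b are returned; without it, only ex:a is. In neither case is the input set modified, which matches the observation that the query reads the graph without adding triples.
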
>>>>>>>                  It doesn't seem to be a role of SHACL to mandate
>>>>>>>                  where these triples
>>>>>>>                  should be located. If they are available in either
>>>>>>>                  of the graphs, a SHACL
>>>>>>>                  engine should take them into account. If they are
>>>>>>>                  not available, then it
>>>>>>>                  doesn't take them into account.
>>>>>>>
>>>>>>>                  In our experience, users typically put the subclass
>>>>>>>                  triples into the
>>>>>>>                  shapes graph. At the same time, they need
>>>>>>>                  flexibility to do whatever fits
>>>>>>>                  their architecture and processes.
>>>>>>>
>>>>>>>
>>>>>>>                  Irene Polikoff
>>>>>>>
>>>>>>>
>>>>>>>                  On 5/9/16, 1:47 PM, "Karen Coyle"
>>>>>>>                  <kcoyle@kcoyle.net> wrote:
>>>>>>>
>>>>>>>                  >Type
>>>>>>>                  >The types of a node are its values of rdf:type as
>>>>>>>                  well as the
>>>>>>>                  >superclasses of these values.
>>>>>>>                  >
>>>>>>>                  >This conflates two different relationships: the
>>>>>>>                  relationship of a
>>>>>>>                  >subject to a class (as defined in RDF/RDFS),
>>>>>>>                  defining the subject as an
>>>>>>>                  >instance of the class; and the sub-/super-class
>>>>>>>                  relationships between
>>>>>>>                  >classes. I don't see how this can be achieved
>>>>>>>                  without inferencing.
>>>>>>>                  >
>>>>>>>                  >If we assume some pre-processing of the data graph
>>>>>>>                  to include the
>>>>>>>                  >superclasses, then type is precisely as it is
>>>>>>>                  defined in RDF - there are
>>>>>>>                  >just more type statements in the graph.
>>>>>>>                  >
>>>>>>>                  >As stated, this is quite an expansion of the
>>>>>>>                  meaning of type. In
>>>>>>>                  >addition, it appears to require modifications to
>>>>>>>                  the data graph to
>>>>>>>                  >include the super classes of each class (presumably
>>>>>>>                  up to and including
>>>>>>>                  >rdfs:Resource).
>>>>>>>                  >
>>>>>>>                  >I think it would be best if SHACL defined the shape
>>>>>>>                  and data graphs as
>>>>>>>                  >immutable, thus expecting that all operations read
>>>>>>>                  but do not modify the
>>>>>>>                  >graphs. I thought we had come to that conclusion.
>>>>>>>                  >
>>>>>>>                  >kc
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>              --
>>>>>>>              -Tom Johnson
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>          --
>>>>>>          -Tom Johnson
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>      --
>>>>>      -Tom Johnson
>>>>>
>>>>
>>>>
>>>>
>>> --
>>> Karen Coyle
>>> kcoyle@kcoyle.net http://kcoyle.net
>>> m: 1-510-435-8234
>>> skype: kcoylenet/+1-510-984-3600
>>>
>>>
>>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>
>


-- 
-Tom Johnson
Received on Tuesday, 10 May 2016 18:44:14 UTC