Re: type and instance and subclass in SHACL documents from Tom Johnson on 2016-03-14 (public-data-shapes-wg@w3.org from March 2016)

From: Tom Johnson <johnson.tom@gmail.com>
Date: Sun, 13 Mar 2016 21:10:16 -0700
To: Irene Polikoff <irene@topquadrant.com>
Cc: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Karen Coyle <kcoyle@kcoyle.net>, RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <CAJeHiNGhb2fJ1OPH137ZGBhnBanqK5DbANEpApDaLZdarnDwcQ@mail.gmail.com>
Irene, I think we're pretty close to seeing eye-to-eye, here. Your summary
of SHACL vs. RDFS's understandings of class membership seems correct to me.
Still, I don't think the problem reduces to whether the RDF(S) axioms are
included.

The issue, as I see it, is that RDFS terminology is regularly used without
clear definitions. The usual definitions are (somewhat vaguely) disavowed
in Section 1.1 (and sometimes elsewhere), but guidance for how to evaluate
whether a resource counts as an "instance" or a "class" is only provided
for narrow cases. Further, Section 1.1 is explicit that the usage is not
uniform.

To drive the point home: some cases are significantly muddy. See, for
example, Section 2[0]:

    shapes are instances of the class sh:Shape (or subclasses of sh:Shape).

Here, I think we mean to say an RDF(S) instance(?). Using the only
alternate definition I can find in SHACL, there would need to be a triple
of the pattern `?s rdf:type sh:Shape` in the *data* graph for a resource to
be a shape. In either case, it's unclear whether subclass transitivity
applies here.
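
To make the two readings concrete, here is a rough sketch, not anything
the SHACL draft specifies. It models a graph as a set of plain
(subject, predicate, object) tuples and checks class membership either
by an explicit `rdf:type` triple alone or by also following explicit
`rdfs:subClassOf` triples. The names `ex:PersonShape` and
`ex:StrictShape`, and the helper functions themselves, are hypothetical
and chosen only for illustration.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def direct_types(graph, node):
    """Classes named by explicit rdf:type triples for node."""
    return {o for s, p, o in graph if s == node and p == RDF_TYPE}


def superclasses(graph, cls):
    """cls plus every class reachable over explicit rdfs:subClassOf triples."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(o for s, p, o in graph
                    if s == current and p == SUBCLASS_OF)
    return seen


def is_instance(graph, node, cls, follow_subclasses):
    """Pattern-based membership test; no RDF(S) axioms are assumed."""
    types = direct_types(graph, node)
    if not follow_subclasses:
        return cls in types
    return any(cls in superclasses(graph, t) for t in types)


# Hypothetical graph: a shape typed only via a subclass of sh:Shape.
graph = {
    ("ex:PersonShape", "rdf:type", "ex:StrictShape"),
    ("ex:StrictShape", "rdfs:subClassOf", "sh:Shape"),
}

print(is_instance(graph, "ex:PersonShape", "sh:Shape", follow_subclasses=False))
print(is_instance(graph, "ex:PersonShape", "sh:Shape", follow_subclasses=True))
```

The two calls disagree on whether `ex:PersonShape` is a shape, which is
the ambiguity at issue: the answer depends on whether subclass
transitivity is part of the definition.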

That these specialized, pattern-driven concepts are semantically overloaded
with terms in common use places a burden on the reader/implementer to peel
apart one meaning from the other. My preferred solution would be to expand
Section 1.1 with formal guidance, mark exceptional cases in the text
clearly, and (ideally, but optionally) avoid RDF(S) language in normative
clauses (i.e. treat "class", "instance", etc. as different but
intuitively related concepts to those handled in SHACL evaluation).

The issue of whether axioms from RDF[1] and RDFS[2] can be assumed is
another source of conflicts. It seems somewhat clear to me that the
intention is that they should *never* be assumed; but as an implementer, I
wouldn't be totally confident. Section 1.1 clarifies that they aren't to be
used "When determining subclass and instance relationships". If RDFS
entailment is added in the way described in Section 12, do examples like
Peter's (ones that reference RDFS terms directly) evaluate differently?
Could this have unexpected consequences?
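
As a rough sketch of how the outcome can flip, consider Peter's example
from earlier in the thread (scope class `rdf:Property`, at least one
`rdfs:label`, data graph `rdfs:range ex:label "range"`). The code below
is a simplified, assumption-laden model: graphs are sets of tuples,
scope is decided purely by explicit `rdf:type` triples, and only a
single RDFS axiomatic triple is added by hand. Full RDFS entailment
would add more. The function names are hypothetical.

```python
RDF_TYPE = "rdf:type"


def in_scope(graph, scope_class):
    """Nodes with an explicit rdf:type triple to scope_class."""
    return {s for s, p, o in graph if p == RDF_TYPE and o == scope_class}


def min_count_violations(graph, scope_class, predicate, min_count):
    """In-scope nodes with fewer than min_count values for predicate."""
    violations = []
    for node in sorted(in_scope(graph, scope_class)):
        count = sum(1 for s, p, o in graph if s == node and p == predicate)
        if count < min_count:
            violations.append(node)
    return violations


# Peter's data graph, as quoted further down in this thread.
data_graph = {("rdfs:range", "ex:label", "range")}

# One RDFS axiomatic triple, added by hand for illustration only.
rdfs_axiom = {("rdfs:range", "rdf:type", "rdf:Property")}

# Pattern-only reading: nothing is in scope, so nothing is violated.
print(min_count_violations(data_graph, "rdf:Property", "rdfs:label", 1))

# With the axiom present, rdfs:range is in scope and has no rdfs:label.
print(min_count_violations(data_graph | rdfs_axiom,
                           "rdf:Property", "rdfs:label", 1))
```

Under these assumptions the first call reports no violations and the
second reports `rdfs:range`, so the same shape and data graph validate
differently depending on whether the axioms are taken to be present.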

I agree that all of this could and should have been stated much more
clearly; it could have saved everyone (including myself) a lot of
intellectual labor in understanding the point, and the group several days
of back and forth.

Best,

Tom

[0] https://www.w3.org/TR/shacl/#h-shape
[1] https://www.w3.org/TR/rdf11-mt/#RDF_axiomatic_triples
[2] https://www.w3.org/TR/rdf11-mt/#RDFS_axiomatic_triples

On Sat, Mar 12, 2016 at 6:13 PM, Irene Polikoff <irene@topquadrant.com>
wrote:

> Tom,
>
> I am trying to understand what makes the SHACL notion of the word instance
> "very different" from the RDFS notion of the word instance.
>
> RDFS specification says:
>
> "Resources may be divided into groups called classes. The members of a
> class are known as *instances* of the class. Classes are themselves
> resources. They are often identified by IRIs
> <http://www.w3.org/TR/rdf11-concepts/#section-IRIs> and may be described
> using RDF properties. The rdf:type
> <https://www.w3.org/TR/rdf-schema/#ch_type> property may be used to state
> that a resource is an instance of a class.”
>
> SHACL relies on rdf:type property to determine if something is an instance
> of a class. If there is a difference here, it is not a very obvious
> difference, but rather a quite nuanced one. And there is certainly no
> contradiction of any kind. Any time SHACL considers X to be an instance of
> Y, so would RDFS. As SHACL relies on rdf:type triples, a SHACL engine may
> sometimes not recognize that a resource is an instance of a particular
> class when a fully RDFS aware program would. I believe when explained in
> such terms, the difference sounds clearer and much less alarming.
>
> RDFS specification prescribes class membership for the resources it
> defines. Such as "rdfs:range is an instance of rdf:Property
> <https://www.w3.org/TR/rdf-schema/#ch_property>”. RDFS vocabulary encodes
> these definitions with triples like “rdfs:range rdf:type rdf:Property”.
>
> If this whole controversy is about whether SHACL engines should act as if
> RDFS vocabulary triples were always a part of the data graph when
> validation is performed, then such a statement could have been made very
> plainly. I think it is worth recording this as an issue. Whatever the cost
> of doing such inclusion, it may be well worth it to avoid the confusion of
> pondering the subtleties of the word “instance”.
>
> Irene
>
>
> From: Tom Johnson <johnson.tom@gmail.com>
> Date: Saturday, March 12, 2016 at 5:17 PM
> To: Irene Polikoff <irene@topquadrant.com>
> Cc: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Karen Coyle <
> kcoyle@kcoyle.net>, RDF Data Shapes Working Group <
> public-data-shapes-wg@w3.org>
> Subject: Re: type and instance and subclass in SHACL documents
>
> I suspect my email crossed streams with Irene's, but I want to answer the
> call for someone who sees the problem here directly: I do, quite clearly;
> at least with respect to the `sh:scopeClass` example.
>
> `sh:scopeClass`'s definition relies on RDFS terminology throughout. It
> would be reasonable for a reader of the spec to assume that the spec (and
> validation process) imports RDFS directly. The total impact of this is
> ambiguous to me, but it clearly creates a problem for terms that are
> defined directly in RDF/RDFS. Text disavowing this *is* included, but
> is rather hidden--and, I think more to Peter's point, flies in the face of
> any claim that SHACL "conforms to RDFS".
>
> It seems to me that SHACL is on safer ground defining itself in terms of
> basic graph patterns evaluated on the data graph, and should avoid using
> RDFS terminology in its formal definitions. Such terminology should
> continue to be used in naming SHACL abstractions where it is deemed to be
> an aid to intuitive understanding (`sh:scopeClass` seems like a good
> example, here).
>
> - Tom
>
> On Sat, Mar 12, 2016 at 2:08 PM, Irene Polikoff <irene@topquadrant.com>
> wrote:
>
>>
>> Peter,
>>
>> Repeating that "SHACL instance is indeed very different from RDFS
>> instance” doesn’t move the conversation forward. The question was “why and
>> how” and you have not answered this question in a way that I could
>> understand.
>>
>> Since no one else in the working group jumped in to answer this question
>> and, on the contrary, several people joined me in asking it, I have to
>> conclude that no one else understands this either. If I am wrong, and there
>> is someone other than Peter who does, please, answer it. If something is
>> indeed very different from another thing, such difference should be
>> apparent to most group members.
>>
>> As I recall, this topic has been brought up by Peter on and off during (at
>> least) the last 12 months. Realistically and practically speaking, if no
>> one else in the working group that is made of experts and experienced
>> practitioners understands this difference, even after a year of
>> discussions, I see this topic as having absolutely no practical relevance.
>> The chance that the broader community would understand it or care to
>> understand it or be impacted by it in any way whatsoever is infinitely
>> close to zero.
>>
>> I would venture even further and say that such unwavering focus on obscure
>> points that make no practical difference is the key obstacle to adoption
>> of RDF technology. As a community, we must overcome this tendency in order
>> to move forward.
>>
>> Irene
>>
>>
>>
>>
>> On 3/12/16, 4:30 PM, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
>> wrote:
>>
>> >The SHACL documents talk about instance.  If this is RDFS instance, then,
>> >yes, SHACL engines would always have to treat rdfs:label as an instance of
>> >rdf:Property.
>> >
>> >This is why I say that the SHACL documents should be very clear every time
>> >that they talk about instance that it is not the common RDFS instance that
>> >they are talking about but some new notion particular to SHACL,
>> >particularly as SHACL uses RDFS vocabulary.
>> >
>> >SHACL instance is indeed very different from RDFS instance.
>> >
>> >peter
>> >
>> >
>> >On 03/12/2016 10:05 AM, Irene Polikoff wrote:
>> >> We need rdf:type to know if something is an instance of a class (note
>> >>that I am saying simply 'instance' because I do not see the difference).
>> >>
>> >> If {rdfs:label rdf:type rdf:Property} triple was provided to a SHACL
>> >>engine, then the violation would be raised.
>> >>
>> >> How else could it be known from the data graph that rdfs:label is a
>> >>property? Or are you saying that SHACL engines should always include
>> >>triples in RDFS vocabulary when they do their processing?
>> >>
>> >> Sent from my iPhone
>> >>
>> >>> On Mar 11, 2016, at 10:36 PM, Peter F. Patel-Schneider
>> >>><pfpschneider@gmail.com> wrote:
>> >>>
>> >>> Using the RDFS definition of instance, rdfs:label is an instance of
>> >>> rdf:Property so it is in the scope of the shape and there is a
>> >>>violation.
>> >>> Using the SHACL definition of instance, rdfs:label is *not* an
>> >>>instance of
>> >>> rdf:Property so it is *not* in scope and there is *no* violation.
>> >>>
>> >>> peter
>> >>>
>> >>>
>> >>>> On 03/11/2016 04:50 PM, Karen Coyle wrote:
>> >>>> Peter, I admit that I, too, am having trouble understanding this.
>> >>>>(And so it
>> >>>> isn't all on Peter, if anyone else "gets it" maybe they could weigh
>> >>>>in.) The
>> >>>> SHACL document uses the term "instance" 78 times. I admit I only
>> >>>>looked at the
>> >>>> first couple of dozen of those uses. For the most part they appear to
>> >>>>me to
>> >>>> conform to the RDFS definition of "instance" - meaning an instance of
>> >>>>class.
>> >>>> In some cases the term is used more colloquially, but those places in
>> >>>>the
>> >>>> document don't seem to be definitional.
>> >>>>
>> >>>> You say that it doesn't validate, but can you say what the difference
>> >>>>is in
>> >>>> the two definitions? I still see it as having to do with the
>> >>>>vocabulary
>> >>>> definition as opposed to the SHACL validation, but you didn't buy
>> >>>>that when I
>> >>>> suggested it. If I were to use a typical OWL-based validation,
>> >>>>rdfs:range
>> >>>> ex:label "range" would be flagged as inconsistent. The same would be
>> >>>>true if I
>> >>>> would have
>> >>>>  ex:someSubject dct:type "text" .
>> >>>> (dct:type has a range of rdf-schema#Class)
>> >>>>
>> >>>> If this isn't the issue, I would sure like to know what is.
>> >>>>
>> >>>> Thanks,
>> >>>> kc
>> >>>>
>> >>>>> On 3/11/16 2:22 PM, Peter F. Patel-Schneider wrote:
>> >>>>> The definition of SHACL depends on "instance".  This can be read to
>> >>>>>mean
>> >>>>> "RDFS instance" or "SHACL instance".  Under the former meaning the
>> >>>>>data graph
>> >>>>> does not validate against the shape.   Under the latter meaning the
>> >>>>>data graph
>> >>>>> does validate against the shape.
>> >>>>>
>> >>>>> peter
>> >>>>>
>> >>>>>
>> >>>>>> On 03/11/2016 02:15 PM, Irene Polikoff wrote:
>> >>>>>> I don't understand what you mean by
>> >>>>>>
>> >>>>>> "validates against this shape under SHACL instance but not under
>> >>>>>>RDFS
>> >>>>>> instance."
>> >>>>>>
>> >>>>>> I am not able to parse the sentence.
>> >>>>>>
>> >>>>>> What are you doing? Taking a shape described and the graph
>> >>>>>>described and
>> >>>>>> running it against SHACL engine? What execution validates and what
>> >>>>>> execution doesn't validate?
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> Irene
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On 3/11/16, 5:03 PM, "Peter F. Patel-Schneider"
>> >>>>>><pfpschneider@gmail.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>>> On 03/11/2016 01:01 PM, Karen Coyle wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> On 3/11/16 11:43 AM, Peter F. Patel-Schneider wrote:
>> >>>>>>>>> Consider the following shape (using obvious prefix declarations)
>> >>>>>>>>>
>> >>>>>>>>> sh:propertyShape a sh:Shape ;
>> >>>>>>>>>   sh:scopeClass rdf:Property ;
>> >>>>>>>>>   sh:property [ sh:predicate rdfs:label ;
>> >>>>>>>>>                 sh:minCount 1 ] .
>> >>>>>>>>>
>> >>>>>>>>> The data graph (using obvious prefix declarations)
>> >>>>>>>>>
>> >>>>>>>>> rdfs:range ex:label "range" .
>> >>>>>>>>>
>> >>>>>>>>> validates against this shape under SHACL instance but not under
>> >>>>>>>>>RDFS
>> >>>>>>>>> instance.
>> >>>>>>>>
>> >>>>>>>> Isn't this a problem with every vocabulary and not just RDFS? If
>> >>>>>>>>the
>> >>>>>>>> rules of
>> >>>>>>>> the vocabulary (such as domain and range) are not encoded as such
>> >>>>>>>>in
>> >>>>>>>> SHACL
>> >>>>>>>> then the SHACL result can be "in violation" of the vocabulary
>> >>>>>>>> definition.
>> >>>>>>>>
>> >>>>>>>> Now, if that is the case then I understand that violating the
>> >>>>>>>>foundation
>> >>>>>>>> vocabulary of RDF/RDFS may be more grave than violating a
>> >>>>>>>>user-developed
>> >>>>>>>> vocabulary, and in some cases doing the latter may indeed be the
>> >>>>>>>> intention of
>> >>>>>>>> the SHACL definition. So do we want to build into SHACL that it
>> >>>>>>>>must
>> >>>>>>>> follow
>> >>>>>>>> RDF/RDFS property and class definitions? And how feasible is
>> >>>>>>>>that?
>> >>>>>>>>
>> >>>>>>>> kc
>> >>>>>>>
>> >>>>>>> This is only a real problem because SHACL uses "instance" in its
>> >>>>>>> specification, this term is also used centrally in RDFS, and SHACL
>> >>>>>>>uses
>> >>>>>>> RDFS
>> >>>>>>> vocabulary.
>> >>>>>>>
>> >>>>>>> The question then is how to read "instance" in SHACL
>> >>>>>>>documentation, i.e.,
>> >>>>>>> how
>> >>>>>>> to prevent readers of the SHACL documentation from seeing "RDFS
>> >>>>>>>instance"
>> >>>>>>> where "SHACL instance" is meant.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> peter
>> >>>
>>
>>
>>
>>
>
>
> --
> -Tom Johnson
>



-- 
-Tom Johnson
Received on Monday, 14 March 2016 04:11:25 UTC