Re: rdfs:domain and rdfs:range in schema.org from Martynas Jusevičius on 2016-11-26 (public-schemaorg@w3.org from November 2016)

From: Martynas Jusevičius <martynas@graphity.org>
Date: Sat, 26 Nov 2016 23:40:19 +0100
To: Holger Knublauch <holger@topquadrant.com>
Cc: "schema. org Mailing List" <public-schemaorg@w3.org>
Message-ID: <CAE35VmwMmAhBhvpitz_yjXafdR8DyH0ur848VLTPjhdCuWgzCw@mail.gmail.com>
This might be true for schema.org, which is not a true ontology but a
"lightweight vocabulary". But in general, replacing RDFS and OWL with
SHACL means throwing inference and reasoners out of the window. I
think quite a few projects use RDFS reasoning in practice, including
our software.
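
To make the distinction concrete, here is a minimal sketch of the two
readings of a domain declaration. It is only an illustration under stated
assumptions: triples are plain Python tuples, the foo: names are borrowed
from Martin's example quoted further down, and the function names
infer_domain_types and domain_violations are made up for this message.
It is not how any particular reasoner or SHACL engine is implemented, and
it applies a single RDFS rule (rdfs2), not full RDFS entailment.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# All foo: identifiers are illustrative, not part of any real vocabulary.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Inference reading (RDFS rule rdfs2):
    if  p rdfs:domain C  and  s p o,  then  s rdf:type C.
    Returns only the newly inferred type triples."""
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set()
    for s, p, _ in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred - set(triples)


def domain_violations(triples):
    """Constraint reading: report (subject, property) pairs where the
    subject has no asserted type among the property's declared domains."""
    allowed = {}
    types = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            allowed.setdefault(s, set()).add(o)
        elif p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return sorted(
        {(s, p) for s, p, _ in triples
         if p in allowed and not (types.get(s, set()) & allowed[p])}
    )


graph = {
    ("foo:schoolAttended", RDFS_DOMAIN, "foo:Human"),
    ("foo:myDog", RDF_TYPE, "foo:Dog"),
    ("foo:myDog", "foo:schoolAttended", "ACME High School"),
}

# Inference adds a type instead of rejecting the data.
print(infer_domain_types(graph))
# prints {('foo:myDog', 'rdf:type', 'foo:Human')}

# The constraint reading flags the same triple as a problem.
print(domain_violations(graph))
# prints [('foo:myDog', 'foo:schoolAttended')]
```

The same input gives a new type under the first reading and a reported
violation under the second. Software that depends on the first behaviour
gets nothing from a purely constraint-based description.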

On Sat, Nov 26, 2016 at 11:30 PM, Holger Knublauch
<holger@topquadrant.com> wrote:
> IMHO neither rdfs:domain nor schema:domainIncludes are ideal for schema.org.
> The whole notion of "global" property axioms is questionable. schema.org is
> class-centric and supposed to grow. To support its growth, properties should
> be attached locally to classes, in OO style.
>
> rdfs:domain is a global axiom that can be used to infer types in cases where
> no type can be derived from the given instance. In that case, looking up the
> URL of the property itself is a suitable strategy, and the URL would deliver
> the rdfs:domain statement. schema:domainIncludes seems to follow this
> pattern, but without providing particularly useful information.
>
> Even the global property labels and comments are rather unhelpful, because
> they need to cover all use cases of the property. However, in many cases
> where properties are shared between classes, they in fact should have
> different labels and comments. Picking a random example:
>
>     http://schema.org/creator
>
> which states "The creator/author of this CreativeWork." but its domain
> includes CreativeWork and UserComments. Quite likely this property started
> as a property of CreativeWork and then somebody decided they also need some
> kind of "creator" and the English term produced an overlap. The first use of
> the property should not limit future uses.
>
> IMHO a better way of associating properties with classes would be using
> something like SHACL. In the case of schema:creator this could look like
>
> schema:CreativeWork
>     a rdfs:Class ;
>     sh:property [
>         sh:predicate schema:creator ;
>         sh:description "The creator/author of this CreativeWork." ;
>         ... other constraints in the context of CreativeWork
>     ] ;
>     ...
>
> schema:UserComments
>     a rdfs:Class ;
>     sh:property [
>         sh:predicate schema:creator ;
>         sh:description "The creator of this comment." ;
>         ... other constraints in the context of UserComments
>     ] ;
>     ...
>
> This allows any future class to reuse the property without having to update
> a global definition that other applications may have gotten to rely on.
> Furthermore, it allows for class-specific constraints and definitions, e.g.
> properties may get different datatypes or cardinalities.
>
> And, I would recommend against going back to rdfs:domain for schema.org.
> Almost nobody in practice understands its semantics.
>
> Regards,
> Holger
>
>
>
>
> On 25/11/2016 21:28, Dan Brickley wrote:
>>
>> On 24 November 2016 at 04:05, Phil Archer <phila@w3.org> wrote:
>>>
>>> This presents either a problem or an opportunity (and I'd like to know
>>> which is true).
>>>
>>> The opportunity presented by "domainIncludes" is that you can, I think,
>>> use a property on a class that is not listed as a domain. In something
>>> I'm doing right now for the European Commission, I want to use
>>> schema:openingHours on a schema:ContactPoint. Since the domain of
>>> schema:openingHours 'includes' CivicStructure and LocalBusiness, perhaps
>>> that's OK? After all, 'includes' suggests it's not an exhaustive list.
>>> schema:ContactPoint's suggested schema:hoursAvailable property leads to
>>> a more complex schema:OpeningHoursSpecification that is useful for
>>> declaring exceptions - and we want to use that too - but it seems overly
>>> complex for a simple "usually open Monday to Friday 9 - 5" statement.
>>>
>>> So here, domainIncludes, as explained by Dan, wins.
>>>
>>> But... Martin's example shows that's *not* how it's being used. Rather,
>>> it's being used as a constraint language, which I regard as a separate
>>> thing altogether.
>>
>> Martin wrote "instead of throwing a constraint violation error."; I'd
>> suggest this should just be a warning that the use is potentially an
>> obscure, new or niche usage and that consequently it might not be
>> widely understood.
>>
>>> If I put a schema:openingHours property on a schema:ContactPoint, the
>>> structured data tester will say it doesn't understand my data.
>>
>> *the* ? There are several, e.g. Gregg's, Google's
>> (http://developers.google.com/structured-data/testing-tool)
>>
>> I would say that Google's structured data testing tool (SDTT) is
>> somewhat too strict in its tone, and too needy in its requirements,
>> for my taste.
>>
>> Compare to the most recent language on validation in
>> http://schema.org/docs/datamodel.html under "Conformance". This
>> elaborates on schema.org's longstanding and pretty tolerant approach
>> to conformance.
>>
>>> Does that mean my data is invalid for all potential data consumers or
>>> just the search engines?
>>
>> No, neither.
>>
>>> If the data is actually invalid then I'd say that rangeIncludes and
>>> domainIncludes seem to be mis-named. "domainRestrictedToOnly" seems more
>>> honest? Or am I missing something?
>>
>> Schema.org's domainIncludes and rangeIncludes are pretty weak by
>> design. It might be that in many cases we could comfortably enough
>> assert rdfs:domain and rdfs:range too. I'm not sure that would add a
>> great deal of value, and when you look at the kinds of mistakes and
>> errors commonly made in real world data they're often invisible at
>> this level of data analysis anyway...
>>
>> Dan
>>
>>> Phil
>>>
>>>
>>> ==Dan's reply copied from archive for reference ==
>>>
>>> We wanted to leave the flexibility to evolve the schemas incrementally
>>> without breaking "promises" expressed with RDFS's range/domain, and
>>> without adding lots of artificial supertypes to group different types
>>> within a common type.
>>>
>>>
>>>
>>> == Martin's reply Copied from archive for reference ==
>>>
>>> Hi Alex:
>>>
>>> This is because the semantics of RDFS domain and range constructs *imply*
>>> additional type membership instead of *constraining* the applicability
>>> of a property to a class or value.
>>>
>>> With RDFS semantics, a domain spec like so
>>>
>>>      foo:schoolAttended rdfs:domain foo:Human.
>>>
>>> in combination with the statement
>>>
>>>      foo:myDog a foo:Dog ;
>>>                foo:schoolAttended "ACME High School".
>>>
>>> implies that
>>>
>>>      foo:myDog a foo:Human
>>>
>>> instead of throwing a constraint violation error.
>>>
>>>
>>> Also, if a property has multiple classes as its range or domain, you
>>> have to create many useless complex classes in order to avoid unintended
>>> type membership inferences:
>>>
>>> In RDFS, a domain spec like so
>>>
>>>      foo:yearOfBirth rdfs:domain foo:Human, foo:Dog.
>>>
>>> in combination with the statement
>>>
>>>      foo:myDog a foo:Dog ;
>>>                foo:yearOfBirth 1971.
>>>
>>> implies that your dog is a dog and a human:
>>>
>>>      foo:myDog a foo:Human, foo:Dog.
>>>
>>> i.e. the intersection of being a dog and human, whatever that is.
>>>
>>> The only way to avoid this is to use complex class definitions, like so:
>>>
>>>       foo:yearOfBirth rdfs:domain [ a owl:Class;
>>>                                       owl:unionOf (foo:Human foo:Dog) ].
>>>
>>> which will create many, many of those useless classes in the ontology
>>> because of combinatorial effects.
>>>
>>> Martin
>>>
>>> -----------------------------------
>>> martin hepp  http://www.heppnetz.de
>>> mhepp@computer.org          @mfhepp
>>>
>>>
>>>
>>>
>>>> On 21 Nov 2016, at 16:39, Alex Prut <mail@alexprut.com> wrote:
>>>>
>>>> Hello all,
>>>> I'm looking at the schema.org raw ontology implementation and
>>>> documentation, but I can’t find a reason why the ontology was
>>>> implemented using the schema:domainIncludes and schema:rangeIncludes
>>>> properties instead of the standard RDFS rdfs:domain and rdfs:range.
>>>> Thanks,
>>>> Alexandru Pruteanu (M.Sc. in Computer Science at University of Udine)
>>>> mail@alexprut.com
>>>>
>>> --
>>>
>>>
>>> Phil Archer
>>> Data Strategist, W3C
>>> http://www.w3.org/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>>
>
>
Received on Saturday, 26 November 2016 22:40:55 UTC