Re: rdfs:domain and rdfs:range in schema.org from Holger Knublauch on 2016-11-26 (public-schemaorg@w3.org from November 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Sun, 27 Nov 2016 08:46:39 +1000
To: Martynas Jusevičius <martynas@graphity.org>
Cc: "schema. org Mailing List" <public-schemaorg@w3.org>
Message-ID: <8b269884-d9b5-44cd-eba8-d76931b11dbd@topquadrant.com>
On 27/11/2016 8:40, Martynas Jusevičius wrote:
> This might be true for schema.org which is not a true ontology but a
> "lightweight vocabulary". But in general replacing RDFS and OWL with
> SHACL means throwing inference and reasoners out of the window. I
> think quite a few projects use RDFS reasoning in practice, including
> our software.

Yes, I am only referring to schema.org here. In general, SHACL is
designed so that people can combine it with RDFS or other languages. But
in the case of schema.org, rdfs:domain would again enforce global axioms
and "close off" certain use cases.

Holger


>
> On Sat, Nov 26, 2016 at 11:30 PM, Holger Knublauch
> <holger@topquadrant.com> wrote:
>> IMHO neither rdfs:domain nor schema:domainIncludes are ideal for schema.org.
>> The whole notion of "global" property axioms is questionable. schema.org is
>> class-centric and supposed to grow. To support its growth, properties should
>> be attached locally to classes, in OO style.
>>
>> rdfs:domain is a global axiom that can be used to infer types in cases where
>> no type can be derived from the given instance. In that case, looking up the
>> URL of the property itself is a suitable strategy, and the URL would deliver
>> the rdfs:domain statement. schema:domainIncludes seems to follow this
>> pattern, but without providing particularly useful information.
>>
>> Even the global property labels and comments are rather unhelpful, because
>> they need to cover all use cases of the property. However, in many cases
>> where properties are shared between classes, they in fact should have
>> different labels and comments. Picking a random example:
>>
>>      http://schema.org/creator
>>
>> which states "The creator/author of this CreativeWork." but its domain
>> includes CreativeWork and UserComments. Quite likely this property started
>> as a property of CreativeWork and then somebody decided they also need some
>> kind of "creator" and the English term produced an overlap. The first use of
>> the property should not limit future uses.
>>
>> IMHO a better way of associating properties with classes would be using
>> something like SHACL. In the case of schema:creator this could look like
>>
>> schema:CreativeWork
>>      a rdfs:Class ;
>>      sh:property [
>>          sh:predicate schema:creator ;
>>          sh:description "The creator/author of this CreativeWork." ;
>>          ... other constraints in the context of CreativeWork
>>      ] ;
>>      ...
>>
>> schema:UserComments
>>      a rdfs:Class ;
>>      sh:property [
>>          sh:predicate schema:creator ;
>>          sh:description "The creator of this comment." ;
>>          ... other constraints in the context of UserComments
>>      ] ;
>>      ...
>>
>> This allows any future class to reuse the property without having to update
>> a global definition that other applications may have gotten to rely on.
>> Furthermore, it allows for class-specific constraints and definitions, e.g.
>> properties may get different datatypes or cardinalities.
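
The class-local approach described above can be sketched in a few lines of plain Python. This is a simplified stand-in for SHACL property shapes, not SHACL itself: the SHAPES table, the validate function, the max_count field, and the ex: identifiers are all hypothetical names chosen for illustration, and the cardinality of 1 is an assumed constraint that does not come from schema.org.

```python
# Hypothetical, simplified stand-in for SHACL property shapes: each class
# carries its own description and constraints for the shared property.
SHAPES = {
    "schema:CreativeWork": {
        "schema:creator": {
            "description": "The creator/author of this CreativeWork.",
            "max_count": None,  # no cardinality limit in this context
        },
    },
    "schema:UserComments": {
        "schema:creator": {
            "description": "The creator of this comment.",
            "max_count": 1,  # assumed cardinality, for illustration only
        },
    },
}


def validate(node_type, properties):
    """Return violation messages for one node; never infers new types."""
    violations = []
    for prop, shape in SHAPES.get(node_type, {}).items():
        values = properties.get(prop, [])
        limit = shape["max_count"]
        if limit is not None and len(values) > limit:
            violations.append(
                f"{node_type}: {prop} has {len(values)} values, "
                f"at most {limit} allowed"
            )
    return violations


two_creators = {"schema:creator": ["ex:alice", "ex:bob"]}
comment_report = validate("schema:UserComments", two_creators)
work_report = validate("schema:CreativeWork", two_creators)
print(comment_report)
print(work_report)
```

The same data is reported as a violation for one class and accepted for the other, and adding a third class only means adding a new entry to the table; no shared, global definition of the property has to change.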
>>
>> And, I would recommend against going back to rdfs:domain for schema.org.
>> Almost nobody in practice understands its semantics.
>>
>> Regards,
>> Holger
>>
>>
>>
>>
>> On 25/11/2016 21:28, Dan Brickley wrote:
>>> On 24 November 2016 at 04:05, Phil Archer <phila@w3.org> wrote:
>>>> This presents either a problem or an opportunity (and I'd like to know
>>>> which is true).
>>>>
>>>> The opportunity presented by "domainIncludes" is that you can, I think,
>>>> use a property on a class that is not listed as a domain. In something
>>>> I'm doing right now for the European Commission, I want to use
>>>> schema:openingHours on a schema:ContactPoint. Since the domain of
>>>> schema:openingHours 'includes' CivicStructure and LocalBusiness, perhaps
>>>> that's OK? After all, 'includes' suggests it's not an exhaustive list.
>>>> schema:ContactPoint's suggested schema:hoursAvailable property leads to
>>>> a more complex schema:OpeningHoursSpecification that is useful for
>>>> declaring exceptions - and we want to use that too - but it seems overly
>>>> complex for a simple "usually open Monday to Friday 9 - 5" statement.
>>>>
>>>> So here, domainIncludes, as explained by Dan, wins.
>>>>
>>>> But... Martin's example shows that's *not* how it's being used. Rather,
>>>> it's being used as a constraint language, which I regard as a separate
>>>> thing altogether.
>>> Martin wrote "instead of throwing a constraint violation error."; I'd
>>> suggest this should just be a warning that the use is potentially an
>>> obscure, new or niche usage and that consequently it might not be
>>> widely understood.
>>>
>>>> If I put a schema:openingHours property on a schema:ContactPoint, the
>>>> structured data tester will say it doesn't understand my data.
>>> *the* ? There are several, e.g. Gregg's, Google's
>>> (http://developers.google.com/structured-data/testing-tool)
>>>
>>> I would say that Google's structured data testing tool (SDTT) is
>>> somewhat too strict in its tone, and too needy in its requirements,
>>> for my taste.
>>>
>>> Compare to the most recent language on validation in
>>> http://schema.org/docs/datamodel.html under "Conformance". This
>>> elaborates on schema.org's longstanding and pretty tolerant approach
>>> to conformance.
>>>
>>>> Does that mean my data is invalid for all potential data consumers or
>>>> just the search engines?
>>> No, neither.
>>>
>>>> If the data is actually invalid then I'd say that rangeIncludes and
>>>> domainIncludes seem to be mis-named. "domainRestrictedToOnly" seems more
>>>> honest? Or am I missing something?
>>> Schema.org's domainIncludes and rangeIncludes are pretty weak by
>>> design. It might be that in many cases we could comfortably enough
>>> assert rdfs:domain and rdfs:range too. I'm not sure that would add a
>>> great deal of value, and when you look at the kinds of mistakes and
>>> errors commonly made in real world data they're often invisible at
>>> this level of data analysis anyway...
>>>
>>> Dan
>>>
>>>> Phil
>>>>
>>>>
>>>> == Dan's reply copied from archive for reference ==
>>>>
>>>> We wanted to leave the flexibility to evolve the schemas incrementally
>>>> without breaking "promises" expressed with RDFS's range/domain, and
>>>> without adding lots of artificial supertypes to group different types
>>>> within a common type.
>>>>
>>>>
>>>>
>>>> == Martin's reply copied from archive for reference ==
>>>>
>>>> Hi Alex:
>>>>
>>>> This is because the semantics of RDFS domain and range constructs
>>>> *imply* additional type membership instead of *constraining* the
>>>> applicability of a property to a class or value.
>>>>
>>>> With RDFS semantics, a domain spec like so
>>>>
>>>>       foo:schoolAttended rdfs:domain foo:Human.
>>>>
>>>> in combination with the statement
>>>>
>>>>       foo:myDog a foo:Dog ;
>>>>                 foo:schoolAttended "ACME High School".
>>>>
>>>> implies that
>>>>
>>>>       foo:myDog a foo:Human
>>>>
>>>> instead of throwing a constraint violation error.
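
The inference described above corresponds to RDFS entailment rule rdfs2: from (p rdfs:domain C) and (s p o), conclude (s rdf:type C). A minimal sketch in plain Python follows, with triples modelled as tuples of strings. The function name infer_domain_types is hypothetical, no RDF library is involved, and a full RDFS reasoner applies many more rules than this one.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Apply rule rdfs2 once and return only the newly inferred triples."""
    # Collect the declared domain classes of every property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # Every subject that uses a property gets each of its domain classes.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred - set(triples)


graph = {
    ("foo:schoolAttended", RDFS_DOMAIN, "foo:Human"),
    ("foo:myDog", RDF_TYPE, "foo:Dog"),
    ("foo:myDog", "foo:schoolAttended", "ACME High School"),
}
new_triples = infer_domain_types(graph)
print(new_triples)
```

Nothing in the rule can reject the input; the only possible outcome is an additional type statement for foo:myDog.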
>>>>
>>>>
>>>> Also, if a property had multiple classes as its range or domain, you
>>>> have to create many useless complex classes in order to avoid unintended
>>>> type membership inferences:
>>>>
>>>> In RDFS, a domain spec like so
>>>>
>>>>       foo:yearOfBirth rdfs:domain foo:Human, foo:Dog.
>>>>
>>>> in combination with the statement
>>>>
>>>>       foo:myDog a foo:Dog ;
>>>>                 foo:yearOfBirth 1971.
>>>>
>>>> implies that your dog is a dog and a human:
>>>>
>>>>       foo:myDog a foo:Human, foo:Dog.
>>>>
>>>> i.e. the intersection of being a dog and human, whatever that is.
>>>>
>>>> The only way to avoid this is to use complex class definitions, like so:
>>>>
>>>>        foo:yearOfBirth rdfs:domain [ a owl:Class;
>>>>                                        owl:unionOf (foo:Human foo:Dog) ].
>>>>
>>>> which will create many, many of those useless classes in the ontology
>>>> because of combinatorial effects.
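
The difference between two separate domain statements and a single union-class domain can be sketched in plain Python. This is a rough illustration under stated assumptions: triples are tuples, only RDFS rule rdfs2 is applied, infer_domain_types is a hypothetical helper, and foo:HumanOrDog is a hypothetical named stand-in for the anonymous owl:unionOf class, with no OWL reasoning modelled.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Apply rule rdfs2 once and return only the newly inferred triples."""
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred - set(triples)


data = {
    ("foo:myDog", RDF_TYPE, "foo:Dog"),
    ("foo:myDog", "foo:yearOfBirth", 1971),
}

# Two separate domain statements: the rule fires once per statement,
# so the subject ends up in both classes.
two_domains = data | {
    ("foo:yearOfBirth", RDFS_DOMAIN, "foo:Human"),
    ("foo:yearOfBirth", RDFS_DOMAIN, "foo:Dog"),
}

# One domain statement pointing at a single union class (hypothetical name).
union_domain = data | {
    ("foo:yearOfBirth", RDFS_DOMAIN, "foo:HumanOrDog"),
}

from_two = infer_domain_types(two_domains)
from_union = infer_domain_types(union_domain)
print(from_two)
print(from_union)
```

With two domain statements the dog is additionally typed as foo:Human; with the union class it is only typed as a member of the union, which is the behaviour the complex class definition is meant to buy, at the cost of one extra class per combination.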
>>>>
>>>> Martin
>>>>
>>>> -----------------------------------
>>>> martin hepp  http://www.heppnetz.de
>>>> mhepp@computer.org          @mfhepp
>>>>
>>>>
>>>>
>>>>
>>>>> On 21 Nov 2016, at 16:39, Alex Prut <mail@alexprut.com> wrote:
>>>>>
>>>>> Hello all,
>>>>> I'm looking at the schema.org raw ontology implementation and
>>>>> documentation, but I can’t find a reason why the ontology was
>>>>> implemented using the schema:domainIncludes and schema:rangeIncludes
>>>>> properties instead of the standard RDFS rdfs:domain and rdfs:range.
>>>>> Thanks,
>>>>> Alexandru Pruteanu (M.Sc. in Computer Science at University of Udine)
>>>>> mail@alexprut.com
>>>>>
>>>> --
>>>>
>>>>
>>>> Phil Archer
>>>> Data Strategist, W3C
>>>> http://www.w3.org/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>
Received on Saturday, 26 November 2016 22:47:17 UTC