Re: rdfs:domain and rdfs:range in schema.org from Dan Brickley on 2016-11-25 (public-schemaorg@w3.org from November 2016)

From: Dan Brickley <danbri@google.com>
Date: Fri, 25 Nov 2016 03:28:18 -0800
To: Phil Archer <phila@w3.org>, "Martin Hepp (Google Docs)" <mfhepp@gmail.com>
Cc: "schema.org Mailing List" <public-schemaorg@w3.org>
Message-ID: <CAK-qy=5svAK6c0W0Cq=YC73oQLdcPtA-Mp06QU2R0vGpGK+RXQ@mail.gmail.com>

On 24 November 2016 at 04:05, Phil Archer <phila@w3.org> wrote:
> This presents either a problem or an opportunity (and I'd like to know which
> is true).
>
> The opportunity presented by "domainIncludes" is that you can, I think, use
> a property on a class that is not listed as a domain. In something I'm doing
> right now for the European Commission, I want to use schema:openingHours on
> a schema:ContactPoint. Since the domain of schema:openingHours 'includes'
> CivicStructure and LocalBusiness, perhaps that's OK? After all, 'includes'
> suggests it's not an exhaustive list. schema:ContactPoint's suggested
> schema:hoursAvailable property leads to a more complex
> schema:OpeningHoursSpecification that is useful for declaring exceptions -
> and we want to use that too - but it seems overly complex for a simple
> "usually open Monday to Friday 9 - 5" statement.
>
> So here, domainIncludes, as explained by Dan, wins.
>
> But... Martin's example shows that's *not* how it's being used. Rather, it's
> being used as a constraint language, which I regard as a separate thing
> altogether.

Martin wrote "instead of throwing a constraint violation error."; I'd
suggest this should just be a warning that the use is potentially an
obscure, new or niche usage and that consequently it might not be
widely understood.
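
A minimal sketch of that "warn, don't reject" idea, in Python. Everything here is illustrative: the DOMAIN_INCLUDES table and the check_property helper are hypothetical names, not part of any schema.org tool, and the table holds only the two types for openingHours named in this thread. A fuller check would also accept subtypes of the listed types, which this sketch ignores.

```python
import warnings

# Hypothetical excerpt of domainIncludes data; only the two types
# mentioned in this thread are listed for openingHours.
DOMAIN_INCLUDES = {
    "openingHours": {"CivicStructure", "LocalBusiness"},
}


def check_property(item_type, prop):
    """Return True if item_type is listed in prop's domainIncludes.

    An unlisted type is reported with a warning, never an exception:
    the data is still accepted, it may just be less widely understood.
    Simplification: subtypes of listed types are not considered.
    """
    expected = DOMAIN_INCLUDES.get(prop, set())
    if item_type in expected:
        return True
    warnings.warn(
        f"{prop} is not usually expected on {item_type}; "
        "this may be a new or niche usage that is not widely understood"
    )
    return False
```

Calling check_property("ContactPoint", "openingHours") would emit a warning and return False, while processing carries on.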

> If I put a schema:openingHours property on a schema:ContactPoint, the
> structured data tester will say it doesn't understand my data.

*the* ? There are several, e.g. Gregg's, Google's
(http://developers.google.com/structured-data/testing-tool)

I would say that Google's structured data testing tool (SDTT) is
somewhat too strict in its tone, and too needy in its requirements,
for my taste.

Compare to the most recent language on validation in
http://schema.org/docs/datamodel.html under "Conformance". This
elaborates on schema.org's longstanding and pretty tolerant approach
to conformance.

> Does that
> mean my data is invalid for all potential data consumers or just the search
> engines?

No, neither.

> If the data is actually invalid then I'd say that rangeIncludes and
> domainIncludes seem to be mis-named. "domainRestrictedToOnly" seems more
> honest? Or am I missing something?

Schema.org's domainIncludes and rangeIncludes are pretty weak by
design. It might be that in many cases we could comfortably enough
assert rdfs:domain and rdfs:range too. I'm not sure that would add a
great deal of value, and when you look at the kinds of mistakes and
errors commonly made in real world data they're often invisible at
this level of data analysis anyway...
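
To make the contrast concrete, here is a rough Python sketch of what asserting rdfs:domain would commit us to, using Martin's dog example quoted below. It applies the RDFS entailment rule usually called rdfs2 (if p has rdfs:domain C and s p o holds, then s is of type C) over plain string tuples. The function name infer_domain_types and the tuple representation are assumptions made for illustration, not any particular RDF library's API.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Apply rule rdfs2 once and return only the newly inferred triples.

    rdfs2: (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
    """
    # Collect the declared domain classes for each property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    # Every subject that uses a property gains all of its domain classes.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred - set(triples)


data = [
    ("foo:schoolAttended", RDFS_DOMAIN, "foo:Human"),
    ("foo:myDog", RDF_TYPE, "foo:Dog"),
    ("foo:myDog", "foo:schoolAttended", "ACME High School"),
]
new_triples = infer_domain_types(data)
```

Nothing is rejected here: the only effect of the domain statement is that new_triples contains the claim that foo:myDog is a foo:Human. With domainIncludes there is no such rule to apply, so no extra type is inferred.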

Dan

> Phil
>
>
> == Dan's reply copied from archive for reference ==
>
> We wanted to leave the flexibility to evolve the schemas incrementally
> without breaking "promises" expressed with RDFS's range/domain, and without
> adding lots of artificial supertypes to group different types within a
> common type.
>
>
>
> == Martin's reply copied from archive for reference ==
>
> Hi Alex:
>
> This is because the semantics of RDFS domain and range constructs *imply*
> additional type membership instead of *constraining* the applicability of a
> property to a class or value.
>
> With RDFS semantics, a domain spec like so
>
>     foo:schoolAttended rdfs:domain foo:Human.
>
> in combination with the statement
>
>     foo:myDog a foo:Dog ;
>               foo:schoolAttended "ACME High School".
>
> implies that
>
>     foo:myDog a foo:Human
>
> instead of throwing a constraint violation error.
>
>
> Also, if a property had multiple classes as its range or domain, you have to
> create many useless complex classes in order to avoid unintended type
> membership inferences:
>
> In RDFS, a domain spec like so
>
>     foo:yearOfBirth rdfs:domain foo:Human, foo:Dog.
>
> in combination with the statement
>
>     foo:myDog a foo:Dog ;
>               foo:yearOfBirth 1971.
>
> implies that your dog is a dog and a human:
>
>     foo:myDog a foo:Human, foo:Dog.
>
> i.e. the intersection of being a dog and human, whatever that is.
>
> The only way to avoid this is with complex class definitions, like so:
>
>      foo:yearOfBirth rdfs:domain [ a owl:Class;
>                                      owl:unionOf (foo:Human foo:Dog) ].
>
> which will create many, many of those useless classes in the ontology
> because of combinatorial effects.
>
> Martin
>
> -----------------------------------
> martin hepp  http://www.heppnetz.de
> mhepp@computer.org          @mfhepp
>
>
>
>
>> On 21 Nov 2016, at 16:39, Alex Prut <mail@alexprut.com> wrote:
>>
>> Hello all,
>> I'm looking at the schema.org raw ontology implementation and
>> documentation, but I can’t find a reason why the ontology was implemented
>> using the schema:domainIncludes and schema:rangeIncludes properties, instead
>> of the standard RDFS rdfs:domain and rdfs:range?
>> Thanks,
>> Alexandru Pruteanu (M.Sc. in Computer Science at University of Udine)
>> mail@alexprut.com
>>
> --
>
>
> Phil Archer
> Data Strategist, W3C
> http://www.w3.org/
>
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>
Received on Friday, 25 November 2016 11:28:52 UTC