Re: Schema.org and OWL

On Thu, 14 Jun 2018 at 15:19, Anthony Moretti <anthony.moretti@gmail.com>
wrote:

> I think Martin's point about passing information from product types to
> product instances can be addressed higher in the hierarchy than Product
> actually. I sense people are opposed to shifting properties from more
> specific types to Thing though (maybe I don't understand something, can
> someone please explain that to me?) My view is that using overly specific
> domains for properties causes strange entailment, e.g. in its current form
> the "height" property entails the subject is either a MediaObject, Person,
> Product, or VisualArtwork, which doesn't seem right.
>

On this point - "e.g. in its current form the "height" property entails the
subject is either a MediaObject, Person, Product, or VisualArtwork, which
doesn't seem right." -- we don't really say that anywhere, and in fact we
created looser variants of rdfs:domain and rdfs:range (domainIncludes and
rangeIncludes) for documentation, to avoid saying more than we wanted to.
On the contrary, http://schema.org/docs/datamodel.html says:

"When we list the expected types associated with a property (or vice-versa)
we aim to indicate the main ways in which these terms will be combined in
practice. This aspect of schema.org is naturally imperfect. For example the
schemas for Volcano <http://schema.org/Volcano> suggest that since
volcanoes are places, they may have fax numbers. Similarly, we list the
unlikely (but not infeasible) possibility of a Country
<http://schema.org/Country> having "opening hours". We do not attempt to
perfect this aspect of schema.org's structure, and instead rely heavily on
an extensive collection of illustrative examples that capture common and
useful combinations of schema.org terms. The type/properties associations
of schema.org are closer to "guidelines" than to formal rules, and
improvements to the guidelines are always welcome
<https://www.w3.org/community/schemaorg/>."

In this regard, you might view this aspect of Schema.org as being closer to
the "Code is more what you'd call guidelines than actual rules
<https://www.youtube.com/watch?v=jl0hMfqNQ-g>" tradition of the Pirates of
the Caribbean than to the expectations you might bring from the OWL world,
even if we target much the same underlying data model.
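
To make the difference concrete, here is a minimal sketch in plain Python
(no RDF library; the triples, the "ex:eiffelTower" node and the function name
are made up for illustration, and only two of the listed types are shown).
It applies the standard RDFS domain rule and shows that domainIncludes
triggers nothing, whereas the same statements written with rdfs:domain would
type the subject as every listed class (in plain RDFS that is "all of", not
"either/or").

```python
# Toy triples as (subject, predicate, object) tuples; all names are illustrative.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
DOMAIN_INCLUDES = "schema:domainIncludes"


def rdfs_domain_entailments(triples):
    """Apply RDFS rule rdfs2: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    # Collect the declared rdfs:domain classes for each property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # Every use of such a property types its subject with each declared class.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# One piece of instance data: something with a height that is neither
# a Person nor a Product.
data = [("ex:eiffelTower", "schema:height", "324 m")]

# How schema.org documents the property (a hint, with no RDFS semantics).
schema_org_style = [
    ("schema:height", DOMAIN_INCLUDES, "schema:Person"),
    ("schema:height", DOMAIN_INCLUDES, "schema:Product"),
]

# Hypothetical stricter modeling of the same hint using rdfs:domain.
strict_style = [
    ("schema:height", RDFS_DOMAIN, "schema:Person"),
    ("schema:height", RDFS_DOMAIN, "schema:Product"),
]

loose = rdfs_domain_entailments(schema_org_style + data)
strict = rdfs_domain_entailments(strict_style + data)

print("domainIncludes entails:", sorted(loose))
print("rdfs:domain entails:  ", sorted(strict))
```

With domainIncludes nothing is inferred about the tower; with rdfs:domain it
would be inferred to be both a Person and a Product, which is exactly the
kind of over-commitment we wanted to avoid.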

If this seems less than helpful, I'd suggest a possible middle ground
would be to explore the RDF validation languages - SHACL and ShEx - which
offer ways of layering certain kinds of discipline over messy RDF data.
They don't address all the modeling concerns raised here, but do offer
another layer of expressivity which needn't happen in the core project.
You could look at https://www.topquadrant.com/technology/shacl/tutorial/ or
http://book.validatingrdf.com/ -- e.g. http://datashapes.org/schema
attempts to capture some of schema.org itself in SHACL, whereas
https://github.com/SEMICeu/dcat-ap_shacl/ (in SHACL) and
https://github.com/SEMICeu/dcat-ap_shacl/issues/32 (in ShEx) try to capture
community-specific patterns for describing datasets. These
languages let people say things about Schema.org data structures beyond
what the project itself chooses to say: for example, by constructing and
documenting more tidy-minded subsets/profiles, or by mixing Schema.org with
longer-tail vocabularies (like Wikidata's; see e.g. Thad and friends' mappings
<https://github.com/schemaorg/schemaorg/issues/280>) or richer domain
models, e.g. from the sciences, and explaining sensible patterns for these
combinations. You could look at what the Blue Brain project are doing
there, for example -
https://github.com/BlueBrain/nexus-kg/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+shacl
or the ShEx efforts around HL7/FHIR,
https://www.hl7.org/fhir/medication.shex.html
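
As a rough illustration of what "layering discipline" means, here is a toy
sketch in plain Python. It is not SHACL or ShEx and uses no validation
library; it only imitates the flavour of a shape with per-property
minimum-count and value checks. The shape contents, the DATASET_SHAPE and
validate names, and the sample data are all assumptions made up for this
example, not taken from any of the profiles linked above.

```python
# A toy "shape": for nodes of a given type, each listed property must appear
# at least min_count times and each value must pass a check. This imitates
# the general idea of shape-based validation; it is not SHACL or ShEx.
DATASET_SHAPE = {
    "target_type": "schema:Dataset",
    "properties": {
        "schema:name": {
            "min_count": 1,
            "check": lambda v: isinstance(v, str),
        },
        "schema:license": {
            "min_count": 1,
            "check": lambda v: isinstance(v, str) and v.startswith("http"),
        },
    },
}


def validate(node, shape):
    """Return a list of violation messages for one node.

    A node is a dict mapping a property name to a list of values.
    """
    if shape["target_type"] not in node.get("rdf:type", []):
        return []  # the shape does not target this node
    violations = []
    for prop, rule in shape["properties"].items():
        values = node.get(prop, [])
        if len(values) < rule["min_count"]:
            violations.append(
                f"{prop}: expected at least {rule['min_count']} value(s), "
                f"found {len(values)}"
            )
        for value in values:
            if not rule["check"](value):
                violations.append(f"{prop}: unexpected value {value!r}")
    return violations


# Messy but legal Schema.org-style data: a dataset with a fax number
# and no license.
messy = {
    "rdf:type": ["schema:Dataset"],
    "schema:name": ["Rainfall 2017"],
    "schema:faxNumber": ["+1 555 0100"],
}

for message in validate(messy, DATASET_SHAPE):
    print(message)
```

The shape is "open": the odd fax number is tolerated, and only the missing
license is reported. The community profile decides what it cares about,
without the core vocabulary having to change.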

That kind of perspective, I think, makes two points. One is that Schema.org's
modeling style and hierarchical structure are not the only place where
discipline can be exercised usefully; the second is that more
"knowledge graphy" use cases (beyond simple Web markup) are likely to engage
with other vocabularies and systems (e.g. scientific domain models, or
general-purpose ones like Wikidata), in which case we're unlikely to see a
unified modeling style across it all, and will likely end up focussing -
again - on documenting usefully re-usable patterns that address particular
situations.

Dan

Received on Thursday, 14 June 2018 23:09:54 UTC