Concrete mixers from Simon Spero on 2015-01-02 (public-owl-dev@w3.org from January to March 2015)

From: Simon Spero <sesuncedu@gmail.com>
Date: Fri, 2 Jan 2015 11:54:45 -0500
To: public-owl-dev@w3.org
Message-ID: <CADE8KM78gyng3O7uLqKeNSkjrMG91EYgpwvYV9T_qGk2nGmKwA@mail.gmail.com>

Sorry about the puns, but I literally just have a few questions about
what kinds of things you can do if you were to have data and object
properties with the same name.

The motivating use case is building a bridge between schema.org (sdo) and
OWL.

sdo has a data model that is roughly close to RDFS. A major difference is
that instead of having domain and range statements, where multiple
assertions for the same property give an effective domain / range that is
the intersection of the assertions, sdo uses domainIncludes and
rangeIncludes, which combine to form a union.
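
To make the intersection / union difference concrete, here is a minimal
Python sketch. It treats the statements as membership checks rather than as
inferences, which is a simplification, and the function names and class
names are illustrative assumptions rather than anything defined by RDFS or
sdo.

```python
def fits_rdfs_domains(types, domains):
    """RDFS-style reading: multiple domain statements intersect,
    so the subject has to be an instance of every stated class."""
    return all(d in types for d in domains)


def fits_sdo_domain_includes(types, domain_includes):
    """sdo-style reading: domainIncludes statements form a union,
    so the subject only has to be an instance of one stated class."""
    return any(d in types for d in domain_includes)


# Illustrative class names only.
stated = ["Person", "Organization"]
subject_types = {"Person"}

rdfs_ok = fits_rdfs_domains(subject_types, stated)
sdo_ok = fits_sdo_domain_includes(subject_types, stated)
```

With these inputs the sdo-style check accepts a subject that is only a
Person, while the RDFS-style check does not.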

These could be handled either by using an anonymous type directly or by
defining a named type.  The named type could either be defined as an
equivalent class for the union, or be an artificial supertype.

Using the equivalent+union approach should give more useful results when
the hierarchy is computed (e.g., suggesting missing abstractions).

Using anonymous unions or subclassing may match the semantics of sdo more
closely, but the data model is too loosely defined to be sure.
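
The three modelling options above can be sketched as a small Python helper
that emits OWL 2 functional-syntax axioms as strings. The helper names, the
prefixed names (:author, :Person, :Organization) and the synthetic class
name :AuthorRange are assumptions made for illustration; this is not an
established sdo-to-OWL mapping.

```python
def union_of(classes):
    """Build an ObjectUnionOf expression (OWL 2 needs at least two operands)."""
    if len(classes) < 2:
        raise ValueError("ObjectUnionOf needs at least two class expressions")
    return "ObjectUnionOf(" + " ".join(classes) + ")"


def anonymous_union(prop, classes):
    """Option 1: use the anonymous union directly as the range."""
    return [f"ObjectPropertyRange({prop} {union_of(classes)})"]


def equivalent_union(prop, classes, name):
    """Option 2: a named class defined as equivalent to the union."""
    return [
        f"EquivalentClasses({name} {union_of(classes)})",
        f"ObjectPropertyRange({prop} {name})",
    ]


def artificial_supertype(prop, classes, name):
    """Option 3: a named supertype; weaker, since it does not say that
    every instance of the supertype is in one of the listed classes."""
    axioms = [f"SubClassOf({c} {name})" for c in classes]
    axioms.append(f"ObjectPropertyRange({prop} {name})")
    return axioms


included = [":Person", ":Organization"]
option1 = anonymous_union(":author", included)
option2 = equivalent_union(":author", included, ":AuthorRange")
option3 = artificial_supertype(":author", included, ":AuthorRange")
```

Only option 2 gives a reasoner a named class whose extension is exactly the
union, which is why it can surface missing abstractions when the hierarchy
is classified.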

Most properties are either purely literal valued, or purely object valued,
but some properties have ranges that are mixed.

In some cases this is historical accident resulting from having separately
developed, conceptually independent vocabularies existing in a flat
namespace.
In other cases the literal value is unstructured information that a
sufficiently smart entity extractor could convert into an object.
In a third set of situations, the literal value serves as an identifier -
for example, using the "descriptor (preferred label)" of a "concept" in a
"controlled vocabulary", which is required to be unambiguous. [Quotes used
to indicate packed terms.]

In the first case, splitting the merged property into separate data and
object properties on the way in and merging on the way out is the obvious
approach.
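
A minimal sketch of that split-on-the-way-in, merge-on-the-way-out step,
in Python. Triples are plain tuples, and the IRI / Literal wrapper classes
and the "_data" / "_object" suffixes are hypothetical naming choices, not
part of sdo or OWL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


# Hypothetical suffixes for the two generated properties.
DATA_SUFFIX = "_data"
OBJECT_SUFFIX = "_object"


def split_in(triples, mixed):
    """Route each value of a mixed property to a data or object property."""
    out = []
    for s, p, o in triples:
        if p in mixed:
            p += DATA_SUFFIX if isinstance(o, Literal) else OBJECT_SUFFIX
        out.append((s, p, o))
    return out


def merge_out(triples, mixed):
    """Fold the generated properties back into the original mixed one."""
    out = []
    for s, p, o in triples:
        for suffix in (DATA_SUFFIX, OBJECT_SUFFIX):
            if p.endswith(suffix) and p[: -len(suffix)] in mixed:
                p = p[: -len(suffix)]
                break
        out.append((s, p, o))
    return out


source = [
    ("ex:doc1", "ex:about", Literal("concrete mixers")),
    ("ex:doc1", "ex:about", IRI("ex:ConcreteMixer")),
    ("ex:doc1", "ex:name", Literal("A document")),
]
inside = split_in(source, {"ex:about"})
round_trip = merge_out(inside, {"ex:about"})
```

The round trip is lossless here because the kind of each value (literal or
IRI) is enough to recover which generated property it was assigned to.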

In the third case, generating synthetic objects for literal values on their
way in to the A-Box seems obvious - HasKey could be avoided with canonical
mappings to IRIs. Being able to assert a property-local UNA (unique name
assumption) would be handy here.
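
One way such a canonical mapping might look, as a Python sketch. The base
IRI and the normalization policy (Unicode NFC, whitespace collapsed, case
preserved) are assumptions; a real controlled vocabulary would dictate its
own rules for when two descriptors count as the same.

```python
import unicodedata
from urllib.parse import quote

# Hypothetical namespace for the synthetic individuals.
BASE = "http://example.org/concept/"


def canonical_iri(descriptor, base=BASE):
    """Map a descriptor to one IRI, so equal descriptors share an individual."""
    normalized = unicodedata.normalize("NFC", descriptor)
    # Collapse runs of whitespace and trim the ends.
    normalized = " ".join(normalized.split())
    # Percent-encode everything outside the unreserved set.
    return base + quote(normalized, safe="")


a = canonical_iri("Concrete  mixers ")
b = canonical_iri("Concrete mixers")
```

Because the mapping is deterministic, identical descriptors land on the same
individual without a HasKey axiom. Distinct descriptors get distinct IRIs,
but that alone does not make the individuals different for a reasoner, which
is where a property-local UNA would come in.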

It is the second case that seems to present the most difficult problems.
There is only one conceptual property in play, and it would seem more
useful to have cardinality constraints apply to all of its values, literal
or object. Using autoboxing seems like the right thing, but could be
confusing to explain (unless literal values are always boxed, which is not
a performance win).

The typical use case has some form of completeness lurking in the
background (probably #$completeExtentKnown or #$completeExtentAsserted).

Comments / Suggestions?

Simon

Received on Friday, 2 January 2015 16:55:11 UTC