Re: Using boolean value vs class from Michael F Uschold on 2019-09-25 (semantic-web@w3.org from September 2019)

From: Michael F Uschold <uschold@gmail.com>
Date: Tue, 24 Sep 2019 22:11:36 -0700
To: Hugh Glaser <hugh@glasers.org>
Cc: Antoine Zimmermann <antoine.zimmermann@emse.fr>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <CADfiEMPE1ZvcaWf8JVb=W+uk6tgCbKaFVJ8AyFrw0_KtZia3TQ@mail.gmail.com>
This is an important discussion, as this modeling question arises all the
time.

I agree that Boolean data properties are not a great option. This is
explained in this blog:  Why Not to Use Boolean Datatypes in Taxonomies
<https://www.semanticarts.com/why-not-to-use-boolean-datatypes-in-taxonomies/>
by Dave McComb.

OWL inference may be a red herring here.  You may not be running OWL
inference over a large ABox of documents at all.  More likely, you will run
inference on the TBox only, then load the triples into a triple store and
use whatever reasoning that vendor provides (highly variable, and
certainly not OWL2-DL).

Creating a class called Deprecated will work, but may not be the best
solution. First, it goes against common practice for naming a class. Common
names for classes include “Person” and “Document”. An instance of the first
class is a person. An instance of the second class is a document. However
an instance of your proposed class is not a ‘deprecated’. Rather it is a
deprecated thing.  If you named the class DeprecatedThing, the naming
convention would be respected. However, that’s not a very satisfying class.
The reason points to a more fundamental issue, aside from naming.

The main purpose of an OWL class is to represent a set of things that are
all the same kind (person, document).  Nobody thinks of being deprecated as
signifying a different kind of thing.  It’s more analogous to a tag for a
photo.  If you tag a photo with “winter”, this gives rise to a set of
things tagged with “winter”.  One could represent that set as an OWL
class, just like one could represent the class of all deprecated things
with the class DeprecatedThing.  But these sets do not represent a kind of
thing one would want to represent as an OWL class.  Rather, being
deprecated or not is a characteristic or facet of a thing. Documents,
products and lots of other things can have many facets.

There is a third alternative that we use in our enterprise ontologies.  I
would create a class called, say, DeprecationIndicator with two instances:
isDeprecated and isNotDeprecated.   These are really categories:  something
is deprecated or not.  There are typically many such facets and each has a
set of values.   There might be a facet called Color for cars or iPhones.  An
individual car or phone would have a color and there could be several
instances of the class Color (rose, midnight green, etc).  An advantage of
this approach is that you avoid unnecessary proliferation of properties,
one for each facet. You do not need two properties, hasColor and
hasDeprecationIndicator. Rather you can just use a single property, say
isCategorizedBy.  This is further explained in this blog: Buckets, Buckets
Everywhere, Who Knows What to Think?
<https://www.semanticarts.com/gist-buckets-buckets-everywhere-who-knows-what-to-think/>
by yours truly.
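
A minimal Turtle sketch of this third alternative follows. It is an
illustration under stated assumptions, not a definitive model: the ":"
namespace (http://example.org/ns#), the class :Phone, and the individuals
:phone1, :rose and :midnightGreen are hypothetical. Only
DeprecationIndicator, isDeprecated, isNotDeprecated, Color and
isCategorizedBy come from the text above, and :doc1 is borrowed from the
original question quoted below.

```turtle
@prefix :     <http://example.org/ns#> .   # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# One class per facet; its instances are the allowed values of that facet.
# Note: here :isDeprecated is an individual, not the boolean datatype
# property of the same name used in the original question.
:DeprecationIndicator a owl:Class .
:isDeprecated    a :DeprecationIndicator .
:isNotDeprecated a :DeprecationIndicator .

# A second facet, modeled the same way (values are illustrative).
:Color a owl:Class .
:rose          a :Color .
:midnightGreen a :Color .

# A single object property links a thing to any of its category values,
# whatever facet the value belongs to.
:isCategorizedBy a owl:ObjectProperty .

# Hypothetical class for the second example individual.
:Phone a owl:Class .

:doc1 a foaf:Document ;
    :isCategorizedBy :isDeprecated .

:phone1 a :Phone ;
    :isCategorizedBy :midnightGreen .
```

With this shape, finding the deprecated things means matching the pattern
"?x :isCategorizedBy :isDeprecated", and adding a new facet requires a new
class and its instances but no new property.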



On Tue, Sep 24, 2019 at 7:03 AM Hugh Glaser <hugh@glasers.org> wrote:

> Very interesting question, thanks - it helps me explore my understanding.
> Sorry - as I have said, I'm not really very good on this stuff, but I do
> like to try to understand.
>
> Antoine, some of what you say puzzles me.
> Looking at class :Deprecated
> > The second model with a class :Deprecated ensures that an entity is
> either of type :Deprecated, or not.
> Is it not more properly the case that an Entity is either of type
> :Deprecated or we don't know? (Open world)
>
> So the boolean version seems to perhaps give me a richer way of recording
> knowledge.
>
> To model the boolean equivalent, you could also have a :notDeprecated
> class.
> And then you would have the same four categories for the class version as
> you have for the boolean version.
> (Not saying this is good!)
>
> [Hang on - I have just realised that Mikael makes no suggestion that he
> will ever assert "false" - so your introducing the "false" categories (3 &
> 4) is like me introducing the :notDeprecated class.]
>
> Although I worry about your argument here, I think that the general
> principle may well be very good.
> If you see booleans, especially where they always seem to be "true", it is
> a flag that maybe a class should be used.
> (This is very similar to seeing "= true" in an expression in a programming
> language: someone isn't thinking right :-) )
>
> I usually view an rdf:type triple as nothing special compared with any
> other.
> You assert them and match them just the same.
> It just so happens that "we" have chosen that we can do sub-classing, and
> so if we do that, we get some special magic that can happen, which doesn't
> happen with everything else.
> And that is sometimes very useful, although it can make things quite hard
> to get the hang of.
>
> Then, as you say, there are a whole bunch of practical questions about
> efficiency of stores and reasoners when you do things in different ways.
> But, as with programming, most efficiency things should be left to the
> system implementation, and the source should be modelled in the most
> understandable and maintainable way.
>
> Best
> Hugh
>
> > On 24 Sep 2019, at 13:48, Antoine Zimmermann <antoine.zimmermann@emse.fr>
> wrote:
> >
> > Mikael,
> >
> >
> > These two options definitely affect reasoning.
> >
> > If you have a property :isDeprecated, then any entity can fall into 4
> disjoint categories:
> >
> > 1. The entities that have no value for :isDeprecated.
> > 2. The entities that have value "true" only.
> > 3. The entities that have value "false" only.
> > 4. The entities that have both values "true" and "false".
> >
> > Moreover, if the range of the property is unrestricted, it can have all
> sorts of literal values, in any combination.
> >
> > If you want to make sure that all entities have exactly one of "true" or
> "false" as value for :isDeprecated, you need to introduce a cardinality
> axiom, which increases the complexity of reasoning (and you need to find a
> reasoner that supports cardinality restrictions on datatype properties).
> >
> > The second model with a class :Deprecated ensures that an entity is
> either of type :Deprecated, or not. This comes for free with any reasoner
> that supports a logic as simple as RDFS, without extra axioms. Many more
> reasoners support axioms made on classes than axioms made on literals and
> datatype properties. It's easier to define subclasses of deprecated
> documents, for instance.
> >
> > In general, when I review an ontology document, I mark all use of
> boolean properties as a mistake. Usually, boolean properties come from
> adopting a programming approach to ontology engineering rather than a
> knowledge representation approach (that is, it uses the ontology as a data
> structure for computation rather than as an information model about the
> world, for knowledge interchange).
> >
> > However, when you have to go back and forth between an existing data
> model such as tabular data etc. and RDF, it can be convenient to translate
> booleans to booleans, so there can be exceptions to my rule of thumb of
> excluding all boolean properties.
> >
> >
> > Best,
> > --AZ
> >
> > Le 24/09/2019 à 13:57, Mikael Pesonen a écrit :
> >> Hi,
> >> let's say we have documents and we want to say whether they are valid
> or deprecated. There are two ways to do this:
> >> :doc1 a foaf:Document ;
> >>     :isDeprecated "true"^^xsd:boolean .
> >> or
> >> :doc1 a foaf:Document ;
> >>     a :Deprecated .
> >> Are there some different implications on the use? Does it affect OWL
> reasoning, for example?
> >> Mikael
> >
> > --
> > Antoine Zimmermann
> > Institut Henri Fayol
> > École des Mines de Saint-Étienne
> > 158 cours Fauriel
> > CS 62362
> > 42023 Saint-Étienne Cedex 2
> > France
> > Tél:+33(0)4 77 42 66 03
> > Fax:+33(0)4 77 42 66 66
> > http://www.emse.fr/~zimmermann/
> > Member of team Connected Intelligence, Laboratoire Hubert Curien
> >
>
> --
> Hugh
> 023 8061 5652
>
>
>

-- 

Michael Uschold
   Senior Ontology Consultant, Semantic Arts
   http://www.semanticarts.com
   LinkedIn: www.linkedin.com/in/michaeluschold
   Skype, Twitter: UscholdM
Received on Wednesday, 25 September 2019 05:12:37 UTC