Re: Using boolean value vs class from Patrick J Hayes on 2019-09-25 (semantic-web@w3.org from September 2019)

From: Patrick J Hayes <phayes@ihmc.us>
Date: Wed, 25 Sep 2019 01:23:37 -0500
To: Michael F Uschold <uschold@gmail.com>
CC: Hugh Glaser <hugh@glasers.org>, Antoine Zimmermann <antoine.zimmermann@emse.fr>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <78A48345-A00B-4E43-A839-363D3CE6EA8A@ihmc.us>
Michael, I have some questions/comments, in-line below.

> On Sep 25, 2019, at 12:11 AM, Michael F Uschold <uschold@gmail.com> wrote:
> 
> This is an important discussion, as this modeling question arises all the time.
> 
> I agree that Boolean data properties are not a great option. This is explained in this blog:  Why Not to Use Boolean Datatypes in Taxonomies <https://www.semanticarts.com/why-not-to-use-boolean-datatypes-in-taxonomies/> by Dave McComb.
> 
> OWL inference may be a red herring here.  You may not be running OWL inference over a large ABox of documents?  More likely, you are just going to run inference on the TBox and then load triples into a triple store and use whatever reasoning is provided by that vendor (highly variable, and certainly not OWL2-DL). 
> 
> Creating a class called Deprecated will work, but may not be the best solution. First, it goes against common practice for naming a class. Common names for classes include “Person” and “Document”. An instance of the first class is a person. An instance of the second class is a document. However, an instance of your proposed class is not a ‘deprecated’. Rather, it is a deprecated thing. If you named the class DeprecatedThing, the naming convention would be respected. However, that’s not a very satisfying class. The reason points to a more fundamental issue, aside from naming.
> 
> The main purpose of an OWL class is to represent a set of things that are all the same kind (person, document). 
> 
Really? Where do you get this from? I don’t see anything in any OWL spec which implies this. OWL classes are just sets of things, and you can make a set for any purpose. 

> Nobody thinks of being deprecated as signifying a different kind of thing. It’s more analogous to a tag for a photo. If you tag a photo with “winter”, this gives rise to a set of things tagged with “winter”. One could represent that set as an OWL class, just like one could represent the class of all deprecated things with the class DeprecatedThing. But these sets do not represent a kind of thing one would want to represent as an OWL class. Rather, being deprecated or not is a characteristic or facet of a thing.
> 
What is the difference between a characteristic of a thing and a class (of things with that characteristic) that the thing might be in? Can you characterize that distinction?

> Documents and products and a lot of things can have many facets.
> 
And they can be in, or not in, many classes. 

Seems to me that you are imposing an alien, and unnecessary, intuition onto OWL which is not there in either the design philosophy or the specifications of the actual language. 
> There is a third alternative that we use in our enterprise ontologies.  I would create a class called say DeprecationIndicator with two instances: isDeprecated and isNotDeprecated.   These are really categories:  something is deprecated or not. 
> 
But by encoding what are really Booleans as values you are introducing the problems that Antoine describes. What is gained by this rather than simply having the classes? As you say, they are really categories, and surely a category is best thought of as a class.
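
A minimal Turtle sketch of the two encodings being compared here, for illustration only. The `:` prefix bound to http://example.org/ and the individuals :doc1 and :doc2 are assumptions; isCategorizedBy is the property name Michael proposes further down.

```turtle
@prefix :     <http://example.org/> .      # hypothetical namespace
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Indicator encoding: the Boolean is reified as two individuals.
:isDeprecated    a :DeprecationIndicator .
:isNotDeprecated a :DeprecationIndicator .

:doc1 a foaf:Document ;
    :isCategorizedBy :isDeprecated .

# Class encoding: membership in the class carries the same information.
:doc2 a foaf:Document , :Deprecated .
```

Without further axioms, nothing in the indicator encoding stops a document from being categorized by both indicators, or by neither, which is the same situation as Antoine's four cases for the Boolean property.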

> There are typically many such facets
> 
This whole language of facets seems like simply a re-statement of the OWL description-logic intuition in a different metalanguage. What is a ‘facet’ other than a property?

> and each has a set of values.   There might be a facet called Color for cars or iPhones.  An individual car or phone would have a color and there could be several instances of the class Color (rose, midnight green, etc).  
> 
Cars and iPhones are in the domain of the hasColor property, and its values are things like rose, etc., in the class of Colors. Pure OWL; but I would use this basic structure even in a language as expressive as Common Logic. 

> An advantage of this approach is that you avoid unnecessary proliferation of properties, one for each facet. You do not need two properties, one for hasColor and one for hasDeprecationIndicator. Rather you can just use a single property, say isCategorizedBy. 
> 
That seems to me to be a problem rather than an advantage. Why does a color ‘categorize’ something? That violates my intuition rather sharply. But in any case, your isCategorizedBy is just a very high superproperty of all the more precise ‘characterizing’ properties, which could be defined as restrictions to particular range classes if one really wanted to do things in such an opaque way, so (A hasColor rose) just means (A isCategorizedBy rose) & (Color rose). (In CL you could use the same name for the class and the property, just to keep things notationally simpler. I believe one can do the same kind of punning in OWL 2.)
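
A rough Turtle sketch of the superproperty arrangement described above, assuming a hypothetical `:` namespace and an illustrative individual :car1. It states only one direction of the "just means" reading: every hasColor assertion entails an isCategorizedBy assertion whose value is a Color. The converse (anything categorized by a Color thereby has that color) is not stated by these axioms.

```turtle
@prefix :     <http://example.org/> .      # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Color a owl:Class .
:rose  a :Color .

:isCategorizedBy a owl:ObjectProperty .

# hasColor is a more precise property sitting under isCategorizedBy,
# with its values restricted to the class Color.
:hasColor a owl:ObjectProperty ;
    rdfs:subPropertyOf :isCategorizedBy ;
    rdfs:range :Color .

:car1 :hasColor :rose .
# An RDFS reasoner can then derive:
#   :car1 :isCategorizedBy :rose .
```

With this in place, queries against the general property still find the color, while the precise property remains available to anyone who wants it.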

> This is further explained in this blog: Buckets, Buckets Everywhere, Who Knows What to Think? <https://www.semanticarts.com/gist-buckets-buckets-everywhere-who-knows-what-to-think/> by yours truly.
> 
Well, that says that you like to make these distinctions, but it doesn’t explain why, or what advantages might accrue from adopting this unintuitive discipline. 

Best wishes

Pat
> 
> 
> 
> On Tue, Sep 24, 2019 at 7:03 AM Hugh Glaser <hugh@glasers.org> wrote:
> Very interesting question, thanks - it helps me explore my understanding.
> Sorry - as I have said, I'm not really very good on this stuff, but I do like to try to understand.
> 
> Antoine, some of what you say puzzles me.
> Looking at class :Deprecated
> > The second model with a class :Deprecated ensures that an entity is either of type :Deprecated, or not. 
> Is it not more properly the case that an Entity is either of type :Deprecated or we don't know? (Open world)
> 
> So the boolean version seems to perhaps give me a richer way of recording knowledge.
> 
> To model the boolean equivalent, you could also have a :notDeprecated class.
> And then you would have the same four categories for the class version as you have for the boolean version.
> (Not saying this is good!)
> 
> [Hang on - I have just realised that Mikael makes no suggestion that he will ever assert "false" - so your introducing the "false" categories (3 & 4) is like me introducing the :notDeprecated class.]
> 
> Although I worry about your argument here, I think that the general principle may well be very good.
> If you see booleans, especially where they always seem to be "true", it is a flag that maybe a class should be used.
> (This is very similar to seeing "= true" in an expression in a programming language, someone isn't thinking right :-) )
> 
> I usually view an rdf:type triple as nothing special compared with any other.
> You assert them and match them just the same.
> It just so happens that "we" have chosen that we can do sub-classing, and so if we do that, we get some special magic that can happen, which doesn't happen with everything else.
> And that is sometimes very useful, although it can make things quite hard to get the hang of.
> 
> Then, as you say, there are a whole bunch of practical questions about efficiency of stores and reasoners when you do things in different ways.
> But, as with programming, most efficiency things should be left to the system implementation, and the source should be modelled in the most understandable and maintainable way.
> 
> Best
> Hugh
> 
> > On 24 Sep 2019, at 13:48, Antoine Zimmermann <antoine.zimmermann@emse.fr> wrote:
> > 
> > Mikael,
> > 
> > 
> > These two options definitely affect reasoning.
> > 
> > If you have a property :isDeprecated, then any entity can fall into 4 disjoint categories:
> > 
> > 1. The entities that have no value for :isDeprecated.
> > 2. The entities that have value "true" only.
> > 3. The entities that have value "false" only.
> > 4. The entities that have both values "true" and "false".
> > 
> > Moreover, if the range of the property is unrestricted, it can have all sorts of literal values, in any combination.
> > 
> > If you want to make sure that all entities have exactly one of "true" or "false" as value for :isDeprecated, you need to introduce a cardinality axiom, which increases the complexity of reasoning (and you need to find a reasoner that supports cardinality restrictions on datatype properties).
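> > 
> > A minimal sketch of such an axiom in Turtle, for illustration. The `:` namespace is hypothetical, and scoping the restriction to foaf:Document and fixing the range to xsd:boolean are assumptions made for the example:
> > 
> > ```turtle
> > @prefix :     <http://example.org/> .      # hypothetical namespace
> > @prefix owl:  <http://www.w3.org/2002/07/owl#> .
> > @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> > @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
> > @prefix foaf: <http://xmlns.com/foaf/0.1/> .
> > 
> > # Restrict the values to booleans.
> > :isDeprecated a owl:DatatypeProperty ;
> >     rdfs:range xsd:boolean .
> > 
> > # Every document has exactly one value for :isDeprecated.
> > foaf:Document rdfs:subClassOf [
> >     a owl:Restriction ;
> >     owl:onProperty :isDeprecated ;
> >     owl:cardinality "1"^^xsd:nonNegativeInteger
> > ] .
> > ```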
> > 
> > The second model with a class :Deprecated ensures that an entity is either of type :Deprecated, or not. This comes for free with any reasoner that supports a logic as simple as RDFS, without extra axioms. Many more reasoners support axioms made on classes than axioms made on literals and datatype properties. It's easier to define subclasses of deprecated documents, for instance.
> > 
> > In general, when I review an ontology document, I mark all use of boolean properties as a mistake. Usually, boolean properties come from adopting a programming approach to ontology engineering rather than a knowledge representation approach (that is, it uses the ontology as a data structure for computation rather than as an information model about the world, for knowledge interchange).
> > 
> > However, when you have to go back and forth between an existing data model such as tabular data etc. and RDF, it can be convenient to translate booleans to booleans, so there can be exceptions to my rule of thumb of excluding all boolean properties.
> > 
> > 
> > Best,
> > --AZ
> > 
> > Le 24/09/2019 à 13:57, Mikael Pesonen a écrit :
> >> Hi,
> >> let's say we have documents and we want to say whether they are valid or deprecated. There are two ways to do this:
> >> :doc1 a foaf:Document ;
> >>     :isDeprecated "true"^^xsd:boolean .
> >> or
> >> :doc1 a foaf:Document ;
> >>     a :Deprecated .
> >> Are there some different implications on the use? Does it affect OWL reasoning, for example?
> >> Mikael
> > 
> > -- 
> > Antoine Zimmermann
> > Institut Henri Fayol
> > École des Mines de Saint-Étienne
> > 158 cours Fauriel
> > CS 62362
> > 42023 Saint-Étienne Cedex 2
> > France
> > Tél:+33(0)4 77 42 66 03
> > Fax:+33(0)4 77 42 66 66
> > http://www.emse.fr/~zimmermann/
> > Member of team Connected Intelligence, Laboratoire Hubert Curien
> > 
> 
> -- 
> Hugh
> 023 8061 5652
> 
> 
> 
> 
> -- 
> Michael Uschold
>    Senior Ontology Consultant, Semantic Arts
>    http://www.semanticarts.com
>    LinkedIn: www.linkedin.com/in/michaeluschold
>    Skype, Twitter: UscholdM
> 
> 
>
Received on Wednesday, 25 September 2019 06:24:06 UTC