Re: Reasoning with ontologies and knowledge graphs?

I want to pick up on the example that Margaret gave, below, to make some points about inference. Even the simplest inferences (like following subclass chains) are put at risk by the poor quality of the available data, especially at the more abstract layers that one would hope might be deserving of the title 'ontology'.

On Dec 12, 2021, at 2:46 PM, Margaret Warren <mm@zeroexp.com> wrote:

...Our search paths function is also quite revealing about oddities that come up in places like the subclasses used in Wikidata for example - when you can do things like get a result of an image of a Bay in New Zealand for a search for a term like: 'communication medium'

The hops returned are as follows:

http://dbpedia.org/resource/New_Zealand  (sameAs)
Wikidata: New Zealand http://www.wikidata.org/entity/Q664 a Commonwealth realm http://www.wikidata.org/entity/Q202686
subclass of kingdom http://www.wikidata.org/entity/Q417175 subclass of monarchy http://www.wikidata.org/entity/Q7269
subclass of monarchic system http://www.wikidata.org/entity/Q22676587 subclass of form of government http://www.wikidata.org/entity/Q1307214
subclass of administrative type http://www.wikidata.org/entity/Q2752458 subclass of classification system http://www.wikidata.org/entity/Q5962346
subclass of knowledge organization system http://www.wikidata.org/entity/Q6423319 subclass of communication medium http://www.wikidata.org/entity/Q340169

So New Zealand is a communication medium. Hmmm.

Leaving aside some of the factually doubtful claims here (such as Commonwealth Realms being a subclass of Monarchies), the main problem seems to be that the meaning of 'monarchy' shifts from being a class of countries to being a type of government system. New Zealand is at least the right kind of thing to be an instance of the first sense of 'monarchy', but only the second sense can be asserted to be a monarchic system, and then surely it is an /instance/ of such a system, not a subclass, so it is again an instance, not a subclass, of a 'form of government'.
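
To make the mechanics concrete, here is a minimal Python sketch of the inference that produces this result: one instance-of assertion followed by the usual rule that x a C and C subclass-of D entails x a D. The edges are copied from the hops quoted above; the dictionary layout, the function name and the assumption of a single superclass per class are illustrative simplifications, not how Wikidata or any particular reasoner stores things.

```python
# Edges copied from the hops quoted in the message above.
# Assumption: one superclass per class, which is enough to replay this chain.
INSTANCE_OF = {"New Zealand (Q664)": "Commonwealth realm (Q202686)"}

SUBCLASS_OF = {
    "Commonwealth realm (Q202686)": "kingdom (Q417175)",
    "kingdom (Q417175)": "monarchy (Q7269)",
    "monarchy (Q7269)": "monarchic system (Q22676587)",
    "monarchic system (Q22676587)": "form of government (Q1307214)",
    "form of government (Q1307214)": "administrative type (Q2752458)",
    "administrative type (Q2752458)": "classification system (Q5962346)",
    "classification system (Q5962346)": "knowledge organization system (Q6423319)",
    "knowledge organization system (Q6423319)": "communication medium (Q340169)",
}


def inferred_types(entity):
    """Return every class the entity falls under by walking up the subclass chain."""
    types = []
    seen = set()  # guards against cycles in the subclass data
    cls = INSTANCE_OF.get(entity)
    while cls is not None and cls not in seen:
        seen.add(cls)
        types.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return types


if __name__ == "__main__":
    for cls in inferred_types("New Zealand (Q664)"):
        print("New Zealand is a", cls)
```

Nothing in the rule itself is wrong; each step is valid given the asserted edges, which is why the bad conclusion can only be blamed on the data.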

This could have been fixed by distinguishing between the actual country and its form of government, e.g. by saying that New Zealand hasGovernmentalSystem Monarchy, thus breaking the subclass chain (and allowing the first 'class' sense to be defined as an OWL restriction on the value of the property, in a decent ontology). But going up from there feels like getting lost in a conceptual fog. Perhaps forms of government are administrative types, although I would have no real sense of why; but surely a form of government is not a knowledge organization system? (What does this even mean? What would make anyone feel that this assertion was required or useful, let alone true? Some very abstract theory of types of 'system', perhaps?) And then the final piece of insanity has 'communication medium' as the most overarching concept in this hypothetical theory of government semiotics. Really? Even if classifications and knowledge organizations are both forms of communication (a doubtful claim), surely they are not communication /media/.
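
A rough Python sketch of that repair, under stated assumptions: the names Country, FormOfGovernment, AdministrativeType and is_monarchy_country are hypothetical, and the membership test only imitates what an OWL hasValue restriction would express. The country is linked to its form of government by a property, and Monarchy is an instance of 'form of government', so the type inference for New Zealand stops at Country.

```python
# Hypothetical remodelling: the property link replaces the subclass chain.
TRIPLES = {
    ("NewZealand", "type", "Country"),
    ("NewZealand", "hasGovernmentalSystem", "Monarchy"),
    ("Monarchy", "type", "FormOfGovernment"),  # an instance, not a subclass
}

# Kept only to show that higher-level links no longer reach the country.
SUBCLASS_OF = {"FormOfGovernment": "AdministrativeType"}


def types_of(entity):
    """Asserted types plus everything reachable from them via subclass links."""
    found = set()
    pending = [o for (s, p, o) in TRIPLES if s == entity and p == "type"]
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            if cls in SUBCLASS_OF:
                pending.append(SUBCLASS_OF[cls])
    return found


def is_monarchy_country(entity):
    """Imitates the class 'things whose hasGovernmentalSystem value is Monarchy'."""
    return (entity, "hasGovernmentalSystem", "Monarchy") in TRIPLES


if __name__ == "__main__":
    print(sorted(types_of("NewZealand")))
    print(sorted(types_of("Monarchy")))
    print(is_monarchy_country("NewZealand"))
```

Whatever one then asserts above 'form of government' applies to Monarchy and not to New Zealand, because no subclass path connects the country to it.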

All this stuff at a higher level than 'form of government' is largely meaningless, not the slightest use to anyone, and potentially dangerous. And even just two levels above something as concrete as New Zealand, we have isa/subclass confusions, which seem to be ubiquitous.

Wikidata is one of the best curated large-scale linked-data corpora, yet it contains stuff like this, so that even inferences as simple and basic as running up subclass chains are liable to result in nonsense. Maybe we would be better off NOT doing too much inference.

Margaret was gracious enough to add that Wikidata is not always as bad as this and often gives great results. And yes, OK, but how do our inference engines avoid the bad stuff and only use the good?

Pat Hayes

Received on Tuesday, 14 December 2021 02:23:47 UTC