NOT(:A owl:sameAs :A)

      I've been doing a reconciliation project between DBpedia,  Freebase,
Open Street Maps and a few other sources and that's gotten me thinking about
the practical and philosophic implications of terms.  In particular,  I've
been concerned with the construction of 'Vernacular' meanings that may not
be perfectly precise but would seem normal to people.

      One thing I've realized is that no term is atomic,  and a corollary of
that is that A != A in commonsense reasoning.

      Open Mind Common Sense,  for instance, contains a number of English
sentences that make assertions about the term "Oxygen",  such as

"The earth's atmosphere contains Oxygen" (obviously true)
"The ocean contains Oxygen" (true in two different senses:   there is
dissolved diatomic oxygen in the ocean that fish can breathe,  and most of
the mass of the ocean is oxygen atoms that are part of water molecules.
However,  we can't breathe the oxygen in the ocean,  so I couldn't berate
somebody for answering 'no' to this question.)

       I felt uncomfortable confirming the truth of the above statement,
but it gets worse:

"The earth's crust contains Oxygen" (certainly true in the sense of the
atom,  much of the mass of rock is oxides)
"The moon contains Oxygen" (same,  but 'everybody knows' the moon is a place
that's inhospitable for life because there's no 'Oxygen' there)

      Now the reason I'm uncomfortable with all this is that the term
'Oxygen' isn't atomic;  if we split it into 'Diatomic Oxygen',  'The Element
Oxygen',  and finer terms,  we can actually make assertions that aren't so
problematic.
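
      A minimal sketch of that split in Python,  assuming two hypothetical
sense identifiers ('diatomic_oxygen' and 'element_oxygen') and a hand-made
table of truth values.  The entries follow the readings given above;  the
names and the table layout are illustrative,  not taken from any of the
datasets mentioned.

```python
# Hypothetical sense identifiers for the vernacular term "Oxygen".
SENSES = {"Oxygen": ["diatomic_oxygen", "element_oxygen"]}

# Illustrative truth values, following the readings discussed in the text.
CONTAINS = {
    ("atmosphere", "diatomic_oxygen"): True,
    ("atmosphere", "element_oxygen"): True,
    ("ocean", "diatomic_oxygen"): True,    # dissolved O2 that fish breathe
    ("ocean", "element_oxygen"): True,     # oxygen atoms bound in water
    ("moon", "diatomic_oxygen"): False,    # the 'everybody knows' reading
    ("moon", "element_oxygen"): True,      # oxides in the rock
}

def truth_values(place, term):
    """Collect the truth value of 'place contains term' under each sense."""
    senses = SENSES.get(term, [term])
    return {CONTAINS[(place, sense)] for sense in senses}

def is_ambiguous(place, term):
    """An assertion is ambiguous when its senses disagree."""
    return len(truth_values(place, term)) > 1
```

      With this table,  'The moon contains Oxygen' comes out ambiguous,
while each of the two finer assertions has a single answer.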

      If we go to work seriously splitting up 'Oxygen',  there are the
senses of

'Oxygen as something essential to life' (How do you understand the Sweet
song "Love is Like Oxygen"?)
'Oxygen as a medical treatment' (It's got some unique identifier that's
provided to your health insurer to identify it as such)
A number of allotropes of Oxygen (I'd think that any worldly person ought to
know about O2 and O3,  but O2 actually has 'singlet' and 'triplet' forms,
and there are a few bizarre allotropes that have been made in the
laboratory.)
A number of isotopes of Oxygen

     If we stopped somewhere near there,  we'd have covered most of it,  but
scientists can split terms further.  For instance,  you could have O2 in the
triplet state with one 12O atom and one 23O atom.  I suppose physicists
could ask 'what if' the binding energies and masses of quarks were a bit
different and how that would affect stellar nucleosynthesis,  so somebody
could make assertions about the nuclear properties of oxygen isotopes.

    Note that the specification of terms like that is a process of division
(creating a finer vocabulary) and then composition (using combinations of
terms to specify new terms.)
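
    One way to picture division followed by composition is a term built
from optional facets,  where leaving a facet unset means 'not specified'.
The sketch below is only an illustration:  the class name `Term`,  its
fields and the `refines` method are all hypothetical,  and the mass numbers
are the ones used in the example above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Term:
    """A term composed from finer facets; an unset facet is unspecified."""
    allotrope: Optional[str] = None      # e.g. "O2", "O3"
    spin_state: Optional[str] = None     # e.g. "singlet", "triplet"
    isotopes: Tuple[int, ...] = ()       # mass numbers of the atoms

    def refines(self, other):
        """True if this term agrees with every facet that `other` specifies."""
        return (
            (other.allotrope is None or other.allotrope == self.allotrope)
            and (other.spin_state is None or other.spin_state == self.spin_state)
            and (not other.isotopes or other.isotopes == self.isotopes)
        )

element = Term()                                   # 'The Element Oxygen'
o2 = Term(allotrope="O2")                          # 'Diatomic Oxygen'
triplet_o2 = Term(allotrope="O2", spin_state="triplet")
specific = Term(allotrope="O2", spin_state="triplet", isotopes=(12, 23))
```

    Here `specific.refines(o2)` holds but `o2.refines(specific)` does not,
so the finer term can stand in for the coarser one and not the other way
around.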

    Thinking this way,  it's clear that owl:sameAs is something that ought
to be taken with a grain of salt.  In fact,  it's the least of the problems
that we face trying to 'make sense' of terms.

--------

     Getting practical,  I'm working on a vernacular vocabulary for human
settlements.  You might think this is something you could get in a can,  but
you can't quite.  The real requirement here is that end users see terms that
match the vernacular language they use.  For instance,  you can't tell
people that 'Wisconsin' or 'Kanagawa' are '2nd level administrative
subdivisions',  but rather,  that 'Wisconsin' is a 'State' and 'Kanagawa' is
a 'Prefecture'.

     Just about anybody will tell you that

"London is a city" and "Tokyo is a city"

     but neither of those is legally true;  both of those are administrative
divisions larger than a city.  However,  if you make a list of "major global
cities" people are going to think you're nuts if you don't put them on the
list.  Lately I've been drawn to the Germanic concepts of
'Stadt' (avoids the need to make the arbitrary division between 'City' and
'Town') and 'Dorf';  I live in something that definitely isn't a 'Stadt',
maybe not even an Anglo-Saxon 'Town' by technical terms.  (It feels more
like an 'administrative division' that happens to maybe have 3-6 'Dorfs' in
it and a lot of public land,  scattered houses and farms.)  However,  I
write a tax check every year to the 'Town of Caroline' so I've got to say
it's a 'Town'.

     I'm finding a need to create a specific 'Vernacular' vocabulary layer
so my systems make sense to people:  so they can understand what they see on
the screen and so full-text search works right.  The main
operational definition is "What do people commonly call it?"  If I can't
find a specific term,  I fall back on defaults.
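
     The lookup-with-fallback rule can be sketched in a few lines of
Python.  The place names and their common types come from the examples
above;  the technical class names and the default labels ('Region',
'Settlement',  'Place') are assumptions made up for illustration.

```python
# What people commonly call each place (examples from the text).
COMMON_TYPE = {
    "Wisconsin": "State",
    "Kanagawa": "Prefecture",
    "London": "City",
    "Tokyo": "City",
    "Town of Caroline": "Town",
}

# Assumed fallback labels, keyed by a hypothetical technical class.
DEFAULT_TYPE = {
    "second_level_admin_division": "Region",
    "populated_place": "Settlement",
}

def vernacular_type(name, technical_class):
    """Prefer the common name for the type; otherwise fall back on a default."""
    if name in COMMON_TYPE:
        return COMMON_TYPE[name]
    return DEFAULT_TYPE.get(technical_class, "Place")
```

     A place with no specific entry gets the default for its technical
class,  and anything unclassified ends up as the generic 'Place'.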

     Systems like this suck at reasoning,  however,  because the terms are
all imprecise,  so for a lot of applications,  vernacular layers are going
to need to coexist with more specific terminology layers that have the right
technical properties for solving certain kinds of problems.

Received on Thursday, 13 May 2010 14:07:53 UTC