- From: Paul Houle <ontology2@gmail.com>
- Date: Thu, 13 May 2010 10:07:02 -0400
- To: Linked Data community <public-lod@w3.org>
- Message-ID: <AANLkTin3x3ATsQABUPpyuVjpPBVbWMLcsr-NOx1DIjPm@mail.gmail.com>
I've been doing a reconciliation project between DBpedia, Freebase, OpenStreetMap and a few other sources, and that's gotten me thinking about the practical and philosophic implications of terms. In particular, I've been concerned with the construction of 'Vernacular' meanings that may not be perfectly precise but would seem normal to people.

One thing I've realized is that no term is atomic, and a corollary of that is that A != A in commonsense reasoning. Open Mind Common Sense, for instance, contains a number of English sentences that make assertions about the term "Oxygen", such as

"The earth's atmosphere contains Oxygen"

(obviously true)

"The ocean contains Oxygen"

(true in two different senses: there is dissolved diatomic oxygen in the ocean that fish can breathe; however, most of the mass of the ocean is oxygen atoms that are part of water molecules. However, we can't breathe the oxygen in the ocean, so I couldn't berate somebody for answering 'no' to this question.)

I felt uncomfortable confirming the truth of the above assertion, but it gets worse:

"The earth's crust contains Oxygen"

(certainly true in the sense of the atom; much of the mass of rock is oxides)

"The moon contains Oxygen"

(same, but 'everybody knows' the moon is a place that's inhospitable to life because there's no 'Oxygen' there)

Now the reason I'm uncomfortable with all this is that the term 'Oxygen' isn't atomic; if we split it into 'Diatomic Oxygen', 'The Element Oxygen', and finer terms, we can actually make assertions that aren't so problematic.
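To make the point about splitting concrete, here is a minimal sketch in Python. The sense identifiers, the table and the `contains` helper are all hypothetical names made up for illustration (they come from no real vocabulary), and the truth values are just the informal judgments from the sentences above. Asking about bare 'Oxygen' gives an answer only when every recorded sense agrees.

```python
# Hypothetical sense identifiers for the overloaded term 'Oxygen'.
ELEMENT = "oxygen/element"          # oxygen atoms, in any compound
DISSOLVED_O2 = "oxygen/dissolved"   # diatomic oxygen dissolved in water
BREATHABLE_O2 = "oxygen/breathable" # free diatomic oxygen a person can breathe

# (place, sense) -> does the place contain oxygen in that sense?
# Values are illustrative judgments, not measured data.
CONTAINS = {
    ("atmosphere", ELEMENT): True,
    ("atmosphere", BREATHABLE_O2): True,
    ("ocean", ELEMENT): True,
    ("ocean", DISSOLVED_O2): True,
    ("ocean", BREATHABLE_O2): False,
    ("crust", ELEMENT): True,
    ("crust", BREATHABLE_O2): False,
    ("moon", ELEMENT): True,
    ("moon", BREATHABLE_O2): False,
}


def contains(place, sense=None):
    """Answer 'does <place> contain Oxygen?'.

    With a specific sense the answer is a plain lookup. With the bare term
    (sense=None) the answer is True or False only if all recorded senses
    agree; otherwise it is None, meaning 'depends on what you mean'.
    """
    if sense is not None:
        return CONTAINS[(place, sense)]
    answers = {value for (p, _), value in CONTAINS.items() if p == place}
    return answers.pop() if len(answers) == 1 else None


print(contains("atmosphere"))      # all senses agree
print(contains("moon"))            # senses disagree
print(contains("moon", ELEMENT))   # unambiguous once the sense is fixed
```

The bare term is comfortable to confirm only for the atmosphere; for the ocean, the crust and the moon the question has no single answer until a sense is chosen.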
If we go to work seriously splitting up 'Oxygen', there are the senses of

'Oxygen as something essential to life' (How do you understand the Sweet song "Love Is Like Oxygen"?)

'Oxygen as a medical treatment' (It's got some unique identifier that's provided to your health insurer to identify it as such)

A number of allotropes of Oxygen (I'd think that any worldly person ought to know about O2 and O3, but O2 actually has 'singlet' and 'triplet' forms, and there are a few bizarre allotropes that have been made in the laboratory.)

A number of isotopes of Oxygen

If we stopped somewhere near there, we'd have covered most of it, but scientists can split terms further. For instance, you could have O2 in the triplet state with one 12O atom and one 23O atom. I suppose physicists could ask 'what if' the binding energies and masses of quarks were a bit different and how that would affect stellar nucleosynthesis, so somebody could make assertions about the nuclear properties of oxygen isotopes.

Note that the specification of terms like that is a process of division (creating a finer vocabulary) and then composition (using combinations of terms to specify new terms.)

Thinking this way, it's clear that owl:sameAs is something that ought to be taken with a grain of salt. In fact, it's the least of the problems that we face trying to 'make sense' of terms.

--------

Getting practical, I'm working on a vernacular vocabulary for human settlements. You might think this is something you could get in a can, but you can't quite. The real requirement here is that end users see terms that match the vernacular language they use. For instance, you can't tell people that 'Wisconsin' or 'Kanagawa' are '2nd level administrative subdivisions', but rather, that 'Wisconsin' is a 'State' and 'Kanagawa' is a 'Prefecture'.
Just about anybody will tell you that "London is a city" and "Tokyo is a city", but neither of those is legally true; both of those are administrative divisions larger than a city. However, if you make a list of "major global cities", people are going to think you're nuts if you don't put them on the list.

Lately I've been quite attracted to the Germanic concepts of 'Stadt' (which avoids the need to make the arbitrary division between 'City' and 'Town') and 'Dorf'; I live in something that definitely isn't a 'Stadt', and maybe not even an Anglo-Saxon 'Town' in technical terms. (It feels more like an 'administrative division' that happens to have maybe 3-6 'Dorfs' in it and a lot of public land, scattered houses and farms.) However, I write a tax check every year to the 'Town of Caroline', so I've got to say it's a 'Town'.

I'm finding a need to create a specific 'Vernacular' vocabulary layer so my systems make sense to people: so they can understand what they see on the screen and so full-text search works right. The main operational definition is "What do people commonly call it?" If I can't find a specific term, I fall back on defaults.

Systems like this suck at reasoning, however, because the terms are all imprecise, so for a lot of applications, vernacular layers are going to need to coexist with more specific terminology layers that have the right technical properties for solving certain kinds of problems.
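A rough sketch of how the two layers might sit side by side, assuming a simple record per place. The `Place` class, the `DEFAULT_LABELS` table and the `display_type` function are hypothetical names invented for this illustration, not part of any existing system; the technical class strings and the default labels are likewise made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    name: str
    technical_class: str              # precise layer, meant for reasoning
    vernacular: Optional[str] = None  # what people commonly call it, if known


# Hypothetical defaults used when no vernacular term has been found.
DEFAULT_LABELS = {
    "administrative subdivision": "Region",
    "settlement": "Place",
}


def display_type(place: Place) -> str:
    """Label shown to end users: vernacular term first, then a default."""
    if place.vernacular:
        return place.vernacular
    return DEFAULT_LABELS.get(place.technical_class, "Place")


places = [
    Place("Wisconsin", "administrative subdivision", "State"),
    Place("Kanagawa", "administrative subdivision", "Prefecture"),
    Place("Tokyo", "administrative subdivision", "City"),
    Place("Caroline", "administrative subdivision", "Town"),
    Place("Somewhere", "administrative subdivision"),  # no vernacular term known
]

for p in places:
    print(f"{p.name}: {display_type(p)} (technical: {p.technical_class})")
```

The vernacular label is what gets displayed and indexed for full-text search, while `technical_class` stays available for anything that needs precise semantics.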
Received on Thursday, 13 May 2010 14:07:53 UTC