- From: Jim Hendler <hendler@cs.umd.edu>
- Date: Mon, 29 Mar 2004 18:36:37 -0500
- To: "Uschold, Michael F" <michael.f.uschold@boeing.com>, "Bernard Vatant" <bernard.vatant@mondeca.com>, "SWBPD" <public-swbp-wg@w3.org>
Great starting place - here's my counter example >To shed light on what it means to mismodel something, let us >identify some criteria for assessing the 'goodness' of ontology >modeling choices. Note that most are far more difficult to measure >than precision/recall. Here are a few off the top of my head. >1. logical consistency - Lets hope we can all agree on this one. It >can also be easy to measure, if the logic is complete. on the semantic web I believe most interesting applications will be logically inconsistent - here's a real world case - we defined a terrorism ontology for one of our funders - being good logicians we asserted that a person has a single value for his/her height. However, when filling it in we found different people had different opinions of the height value for Osama Ben Laden (some documents say he is 6'5", some say 6'3") - so in the end our application ends up not enforcing cardinality constraints, but rather tracking provenance, and our ontology, in any sense of the word, is inconsistent because we have numerous instances that violate the logic (and this is a simple case, we have much more complex ones that clearly make our ontology formally inconsistent) >2. various OntoClean criteria, which help to identify since you don't say what these are, I cannot argue >3. perspecuity: it should be easy to look at and understand a model. >Some 'correct' approaches may be very convoluted, but this make them >less desirable. well, this one I guess I agree with, but of course Ian has lots of examples of things that would be easy to say in OWL Full (like the metamodeling itself) that require much less perspicuous representation in DL - in fact, the OWL Guide is full of them -- however, other people argued to me that making the representation easier for machines to reason over was more important than having the model be human understandable.. >4. similar things should be modeled similarly, this also helps perspecuity. I think similar things should be modeled differently when used for different purposes -look at some of the differences in representing people between FOAF, used to be a kind of generic representation for lots of stuff about people, and OpenCyc, meant to be a fairly precise representation for inferencing about what kinds of people are in different categories, etc. Seems to me that (i) both are useful, (ii) they are not directly mappable, and (iii) the users of one would be extremely unhappy using the other. > >We can also identify general patterns of ways that tend to commonly >arise that have a low score by these criteria, and recommend these >as bad practice. > Look - my goal here isn't to be difficult -- it's to remind everyone that we are not writing up "AI Ontology" Best Practices. We're writing up SEMANTIC WEB best practices, and we're still very early in that game, largely making it up as we go along. We have already learned important lessons in Semantic Web projects(for example, it is often better to build an ontology for your corporate need from existing data representations than from scratch) but there are many things we think we know, but when we go to apply them ON THE WEB we discover the game is different. Let's focus on sharing the things we're learning from applying RDF and OWL, not from the previous years of other languages -- they are simply not the same -JH p.s. Someone offered me a great analogy the other day - he showed me some papers from early 90s hypertext conference than basically recommended that you don't link to things off your own web site, as this could lead to 404 and other unanticipated errors (and showed empiricially that this was the case). Of course, in a certain sense that is a good hypertext best practice, but it turned out to be a laughably foolish one for the Web -- let's try to avoid having people laugh at us ten years from now... -- Professor James Hendler http://www.cs.umd.edu/users/hendler Director, Semantic Web and Agent Technologies 301-405-2696 Maryland Information and Network Dynamics Lab. 301-405-6707 (Fax) Univ of Maryland, College Park, MD 20742 240-277-3388 (Cell)
Received on Monday, 29 March 2004 20:16:22 UTC