RE: philosophy of SWBPD (was Re: [OPEN] and/or [PORT] : a practical question)

Great starting place. Here are my counterexamples.

>To shed light on what it means to mismodel something, let us 
>identify some criteria for assessing the 'goodness' of ontology 
>modeling choices. Note that most are far more difficult to measure 
>than precision/recall.  Here are a few off the top of my head.
>1. logical consistency - Let's hope we can all agree on this one. It 
>can also be easy to measure, if the logic is complete.

On the Semantic Web I believe most interesting applications will be 
logically inconsistent. Here's a real-world case: we defined a 
terrorism ontology for one of our funders and, being good logicians, 
we asserted that a person has a single value for his/her height. 
However, when filling it in we found that different sources gave 
different values for Osama bin Laden's height (some documents say 
he is 6'5", some say 6'3"). So in the end our application does 
not enforce cardinality constraints, but rather tracks 
provenance, and our ontology is inconsistent in any sense of the 
word, because we have numerous instances that violate the 
logic. (And this is a simple case; we have much more complex ones 
that clearly make our ontology formally inconsistent.)
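
To make the height case concrete, here is a minimal sketch in Python of
"track provenance instead of enforcing cardinality". It is only an
illustration, not the actual application: the names Claim and
ProvenanceStore, the identifier ex:person1 and the doc-* source labels
are all hypothetical. The store accepts every assertion, remembers which
source made it, and reports violations of a single-valued property
rather than rejecting the data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str   # hypothetical identifier for the individual
    prop: str      # property name, e.g. "height"
    value: str     # asserted value, kept as the source wrote it
    source: str    # hypothetical label for the asserting document


class ProvenanceStore:
    """Keeps every claim with its source; never rejects conflicting values."""

    def __init__(self, functional_props):
        # Properties the ontology declares as single-valued.
        self.functional_props = set(functional_props)
        self._claims = defaultdict(list)

    def add(self, claim):
        self._claims[(claim.subject, claim.prop)].append(claim)

    def values(self, subject, prop):
        """Map each distinct value to the sorted sources asserting it."""
        by_value = defaultdict(list)
        for claim in self._claims.get((subject, prop), []):
            by_value[claim.value].append(claim.source)
        return {value: sorted(srcs) for value, srcs in by_value.items()}

    def conflicts(self):
        """List (subject, prop) pairs that break a single-value constraint."""
        found = []
        for (subject, prop), claims in self._claims.items():
            distinct = {claim.value for claim in claims}
            if prop in self.functional_props and len(distinct) > 1:
                found.append((subject, prop))
        return sorted(found)


store = ProvenanceStore(functional_props={"height"})
store.add(Claim("ex:person1", "height", "6'5\"", "doc-a"))
store.add(Claim("ex:person1", "height", "6'3\"", "doc-b"))
store.add(Claim("ex:person1", "height", "6'5\"", "doc-c"))

print(store.values("ex:person1", "height"))
print(store.conflicts())
```

The design choice in this sketch is that a conflict is something to
report and query, not an error: the data stays usable even though, read
as logic, it contradicts the single-value assertion.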

>2. various OntoClean criteria, which help to identify

Since you don't say what these are, I cannot argue.

>3. perspicuity: it should be easy to look at and understand a model. 
>Some 'correct' approaches may be very convoluted, but this makes them 
>less desirable.

Well, this one I guess I agree with, but of course Ian has lots of 
examples of things that would be easy to say in OWL Full (like the 
metamodeling itself) that require a much less perspicuous 
representation in DL. In fact, the OWL Guide is full of them. 
However, other people have argued to me that making the representation 
easier for machines to reason over is more important than having the 
model be human-understandable.

>4. similar things should be modeled similarly, this also helps perspicuity.

I think similar things should be modeled differently when used for 
different purposes. Look at some of the differences in representing 
people between FOAF, which is used as a kind of generic representation 
for lots of stuff about people, and OpenCyc, which is meant to be a 
fairly precise representation for inferencing about what kinds of 
people are in different categories, etc. Seems to me that (i) both 
are useful, (ii) they are not directly mappable, and (iii) the users 
of one would be extremely unhappy using the other.

>
>We can also identify general patterns of ways that tend to commonly 
>arise that have a low score by these criteria, and recommend these 
>as bad practice.
>

Look, my goal here isn't to be difficult. It's to remind everyone 
that we are not writing up "AI Ontology" Best Practices. We're 
writing up SEMANTIC WEB best practices, and we're still very early in 
that game, largely making it up as we go along. We have already 
learned important lessons in Semantic Web projects (for example, it is 
often better to build an ontology for your corporate need from 
existing data representations than from scratch), but there are many 
things we think we know where, when we go to apply them ON THE WEB, 
we discover the game is different.

Let's focus on sharing the things we're learning from applying RDF 
and OWL, not from the previous years of other languages. They are 
simply not the same.
  -JH
p.s. Someone offered me a great analogy the other day. He showed me 
some papers from an early-90s hypertext conference that basically 
recommended that you not link to things off your own web site, as 
this could lead to 404s and other unanticipated errors (and showed 
empirically that this was the case). Of course, in a certain sense 
that is a good hypertext best practice, but it turned out to be a 
laughably foolish one for the Web. Let's try to avoid having people 
laugh at us ten years from now...
-- 
Professor James Hendler			  http://www.cs.umd.edu/users/hendler
Director, Semantic Web and Agent Technologies	  301-405-2696
Maryland Information and Network Dynamics Lab.	  301-405-6707 (Fax)
Univ of Maryland, College Park, MD 20742	  240-277-3388 (Cell)

Received on Monday, 29 March 2004 20:16:22 UTC