Re: Exact Search from Drew McDermott on 2006-03-18 (public-sws-ig@w3.org from March 2006)

From: Drew McDermott <drew.mcdermott@yale.edu>
Date: Sat, 18 Mar 2006 17:12:05 -0500
To: public-sws-ig@w3.org
Message-ID: <17436.34229.392935.385853@DVM-Powerbook.local>
> [Gerard McGovern]
>  
> I have an area of interest which I am hoping Semantic Web Services
> Interest Group  may take an interest in. It covers some of the same
> territory as "web services" and the "semantic web".
> I call the area "exact search". 

I have read your essay, and it's quite interesting.  If it has not
gotten much of a response, that's because it falls outside the narrow
technical scope of most of the work discussed on this mailing list.  On
the other hand, it's not that far outside.

If you've been following the recent debate about the prospects for
extending WSDL (but not standing too close, I hope), you can see that
a key issue is how to build on what already exists in getting to a
"semantic" web.  For some people the slightest departure from the
current WSDL path ordained by Microsoft and IBM is suicide (even if we
don't know what the path is yet!).  But there is a fairly large
community (or perhaps several such communities) out there that are
willing to imagine the future from the top down rather than from the
WSDL up.  The one that comes closest to your vision is what could be
called the "Common Logic" community, after the effort being led by Pat
Hayes and John Sowa.  (http://philebus.tamu.edu/cl/)  They start from
the question, "How can we represent rather arbitrary pieces of
information that might be located on the web?," which is not that
different from some of the questions you're asking.

I wish I had the time to go through your document point by point and
make comments, but I don't.  So let me make a few general remarks:

- The classic problem for proposals like yours is, Who formalizes all
  that information?  The simplest pieces can come from existing
  databases, interpreting relational tuples as atomic formulas.  It
  gets harder to see how the various pieces of formal representation
  are going to be generated that are required to answer a question
  like "What is the product number 'NM23000Z' used for?"
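
  To make the easy case concrete, here is a minimal sketch of reading
  relational tuples as ground atomic formulas.  The table name, the
  column layout, and the row values are hypothetical, invented only
  for illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    """A ground atomic formula: predicate(arg1, ..., argN)."""
    predicate: str
    args: tuple


def tuples_to_atoms(table_name, rows):
    """Read each relational tuple as the atom table_name(v1, ..., vn)."""
    return [Atom(table_name, tuple(row)) for row in rows]


def show(atom):
    """Render an atom in conventional predicate notation."""
    return f"{atom.predicate}({', '.join(map(str, atom.args))})"


# Hypothetical table "product" with columns (product_number, description).
rows = [("NM23000Z", "valve"), ("NM23001A", "gasket")]
atoms = tuples_to_atoms("product", rows)
for atom in atoms:
    print(show(atom))
```

  Nothing in these atoms says what the product is used for; that
  knowledge has to be formalized by someone, which is the hard part.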

- Your proposals for how terms should be used on the web make a lot of
  sense.  However, much as I admire Wittgenstein, I don't see how the
  idea of "language game" is such a huge improvement on "ontology."
  People tend to think that "ontologies" are axiomatic theories that
  explain everything there is to know about a set of terms.  This is
  ridiculous on its face, for the reasons you allude to.  Most of the
  meaning of common terms can't be captured by necessary and
  sufficient conditions.  A set of terms can be related to other terms
  using axioms, but (a) extreme care has to be taken in treating the
  exceptions to the axioms; and (b) one inevitably needs to introduce
  further terms, which either commits you to axiomatizing the whole
  world or (the sane alternative) forces you to quit with lots left
  out.

  But so what?  If we put up a web node that uses a certain
  vocabulary, we basically want to make sure that we're using the
  terms the way other people putting up web nodes do.  The ontologies
  are there to (a) help document what the terms are supposed to mean;
  and (b) make sure the computers can do some of the reasoning we
  would do with those terms if we didn't want to delegate it to the
  computers.  If an ontology is lacking some important constraints
  relating the terms, causing confusion to people or inference failure
  by computers, then we add axioms to it.  The ontology is never
  complete in any sense.
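
  As a rough illustration of "inference failure" repaired by adding an
  axiom, here is a toy forward chainer.  The rule format, the "?x"
  variable convention, and the Sedan/Vehicle vocabulary are all
  assumptions made up for this sketch, not any real ontology language.

```python
def unify(pattern, fact, bindings):
    """Match one pattern against one ground fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            # Variable: bind it, or check consistency with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match_all(premises, facts, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        b = unify(premises[0], fact, bindings)
        if b is not None:
            yield from match_all(premises[1:], facts, b)


def forward_chain(facts, rules):
    """Apply rules (premises, conclusion) until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Materialize matches first so 'known' is not mutated mid-iteration.
            for b in list(match_all(premises, known, {})):
                derived = tuple(b.get(term, term) for term in conclusion)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


facts = {("Sedan", "car1")}
query = ("Vehicle", "car1")

# With no axioms the query fails: the ontology is missing a constraint.
before = query in forward_chain(facts, [])

# Add the missing axiom: Sedan(?x) -> Vehicle(?x).
axiom = ([("Sedan", "?x")], ("Vehicle", "?x"))
after = query in forward_chain(facts, [axiom])
```

  The ontology with the extra axiom is still nowhere near complete; it
  just no longer fails on this particular inference.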

- Your idea of employing interpreters is a good one.  It's been
  explored by several people including me and my students.  

  Dejing Dou, Drew McDermott, and Peishen Qi 2005. "Ontology
  translation on the Semantic Web."  LNCS Journal on Data
  Semantics 2, pp. 35-56.

  José Luis Ambite, Craig A. Knoblock, Ion Muslea, and Andrew Philpot
  2001. "Compiling source descriptions for efficient and flexible
  information integration."  J. Intelligent Information Systems
  16(2), pp. 149-187.

  The idea of "mediator" in WSMO (Web Service Modeling Ontology) is
  similar to what you call "interpreter."  See http://www.wsmo.org/

  There's an even larger literature on matching two vocabularies in
  order to find mappings that an interpreter might be built on.  E.g.:

  Jayant Madhavan, Philip A. Bernstein, Pedro Domingos, and Alon Halevy
  2002. "Representing and reasoning about mappings between domain
  models."  Proc. AAAI 2002.

  The database community has been working on related
  problems for years.  See

  E. Rahm and P. Bernstein 2001. "A survey of approaches to automatic
  schema matching."  VLDB Journal 10(4), pp. 334-350.
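
  For a sense of what the simplest possible "interpreter" between two
  vocabularies looks like, here is a sketch that rewrites atoms using
  a hand-written predicate mapping.  It is far cruder than the
  approaches in the papers above, and the predicate names, the mapping
  format, and the sample atoms are hypothetical.

```python
def translate(atoms, mapping):
    """Rewrite atoms from a source vocabulary into a target vocabulary.

    mapping: source_predicate -> (target_predicate, argument_order),
    where argument_order lists source argument positions in target order.
    Atoms whose predicate has no mapping are returned separately.
    """
    translated, untranslated = [], []
    for pred, *args in atoms:
        if pred in mapping:
            target_pred, order = mapping[pred]
            translated.append((target_pred, *(args[i] for i in order)))
        else:
            untranslated.append((pred, *args))
    return translated, untranslated


# Hypothetical mapping: sells(vendor, item) becomes offered_by(item, vendor).
mapping = {"sells": ("offered_by", (1, 0))}

source_atoms = [
    ("sells", "acme", "NM23000Z"),
    ("color", "NM23000Z", "red"),  # no counterpart in the target vocabulary
]

translated, untranslated = translate(source_atoms, mapping)
```

  The untranslated list is the interesting output: it shows where the
  two vocabularies fail to line up, which is where the matching work
  surveyed above comes in.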

- One other reason to resist the use of the phrase "language game" is
  that it runs the risk of anthropomorphizing those poor dumb
  computers.  I'm sure Wittgenstein would say that even if the
  Semantic Web is a fantastic success, it will still be the people
  playing the language games, using the computers as tokens.

  Wittgenstein was trying to explain how meaning works, and the phrase
  "semantic web" makes it sound as if meaning is somehow critical to
  our enterprise.  It is _not_.  Our central problem is _inference_.

-- 

                                         -- Drew McDermott
                                            Yale University
                                            Computer Science Department
Received on Saturday, 18 March 2006 22:09:33 UTC