- From: pat hayes <phayes@ai.uwf.edu>
- Date: Fri, 18 May 2001 22:16:58 -0500
- To: Stefan Decker <stefan@db.stanford.edu>
- Cc: www-rdf-logic@w3.org
>Hi Pat,
>
>>>Rather we focus on small subsets and worry how to make them interoperable.
>>
>>What exactly does 'interoperable' mean? Does it imply mutual
>>consistency, for example? (If not, what does it mean?) If so, then
>>it would seem to presume that the people/agents/thingies in these
>>small subsets are at least using a language to communicate with one
>>another that has a clear notion of mutual consistency. And that
>>requires a semantics.
>
>You are arguing in the abstract. Let's look into a more concrete example,

OK, but I will hold you to that. Read on.

>eg. the scenario that Tim, Jim and Ora constructed in the Scientific
>American example:
>http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html
>
>"At the doctor's office, Lucy instructed her Semantic Web agent through her
>handheld Web browser. The agent promptly retrieved information about
>Mom's prescribed treatment from the doctor's agent, looked up several
>lists of providers, and checked for the ones in-plan for Mom's insurance
>within a 20-mile radius of her home and with a rating of excellent
>or very good on trusted rating services."
>
>Let's translate this to the actual data flow on the web:

OK, but the data flow doesn't interest me; I was talking about the content being transmitted and how it is encoded.

>Lucy's agent contacted the doctor's agent.
>The doctor's agent is a webservice the doctor offers on the Web.
>The webservice understands a certain query language and delivers
>the resulting data in
>a single, simple data schema (read: Ontology).

OK; now let's think about what that query language needs to be able to say. It can talk about distances from locations, insurance providers, prescribed treatments, and ratings, for a start. Presumably these concepts are not going to be incorporated into the very syntax of the scheme - if they were, these agents wouldn't be able to talk about anything else.
So the notation through which this information is conveyed must be capable of supporting inferences involving some facts about quite a rich variety of entities. It must be capable of expressing disjunction (to be able to say that any of the providers in the list will do), quantification over numerical ranges (to get the 20 miles right), arithmetic (to compare distances and costs), and negation (to be able to infer that some providers are not in-plan). Maybe some of these can be hacked around in various ways, but overall this seems like enough of a semantic burden to take us well beyond current DAML+OIL expressiveness, say.

>Then Lucy's agent looked up several lists of providers, probably from a
>Yellow Pages webservice, which again understands a certain, simple
>query language
>and provides data in a simple data format.

A Yellow Pages service must be capable of receiving queries about almost any topic under the sun. I would hesitate before calling this a 'simple data format'. This would be a large-scale research challenge, well beyond the current state of the art in ontology design.

>Then each provider is contacted. Same game: each provider
>understands a simple
>query language and provides data in a simple data format.

What makes these so 'simple', in your view? I see hard problems everywhere here. If all the medical providers have agreed on a medical-provider-ML format, then of course things might be relatively simple for the agents, but the queries are likely to be pretty complex in any case, and something has to be able to translate from more general-purpose formats to this hypothetical medical-provider format. None of this is 'simple'.

>We are not talking about large, sophisticated ontologies - we are
>talking about
>domain models for small domains and services.

General-purpose reasoning, even about small domains and services, is NOT simple, and it does require sophisticated ontologies.
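To make the point concrete, here is a minimal sketch of the condition Lucy's agent would have to evaluate. All the field names, plan names, ratings, and coordinates are invented for illustration; the point is only that even this toy version of the query bundles together set membership, negation (excluding out-of-plan providers), arithmetic over a numeric range (the 20-mile radius), and disjunction (excellent OR very good):

```python
import math

# Hypothetical provider records; the schema is invented for illustration,
# not taken from any real medical-provider format.
providers = [
    {"name": "A", "plans": {"AcmeCare"}, "rating": "excellent", "lat": 30.45, "lon": -87.20},
    {"name": "B", "plans": {"OtherPlan"}, "rating": "very good", "lat": 30.40, "lon": -87.25},
    {"name": "C", "plans": {"AcmeCare"}, "rating": "fair", "lat": 31.90, "lon": -88.10},
]

home = (30.42, -87.22)  # Mom's home, as (latitude, longitude)

def miles(p, q):
    # Crude equirectangular distance in miles; adequate for a radius check.
    dlat = (p[0] - q[0]) * 69.0
    dlon = (p[1] - q[1]) * 69.0 * math.cos(math.radians(p[0]))
    return math.hypot(dlat, dlon)

def acceptable(prov):
    in_plan = "AcmeCare" in prov["plans"]                       # negation: out-of-plan providers excluded
    close = miles((prov["lat"], prov["lon"]), home) <= 20.0     # arithmetic + quantification over a numeric range
    well_rated = prov["rating"] in ("excellent", "very good")   # disjunction over ratings
    return in_plan and close and well_rated

print([p["name"] for p in providers if acceptable(p)])  # prints ['A']
```

Of course this hard-codes every concept into the program; the Semantic Web proposal is precisely that such conditions be stated in a declarative exchange language, which is where the expressiveness burden lands.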
I think you are falling into a well-known fallacy that AI has learned to avoid: the idea that things that people find easy must be relatively easy to hack up in simple terms.

>The challenge is now: we have 1 Billion different simple query
>languages and data structures.
>What is the common ground that we relate each data set to each other?
>We are not really talking about semantics here - that is another question.

No, you are talking about semantics. How do you think that a billion data formats are going to be made consistent without considering semantics? (Perhaps 'semantics' means something different in database land?)

>We are talking about the foundation that is necessary
>to relate 1 Billion webservices to each other

That "relate" means SEMANTIC relationships. If all we need to do is connect them without caring what they mean, they can all just use HTML. The point is to connect their CONTENT.

>and that saves one from writing a
>converter from each of the 1 Billion webservices to each other.
>The solution is to come up with a joint data model - and guess what
>- yes, graphs.

Graphs are merely a notational device. They are not a data model (in any useful sense). They can represent anything, as we have all known for a very long time, but (for all but very simple data) the conventions that define the meanings of those representations are not themselves in the graph: they are encoded in the labellings of the nodes and arcs.

>The database community has given the answer a couple of years ago and
>came to graphs as a data representation mechanism.

Congratulations. You were about a century late, but I'm glad y'all finally made it.

>The basic idea is
>that every kind of data can be represented as graphs. Thus this provides
>a common ground and allows integration algorithms to work easily with
>multiple sources. Of course this is NOT a solution to resolve semantic
>differences - it is just the necessary first step to provide a common
>infrastructure.
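The point that a labelled graph is a trivial common currency - it reduces to a table of textual triplets at linear cost, with all the meaning carried by the labels - can be made concrete in a few lines. The node and arc labels here are invented for illustration:

```python
# A tiny labelled graph: each arc is (subject, arc-label, object).
# Labels are invented for illustration; nothing here fixes their meaning.
graph = {
    ("Lucy", "hasAgent", "agent42"),
    ("agent42", "queried", "doctorService"),
    ("doctorService", "returned", "treatmentPlan"),
}

def to_triples(g):
    # One line of text per labelled arc: subject <TAB> label <TAB> object.
    # Linear in the number of arcs (plus the cost of sorting for stable output).
    return sorted("\t".join(edge) for edge in g)

for line in to_triples(graph):
    print(line)
```

The encoding is mechanical in both directions, which is exactly why it settles nothing semantic: whether "hasAgent" licenses any inference lives entirely outside the graph.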
>Have I made clear, that I'm not talking about semantics here?

If you are not, then that makes what you say irrelevant, since without semantic connections it is *trivial* to create a "common infrastructure". Plain ASCII text could be a "common infrastructure", if we don't care what it means. (Hey, it was good enough for Gutenberg, why not?) Or arbitrary graphs, as you seem to prefer, though they provide no great advantage over text strings. Any graph can be encoded as a table of textual triplets, after all, with only linear cost.

>Should I repeat it?

It won't get any better the second time.

......

>
>From database research, it is well known that semi-structured data
>(a graph form) is useful for mapping between heterogeneous datasources.
>(see eg.

Thanks for the pointers, which I have (rapidly) scanned. As far as I can see, the following is mostly concerned with data in the sense that it is retrieved for human use, so that for example data formats should be 'self-explaining' in ordinary-language text, i.e. self-documented. But this is of no interest for the Semantic Web, surely, which is supposed to be allowing interoperability between mechanical inference engines, not human readers.

>Papakonstantinou, Y.; Garcia-Molina, H.; Widom, J.
>Object Exchange Across Heterogeneous Information Sources
>1994, ICDE '95
>http://dbpubs.stanford.edu/pub/1994-8

....

>This following paper shows that there is much more to do in this space.
>There is a lot of fine structure here that needs to get exploited, which
>really helps to resolve semantic differences in a cost-effective way.
>Also this paper hardly scratches the surface.
>
>A Layered Approach to Information Modeling and Interoperability on the Web
>by Sergey Melnik, Stefan Decker
>ECDL 2000 Workshop on the Semantic Web
>21 September 2000, Lisbon, Portugal

As far as I can tell from reading this paper, we are in broad agreement.
You also refer to the translations of more complex logical forms (such as lists) as "implementations" in triples, at any rate. However, you say a number of things in this paper that you really shouldn't have said. For example:

"Terms and expressions in these languages are first-class objects that can be manipulated on the object layer. In this way, applications can dynamically learn the semantics of previously unknown languages."

which is just fantasy. (But now I have some idea where Tim B-L gets some of his wilder ideas from, maybe?)

"Reification of links and associations..... These two kinds of reification provide the necessary prerequisites for computational reflection, i.e. the capability for a computational process to reason about itself [Smi96]."

Well, necessary, but nowhere even close to sufficient, so this is very misleading. You need, in addition, at least upward and downward reflection and a truth-predicate, an ability to quantify over the reified syntax, an ability to describe the structure of the reified syntax, and probably some way to combine a least-fixed-point semantics for reflection termination with a model theory for the Krep language (a theoretical task that, as far as I know, is beyond even the Scott-Plotkin semantics for the lambda-calculus). As far as I know, this has never been implemented in any working system. LISP is probably the closest, but it is purely a functional evaluation language, and doesn't have quantifiers, so to call it 'reasoning' is stretching the terminology.

Pat

---------------------------------------------------------------------
IHMC                          (850)434 8903   home
40 South Alcaniz St.          (850)202 4416   office
Pensacola, FL 32501           (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Friday, 18 May 2001 23:16:59 UTC