- From: pat hayes <phayes@ai.uwf.edu>
- Date: Thu, 13 Mar 2003 17:16:02 -0600
- To: Tim Berners-Lee <timbl@w3.org>
- Cc: Public W3C <www-archive@w3.org>
>On Tuesday, Mar 4, 2003, at 12:40 US/Eastern, pat hayes wrote:
>[...]
>
>>>>What is a definition?
>>>
>>>A definition is a text which describes the meaning of a term.
>>
>>Describes to who or to what? You have to say who (or what) is expected to be able to read the text and extract the meaning. All this debate turns on the issue of having 'texts' readable by software, and what the limits of this are defined to be. Software can't read English text, but it can read and use RDF/S/OWL text.
>
>No, this debate does *NOT* turn on the software reading the specs.

If specs count as definitions, then it does. But who mentioned specs? I asked you to define 'definition', and you started talking about texts, and I asked you who or what is reading the text, and you refer to specs. I feel like the ground is shifting as we talk.

The semantic web depends - MUST depend - on having *meanings* processed by software. I think this is the basis of the mismatch in communication we are having over this issue. Read on for details.

>To situate RDF and OWL within the real world, you rely on particular vocabularies which are processed by software which has been written by a human being who has read the spec.

True. But notice, the writer of the software has read the spec of the formalism - in our case, RDF, say - NOT the spec, if there is one, which defines the meaning of the particular vocabulary. He can't possibly have read that, in general, because the software he writes has to be able to handle vocabularies written to conform to specifications of meaning which have not yet been written or even thought of when he is writing the software. The RDF spec does not define the meaning of every RDF vocabulary. It would be useless if it did. Contrast this with, say, a Java spec, which does in a sense define the meaning of every Java program, at least to the extent that a Java interpreter written to the spec is capable of running any Java program that will ever get written.
You keep repeating this, that software is written by people who read specs. It is of course true, but it is beside the point here. The SWeb puts us in a position (familiar to AI programmers) where we need to write software which ITSELF will be processing formal representations of meaning. When writing CWM, you weren't able to access the specs of the URIs in all the RDF that CWM might ever process. At that point, the spec which determines the meaning, and the programmer who writes the code, have an impassable barrier between them. The software has to operate on its own.

>Open Financial Exchange bank statements are generated inside a bank by a programmer who has read the OFX spec and understands what the fields mean. On the consumer's side, on your desktop, the document is read by software (such as Quicken) which was written by people who read or even wrote the OFX spec. (OFX is actually an SGML application but it could be an RDF application and the RDF mapping is straightforward.) If you as a user use RDF rules to make a USA income tax return from that data, then you generate info in a vocabulary of IRS Form 1040 line numbers.

Maybe so, but the example seems to illustrate my point rather than yours: the RDF spec doesn't say anything about the meaning of this vocabulary. And there is nothing in the spec of RDF which says, in fact, that an RDF vocabulary need even have a spec, or be defined anywhere by anyone. It's just a vocabulary which someone is using to say something about something. RDF can be used by anyone to say anything about anything, right? There are ontologies out there already which talk about all kinds of things, e.g. take a look at the emerging DAML-time ontology (not yet put into DAML). There's lots of English in the document, but nobody is going to claim that the meaning of the English is part of the meaning *of the axioms*.

>The meaning of these fields is defined in a human-readable document which is online.

In your example; but not in general.
And even if it is, it may have no obvious connection to the actual base URI. For example, take RDF. If you follow the base URI of any piece of the rdf: namespace, e.g. rdf:type, you get to a page full of, guess what, RDF. You don't get to the RDF spec, which is another document altogether. Similarly for DAML, by the way.

>The rules which you might use to classify your spending and income may be written in OWL and the data may be processed by an inference machine.
>
>Specification-wise, the inference engine is only authorized to make those inferences because the OWL predicates, when looked up, point to the OWL spec. The inference engine was not written by a computer program, it was written by a person who read the English OWL spec.

Of course, but this doesn't say very much. The point at issue here is saying what the meaning of a particular OWL vocabulary is. Most OWL vocabularies won't have any particular link to any spec at all, machine-readable or otherwise.

>>>For example, http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#property is (part of) an English definition of a rdf:Property.
>>>
>>>I know it's a bit of a shock, but we are not being formal here.
>>
>>I know we aren't being formal, but we do need to be more precise. One can be precise using English, believe it or not.
>>
>>>This definition of property is a rough and ready English definition. But we get by with it and the rest of the stuff about predicates in the document.
>>
>>Believe me, there are a lot of people in the various WGs who do NOT get by with this rough and ready a definition. Hence, I suspect, many of the communication problems we are having here.
>
>Well, the English definition of the property is what we have. It includes reference to some mathematical stuff in the spec, but I'd note that the model theory is invoked by the English, when you look at how this spec gets its authority, not the other way around.
That states a position in what is basically a debate within the philosophy of language. Myself, I don't agree with it; very few working mathematicians would, I suspect, and very few programmers would agree that the meaning of a programming language was rooted in the English of the manual rather than, say, the code of the interpreter. But in any case, it's by no means obvious or even plausible as an account of where 'authority' arises here. Many philosophers would say that English meaning is rooted in a kind of social compact between cognitive agents, and many cognitive scientists (though not all) would go on to base that compact, in turn, in a presumed internal representation inside those agents' heads, and some (though not all) would say that those internal representations' relationships to the world they describe could be understood using model-theoretic terms. You can go around and around this circle, and it's not clear where, if anywhere, you bottom out in an 'authority' or a 'grounding'. The truth is probably more that meaning arises as a complex system phenomenon, and different theorists can find its 'origin' almost anywhere in the system, which tends to be where their particular discipline stops looking for explanations. I know some extremely clever folk who are sure that meaning is rooted, ultimately, in how the equations of quantum gravity affect the chemistry of microtubules inside neurons.

By the way, your dismissive phrase 'mathematical stuff' isn't likely to endear your way of thinking to the technicians who are actually getting your semantic web built. You really ought to use a nicer tone when talking about the workmen, particularly when they might be listening. And it misses the essential point that 'mathematical stuff' can be about non-mathematical things. That is usually why mathematics is so useful, in fact: it's just a very precise language for talking about whatever you are talking about, it's not a geekish jargon for saying nothing in particular.
God alone knows, this entire discussion could do with a little precision.

>>>>>of the predicate, as applying to the subject and object identified by
>>>>
>>>>How do the subject and object identify things?
>>>
>>>Um.. by using a URI, where sender and receiver share information
>>
>>And if the sender and receiver are software, what does it mean to say that they share information?
>
>For formal stuff, there is a mapping from ?x to ?x.log:semantics which is common (modulo real world effects such as power outages, lying and cheating, etc) to both users.

I do not follow your meaning here. What are ?x and ?x.log:semantics? And what has this got to do with the question I asked?

>For most software today, such as Quicken or Microsoft Money or Apple iCal, the spec has been read by a programmer who made the code.

You keep saying things like this, and I keep thinking that it is entirely irrelevant. We seem to not be communicating at some very basic level. Look, code is written by people, let's agree. And people who write the code have read and understood the specs which define the programming languages they are writing the code in. OK, all agreed. Now, this all has absolutely nothing to do with what we are talking about. RDF and OWL are not programming languages, and understanding the semantics of the implementation language of an OWL inference engine does not contribute one jot or tittle to saying anything about the meaning of the OWL being processed by that engine. Even a spec which says something about the assumed meaning of something that might appear in an OWL vocabulary and get input to an OWL engine cannot have any relevance to what that engine does, since there is no requirement of the writer of the OWL engine to have read every spec of every information source of any vocabulary that the engine might ever have to deal with.

>Actually, this applies to OWL inference engines.
>They can be considered, from the web point of view, just RDF systems which have been programmed to have an inherent knowledge of the meaning of certain terms. This allows them to do certain things on my behalf.

Yes, of course. I presume by certain terms you mean those in the OWL namespace. But that is beside the point in this discussion. The question is not what the OWL:... URIrefs mean: we have that part reasonably well locked down. It's what all the OTHER URIrefs in an OWL document mean, the URIrefs in the subject position of the content triples, all those facts about whatever that are being OWL-manipulated. Neither the OWL programmer nor the OWL inference engine is in a position to be able to follow those links and read and understand what it finds there. The programmer can't see those links, since he or she is long gone when the stuff is actually running, and the OWL engine can't read English.

>>>which restricts
>>
>>restricts how??
>>
>>>the association of the URI to one thing (or one thing within a given shared context). I am not sure what level of answer you are looking for here.
>>
>>Well, even at the rough and ready level, it would be good to have some general guidelines which say how to use URIrefs to refer to things. Seems to me there aren't any right now (except for URNs). If a URIref is a URL then I can use it to locate a web page, or a document if you like. Now, how do I use that document to locate the referent? Are there any rules or guidelines about that, of any kind?
>
>Well, my Stack article outlines them.

Pointer?

>They are in fact defined by a rather large pile of specs.

No, they CAN'T POSSIBLY be determined that way. Think about it: suppose you had to explain to a Martian how English personal names worked. You would have to talk about the *processes by which things get named*: baptism, christening, registration of births, etc., legal conventions regarding name uses, stuff like that.
Giving a spec of what a name 'is' wouldn't do the job. [Later: see below for qualifying comment.]

>RDF's job is only to be a relatively simple link in the chain.
>
>They haven't been written up formally,

I haven't seen them written up in any way at all, formally or otherwise. I've read phrases like 'URIs are universal identifiers' many times, but that's just froth: that doesn't tell me HOW things get baptised with URIrefs.

>but in the special case of an RDF document, the RDF spec should tell you what an RDF document tells you about the referent.

What it doesn't tell you, and what no MT or language spec alone can possibly tell you, is how to determine what the referent IS. So why is what it says relevant, if you have no idea what it is saying it about?

[Later: after reading your piece on HTTP URIs, that provides a nice illustration. If we say that these URIs are names for the documents you get by using the HTTP protocol on them, then this would be a very good 'extra-logical' way of saying what one class of names means and how to attach a name to a certain kind of thing. And I guess one could say that this is rooted in a pile of specs, but they are specs about how to actually USE these names to get at something. They are all about transfer protocols. Now, you can't transfer a car or a galaxy by using a protocol; so what kind of specs are going to provide ways of naming solid, non-informational things like that?]

>>>Statements which restrict interpretations such that within the domain of discourse, for any interpretation, any things identified by the URI are equivalent?
>>
>>Well, that assumes that this is somehow done within a model theory, which would be nice, but has its limitations.
>>There are some general results limiting the extent to which MTs can possibly restrict or impose referents: e.g. the Herbrand results show that any consistent set of statements CAN be interpreted so that the names all refer to themselves, so one can make an interpretation entirely out of symbols. For reasons like this, one usually expects that a theory of reference - of naming - requires something additional and external to the MT in order to 'ground' names in the actual world.
>
>Well, the MT, while central for you, is in fact a rather peripheral (though indeed useful) part of all this.

I don't think I'm being merely turf-defending if I beg to differ. Formal semantics aren't just a formal aside: they represent about 60 years of intensive investigation into the basic semantics of languages. Not just 'formal' languages, but ALL languages, ranging now as far as things like diagrammatic notations, cartographic conventions, programming languages, natural languages and all the notations of logic and mathematics (which includes all of science and engineering). They are about as central as you can get. Without a precise semantic theory you don't have squat, you are just making noises.

Once you start specifying formalisms intended to express propositions, an MT (or some kind of formal semantic specification) becomes absolutely central, because it doesn't just tell you what is represented: it also tells you what is NOT represented. If you don't want it to do that, you really shouldn't have let the WGs publish an MT at all. You seem to want it both ways: a nice consistent web of interacting software, but also a nice blurry Humanist sense of open-ended Meaning. Until we can make human-level AI engines, you have to choose one or the other.

>The RDF spec should tell me to go look in the spec of the predicate to find out what p owl:inverse q means.

It shouldn't do that for several reasons. First, most predicates will not have a spec.
Second, *your* ability to read the spec, even if it exists, is irrelevant if you are not around when the RDF is being processed, and I presume the whole point of the SWeb is that you will not be around, but that software will be doing it for you. And third, even if you are around, and there actually is a spec for you to read, what relevance or utility does that have to someone who is writing an RDF processor? The RDF engine isn't going to even know what *you* think about what the spec says.

>I go and look up the owl:inverse, get the OWL spec.

Actually you don't, see above. Not that it matters.

>It is a bunch of English and RDF which refer to each other, and the English bits refer to the MT. A programmer can then write code

But that is beside the point. Look, the whole idea here is to write RDF-handling code, right? I am talking about what that code is supposed to do. That's what the RDF spec ought to do, to provide a guide for writers of that software. If I hear what you are telling me, the software ought to follow the subject links and read the specs it finds there. Since this is written in English, the poor dumb software presumably has to ask some human to read it, and wait for some input to decide what to do. This is bad enough, but now you are telling me that the RDF-handling code has to wait for those humans to write some other code? How fast is this whole thing supposed to run, for God's sake?

>which will output
>a p b given b q a
>because he understood the spec with the help of the model theory.

No, the job of the RDF programmer is to read the model theory and the rest of the spec, write a reasoning engine which performs valid RDF inferences (let us suppose) according to the MT, then to go away and let *that code that he has written* alone to process RDF at electronic speeds.
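The "a p b given b q a" rule discussed above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not any real engine's implementation: triples are plain tuples of strings, the function name `apply_inverse_rule` and the `ex:` URIrefs are hypothetical, and the property used is `owl:inverseOf` from the published OWL vocabulary (the thread writes it informally as owl:inverse). The point it illustrates is that the code depends only on the one fixed term whose semantics the programmer read about; every other URIref is handled as an opaque string.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# The only URIref whose meaning is "built in" to this sketch.
INVERSE_OF = "http://www.w3.org/2002/07/owl#inverseOf"


def apply_inverse_rule(triples: Iterable[Triple]) -> Set[Triple]:
    """Return the input graph plus every triple licensed by inverse declarations.

    For each declaration (p, owl:inverseOf, q) and each triple (b, q, a),
    the triple (a, p, b) is added, and symmetrically for p and q swapped.
    All other URIrefs are treated as opaque strings.
    """
    graph = set(triples)

    # Collect declared inverse pairs in both directions.
    pairs = set()
    for s, p, o in graph:
        if p == INVERSE_OF:
            pairs.add((s, o))
            pairs.add((o, s))

    inferred = set(graph)
    for s, p, o in graph:
        for prop, inverse in pairs:
            if p == prop:
                inferred.add((o, inverse, s))
    return inferred


if __name__ == "__main__":
    # Hypothetical vocabulary: the engine has no idea what ex:p or ex:q "mean".
    graph = {
        ("ex:p", INVERSE_OF, "ex:q"),
        ("ex:b", "ex:q", "ex:a"),
    }
    for triple in sorted(apply_inverse_rule(graph)):
        print(triple)
```

Nothing in the function consults a document behind `ex:p`, `ex:q`, `ex:a` or `ex:b`; whatever those names refer to, the inference goes through the same way.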
God alone knows what the RDF it processes is going to be about or what the URIrefs in it refer to, but whatever it is about, he - the writer of the code of the RDF engine - has no way to read any of *that*. When the URIrefs in some RDF arrive down the optic fiber and get processed by the RDF engine, the programmer is OUT OF THE PICTURE. The RDF engine is alone with the RDF. Now, who is going to read those property URI specs, again?

>That's the state of the art. When we can pick up (in the RDF bit) axioms for the new terms, then the state of the art will advance a bit, and some of the functionality of the OWL terms will be directly loadable by a more general reasoner.
>
>>>>Neither of these are easy questions to answer and neither of them has an answer in the current spec.
>>>
>>>No, that's good, because the question of what is identified by a URI is dealt with in URI spec and associated specs.
>>
>>Not in any I've read.
>
>Clearly not to your liking, and not really to mine, or we wouldn't have a bunch of TAG issues on the subject. But the URI specs are the documents which define that, in as much (a) you can understand what they mean and (b) your philosophy allows for specifications at all of anything.

I have read them as English, with an open mind, honestly trying to get something useful out of them, several times. I haven't set out to treat them like a philosophy essay, or to critique them. But they just don't work: they don't say what needs to be said, they contradict themselves (in the only way I can possibly understand the words in them) and they don't specify their meanings adequately. I *still*, after several years' immersion in W3C business, have absolutely no idea what anyone means by "resource". I strongly suspect that everyone means something different.
>>>The only question that RDF has to answer (not as part of itself, but as part of a duty delegated from the URI spec) is to show how, when the URI is an identifier within an RDF document (a la foo.rdf#bar), RDF allows the set of things which might be identified by a URI to be restricted by RDF statements about that thing, or as we say in English, how RDF documents can describe things.
>>
>>OK, that is fine with me. But that way of saying things treats URIs as logical constants (i.e. things that denote whatever the logical constraints force them to denote) rather than names (i.e. things that *have* a denotation to which they just *do* refer, attached to them by some extra-logical means, like "Patrick John Hayes" refers to me because it says so on my birth certificate).
>
>Whether something is a constant or a name then depends on whether the identification is "extra-logical", which depends on the boundary you throw around some stuff you call your logic.

That makes it sound like an empty terminological issue, but that gets it backward. It's extra-logical because it deals with issues that logic isn't designed to handle. I cast logic as wide as it can go, but I know that it has its limits. Specifying referents is beyond the competence of a formal logic. Formal logics were designed to analyze some aspects of human language, essentially, and they have been very successful at that analysis. But language has other aspects, and reference and naming is just elsewhere in the intellectual landscape. (This isn't an original thought, by the way.)

>For the OWL MT by itself they are just names, while for the whole semantic web they would be more like constants?

More like the other way round. There is more to a 'proper' name than its logical role as a constant. You can treat it as a logical constant in the logic, and that is fine: but that fails to capture something essential about it.
As far as the logic is concerned, for example, you can interpret any constant as denoting itself. But a proper name really CAN'T be interpreted that way: it comes with a fixed referent. "Patrick John Hayes" really is *not* the name of a character string.

>>The thing is, I'm sure that most actual uses of URIrefs are more like names than like logical constants, in fact; but we don't have any rules for specifying how these names get their referents attached to them. (For example, does a URL *denote* the web page you get by using the http protocol on the URL? Some people assume it does, others make assumptions which are incompatible with that, e.g. Euler. There are no rules for this in the URI documents, which aren't worded with enough precision to even make the relevant distinctions.)
>
>Voila. Hence the TAG issue. My own view is clear.
>http://www.w3.org/Designissues/HTTP-URI

I tend to agree with you about HTTP URIs. It makes sense to assume that the denotation of an HTTP URI is the document-thingie you get when you use the HTTP protocols on it. I would be quite happy if the RDF MT were to mandate this, and then indeed we would have at least one class of genuine names. (We would need to get down to a bit more detail about individuation criteria for those thingies, so that OWL could use cardinality reasoning on them, but that would be fun.) But we still need a way to give names to things that aren't documents, and most interesting things aren't.

>>In normal human society there are all kinds of such rules for attaching referents to different kinds of names (baptism, registration of a birth, marriage, ship naming ceremonies, whatever) and associated knowledge about how to recognize a proper name and how to access its referent, if you really want to get at it.
>
>Indeed. It is simpler on the web. It may be the first time one has been able to formalize the process.

Maybe, indeed. Interesting, and exciting!
But I'd like to get into the details a bit more. I want (and I think we will very soon, if not already, NEED) a way to allow software to do the actual naming, or at least to be able to be told about it. We need protocols for actually doing Web baptism, or at least for adding new names to things that already have names (which can be done entirely using the names: "I hereby name the thing named "Patrick John Hayes" by the new name "Arthur-26"." Or maybe "The name "mydatatype:octal" names the rdf:Datatype which you can retrieve using this datatype API:....").

BTW, this issue has already come up in the RDF spec, concerning datatypes. How does one baptise a new datatype, so that an RDF engine can be told about it? You 'give' it a URIref... but *HOW*?

>>>>>If my:car :color :blue means that my car is colored blue, that is what it means, quite independent of context. The concept of something having a given color is defined (and only defined) by the definition of color
>>>>
>>>>Bad example, as color terms don't have definitions.
>>>
>>>They do. Caesium red is the spectrum of an excited Caesium atom. Some are vague -- red is a color which has predominantly longer wavelengths.
>>
>>OK, true. But there is a sense of 'red' which can only be accessed by people who have color vision. Philosophy rears its ugly head.... OK, leave aside philosophy (that's one problem with using words like 'meaning') and we agree, it's fine to leave the definition of 'red' to, say, Pantone. The issue for us here is what it means to say that Pantone 'defines' the 'meaning' of pantone:red35, what RDF(S/OWL) needs to know about that kind of definition, and whether the RDF spec needs to say anything about that.
>
>The RDF spec only has to hand off authority to the pantone spec to define the actual relation identified by pantone:color.

There's a step missing. OK, Pantone has the authority to *define* the name, say, pantone:red35.
But something needs to be able to tell RDF that this *is* a name, and not just an arbitrary URIref. And ideally, RDF should have as part of its spec that when given a name, an RDF interpretation is required to interpret that name as denoting the thing it names (according to the naming authority). That is in fact critical to the intended use of RDF, as this kind of example shows, but we can't say it at present in the RDF spec, because there isn't any global notion of what a name IS. It's easy to say (using some of that mathematical stuff that you disdain) but to do so requires one to refer to the name of a thing. And right now, there is no general way to refer to the Web name of a thing.

Here's another way to say it. I want to make that notion of 'authority' that you used rigorous enough so that I can write model theory equations about it, not just say things about it vaguely. I don't think it is *conceptually* hard to do this, but it requires something to be done at a very global level of organization for the Web. It's not just an RDF or an OWL issue, it's a whole-Web issue.

>>>>>and my:car only serves to identify the car
>>>>
>>>>How does a uriref identify a car? (Genuine question, not rhetorical :)
>>>
>>>Notionally, the URIref identifies the car so long as everyone who uses it does so consistently with it identifying the car.
>>
>>But on the SW (for the first time?) 'everyone' has to be understood as including software agents, and so this begs the question, since we are left trying to figure out how THEY can be said to 'identify' a car when given a URIref. I think the best we can do here is to say that they can't actually do the identifying, but at least they can be required to pass information around in a way that doesn't screw up anyone else's identifications. Then allowing the softbots into the social fabric doesn't add anything really new in the way of reference, but at least it causes no actual harm.
>
>Yes, we're not looking to a bot (in general) for an inherent understanding of redness.

But we ought to be ready for when they DO start to have one. Or at least when they start to contribute to that understanding in new ways, e.g. in the case of red, by having color-meters more sensitive than the human fovea linked to software agents which are making decisions about paint purchases, things like that.

>(In fact we can build software which will tell you whether a picture of something is red. But we don't expect an OWL engine for example to be able to figure things out.

I bet the time won't be far off when people expect to be able to have a reasoner figure out things which involve knowing that a color is the kind of property which can be handled by a certain kind of color-checking agent. Or that a car is the kind of thing that can be identified by checking its chassis number (chassis numbers can be used as names for cars).

>You could actually write a level-breaking rule which told you whether a one-pixel GIF was red by allowing log:contents and some string matching expressions.

Yuk.

>But that is very much a corner case!)
>
>>>Specifically and practically, the semantic web protocol is that a web page in RDF foo.rdf has a description of something of type Car and typically giving a country code and plate number as property values.
>>
>>Right, that kind of answer makes sense. Well, in one reading it does, i.e. if 'description' means RDF/S/OWL/... description. Then the formalism links the URIref to some other URIrefs which have socially recognized status as 'grounded' in real things like cars. That way of thinking is quite compatible with the formal MT for the web logics, and neatly separates the MT-handleable issues from the much scruffier notion of grounding.
>>
>>I think many of our communication problems over this issue come from confusing that reading with the reading in which 'has a description of' means a description in some language (e.g. English) which a softbot cannot hope to do any reasoning with.
>
>We will work in a world in which robots work with a subset of the knowledge, but are better at handling that subset than we are.

Right. And the first thing they will be good at is doing stuff that we can do, in fact, but much faster and larger-scale than we could do it, and without getting bored out of their tiny skulls.

Pat
--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ai.uwf.edu       http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Thursday, 13 March 2003 18:16:07 UTC