- From: Tim Berners-Lee <timbl@w3.org>
- Date: Thu, 17 Jul 2003 23:34:55 -0400
- To: pat hayes <phayes@ihmc.us>
- Cc: www-tag@w3.org
- Message-Id: <CCD886F9-B8D0-11D7-B920-000393914268@w3.org>
>> 1. "each URI identify one thing ("Resource": concept, etc)."

> Exactly what is meant by "identify" here is not exactly clear, but if this means something close to what it usually means then it is simply untenable to claim that all names identify one thing.

> I am making the claim only for RDF statements in a global context, in for example an email sent between two people who don't know each other but both have access to the web.

> So am I: and I insist that this stipulation of identifying one thing isn't sensible or even desirable. Well, at least, unless that word "identify" means something different from "refer to" or "name" or "denote".

I think "denote" probably matches. I will try to use "denote".

> What might indeed be true is that in many circumstances, a URI somehow provides access to information which is sufficient to enable someone or something to uniquely identify a particular thing (that the representation accessed via that URI is in some sense about), but even there the thing identified might vary between contexts (such as when we use someone's email address to refer to the person) without harm.

This depends on what you mean by "contexts". If you mean that I can send one person an email saying (in RDF)

    <http://example.com/foo.rdf#bar> pantone:color "blue426" .

and it can mean one thing, and when I send it to another person it can mean something else, then we do not have a system of communication which has any properties at all.

> This kind of ambiguity resolved by context is at the very basis of human communication: it works in human life,

Yes, with natural language and poetry

> it works on the Web,

Yes, when the genre is natural language and poetry, not mathematics,

> it will work on the semantic Web.

No. We are defining the semantic web NOT to work like natural language, but to work like mathematics. And it does not work in math. Suppose I give you two facts, that x=1 and that x=0.
Not a problem, if one can assume that x denotes something different in the two cases. But then it is very hard to build any logic at all.

> Why do you want to try to legislate it out of existence?

Any system of mathematics has to be able to use symbols to denote things in the universe of discourse. You as a philosopher can perhaps handle a mathematics in which symbols denote whatever anyone likes at any point, but I as an engineer find it less useful.

> You will not be able to, any more than you will be able to stop people falling in love.

Ah, but people have stopped falling in love. Look up ... one by one the stars are going out. ;-) But seriously...

> All that your 'ideal design' will accomplish is to make the architectural pronouncements of the W3C more and more out of line with the way that the Web is actually being used by real people.

People are not using the semantic web now. There is not very much global math on the web. People use document identifiers as though they (in some sense) will from week to week denote (in some sense) the same thing, so people are very used to having a global space of identifiers.

> Take your example of person A emailing person B, who A does not know. What is actually going on here, described precisely, is surely that A knows that 'B@Bsplace.org' is a character string which when used in a certain way will (by some occult technical means about which A need know very little) act as an address, so that email sent to that address will arrive in the inbox of, and likely be read by, someone called 'B'. I phrase it thus because it might be potentially misleading to just say 'read by B' since that could be understood as saying that A knows the referent of that name 'B'; but we are assuming that A doesn't. So what A knows is in fact an existential: that a person called 'B' *exists* who will get the email.
> Since A knows that the email exists and is unique - A has direct acquaintance with the email, having written it - this is enough for A to know that there is a single person out there who will get the email. But it is still misleading to say that the email address "identifies" B: if that really were true, then A could find out who B was just by looking at the email address.

This is rather tangential. My example was of people emailing each other, and the content of the email having the same semantics to A as to B. You discuss who is denoted by "B@Bsplace.org", an email address. The email address (or we could talk of the related URI, "mailto:B@Bsplace.org") in the semantic web denotes, formally, something often referred to as an "RFC822 mailbox", which is a conceptual thing to or from which mails may be sent, among other uses. There is a relationship, one of whose URIs is http://www.w3.org/2000/10/swap/pim/contact#mailbox, which relates a social entity (for example, a person) to one of these mailbox things (as a mailto: resource). People often use the approximation that contact:mailbox is inverse functional, allowing them to determine that two people are the same person because they have the same contact:mailbox, but that does not mean that, in the formal system we are building to represent all this, "mailto:B@Bsplace.org" denotes the person.

> And I am describing, if you like, a perfect platonic design, to which we can aspire, though social and engineering factors limit our ability to implement it perfectly.

> Allowing - no, admitting the existence of - referential ambiguity is not an imperfection: it is a basic property of communications of belief using language, one that is recognized and even described quite well (to a first approximation) by the model theory that you dismiss.
I do not dismiss model theory, I just pointed out earlier that your questioning of the use of "identifies" rather than "denotes" was asking me to use MT terms rather than other English terms. Now the model theory I have seen only describes the semantics of the OWL terms, in explaining how the statements that Fido is a dog and that dog is a subclass of animal constrain the possible interpretations. And this is done so as to work on any valid interpretation. I have not seen (but I may have missed) the bit where, when the English in a schema describes what the individual ex:fido is, interpretations are further constrained to those in which "ex:fido" actually denotes the actual dog we all know and love as Fido.

>> Like with all technical specs, the fact of imperfect adherence in some cases does not detract from the importance of having made the perfect idealistic design which has provable properties. One deals with deviations from the perfect in a form of perturbation theory.

> We seem to be at cross purposes. Im not saying that the 'unique identification' condition is an unattainable ideal: Im saying that it doesn't make sense, that it isn't true, and that it could not possibly be true. Im saying that it is *crazy*.

Well, you have used "silly" and "crazy", but in the context of your statement they clearly denote the characteristics of being well thought out and essential to the architecture of the semantic web, respectively.

> Existing W3C standards already provide counterexamples: what single thing is identified by the URI reference http://www.w3.org/2000/01/rdf-schema#Class? This is supposed to *denote* the class of all RDFS classes; but that is not a single well-defined notion, by the very nature of formal semantics: it varies from interpretation to interpretation.

An interpretation is a mapping from names to things.
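As a rough illustration only (not drawn from any spec), an interpretation can be pictured as a plain mapping from names to things, and two interpretations can then be compared name by name. Everything in this sketch is an assumption made for the example: the `disagreements` helper, the made-up `EX_FIDO` URI, and the strings standing in for "things".

```python
# Hypothetical sketch: an interpretation modelled as a dict from names (URIs)
# to things. The "things" are just descriptive labels for this example.
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
EX_FIDO = "http://example.com/pets#fido"  # made-up URI for illustration

interpretation_a = {
    RDFS_CLASS: "class of all RDFS classes (reading A)",
    EX_FIDO: "the dog Fido",
}
interpretation_b = {
    RDFS_CLASS: "class of all RDFS classes (reading B)",
    EX_FIDO: "the dog Fido",
}


def disagreements(i1, i2):
    """Return the names both interpretations map, but to different things."""
    shared_names = i1.keys() & i2.keys()
    return sorted(name for name in shared_names if i1[name] != i2[name])


# Names on which the two interpretations demonstrably differ.
conflicts = disagreements(interpretation_a, interpretation_b)
print(conflicts)
```

Under these assumptions, each name returned by `disagreements` is a name for which someone would have to ask which thing is meant.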
What I am saying is that, if there are two interpretations, and the things denoted by that URI in those two interpretations are demonstrably different, then it is reasonable to go back and ask the owner of the URI which one is denoted. The authority may decline to reply of course, but if it thinks and thinks and comes back with an answer, then that answer is added to the common information which we share, and one of the interpretations has to be dropped.

> And there is the problem that MT systems consider all possible interpretations of the data, in any possible worlds.

> That is not a PROBLEM; it is how semantics works. When you communicate something to me, you send me some language (or more generally some representations). I have to try to interpret this language and make of it what I can. But you cannot POSSIBLY send me a single interpretation: interpretations are not the kind of thing that can get communicated. Only language gets communicated.

Indeed, to communicate what something denotes one would need magic. Like telling a robot - you want to know what "hot" is? This is hot. And stimulating its temperature sensor. Communication doesn't allow any terms which everyone understands. Everything communicated is only a message, and the receiver can only sense the message and never know what it means. Nothing has fundamental meaning; a message will just have certain effects on certain agents, and agents will change their internal stored state as a result of them.

> So yes, OF COURSE there are many possible interpretations of what you say, even when I have used all my resources of interpretation. This isn't a problem of the theory, it is a FACT ABOUT COMMUNICATION which the theory recognizes and tries - in admittedly a crude way, but we have to start somewhere - to deal with and come to terms with.

This theory of communication does indeed address how communication works.
However, different theories are used at different scales, and at different stages in the analysis.

[When we analyze how an electron behaves, we use quantum mechanics. We discover that the position and momentum of the electron cannot be known at the same time. This is just a fact about matter, which the theory recognizes and tries - in admittedly a crude way, but we have to start somewhere - to deal with and come to terms with. .... We realize that application of that theory in great detail will allow us to make a wave equation for an apple, and we figure out that (though it is too complicated a job to do in practice) in any reasonable approximation, when considering 10^23 particles, the result is that an apple has, to all intents and purposes, a given position and momentum at any time. It isn't that quantum mechanics doesn't hold for apples. It just isn't worth doing it when using real apples. As we take a bite, the theorist jumps up and down warning that it could jump sideways at any moment. The engineer takes the bite.]

So let it be for the semantic web. Many agents have communicated at great length over what the URI daml:TransitiveProperty denotes. During this process, the people involved considered many interpretations. Not a ridiculous number, as few in the working group considered interpretations in which daml:TransitiveProperty denoted the dog we all love and know as Fido. But the process which you describe in capitals above took place. Drafts were written. Textbooks written ages ago and read by many were quoted. By the end of the process, after axiomatic semantics had been written up and reviewed, and a model theory had been written (in English), people went away and wrote programs which treated daml:TransitiveProperty in a particular way. People found that when one program generated a statement about something being a transitive property, the other program did good things.
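To make "treated daml:TransitiveProperty in a particular way" concrete, here is a minimal sketch of what such a program might do. It is not taken from any of the programs mentioned; the `infer_transitive` function, the string-tuple representation of triples, and the `ex:` property names are all assumptions made for illustration.

```python
# Hypothetical sketch: triples are (subject, predicate, object) string tuples.
RDF_TYPE = "rdf:type"
TRANSITIVE = "daml:TransitiveProperty"


def infer_transitive(triples):
    """Add every triple implied by properties declared transitive."""
    facts = set(triples)
    # A property is treated as transitive only if the data says so.
    transitive_props = {s for (s, p, o) in facts if p == RDF_TYPE and o == TRANSITIVE}
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for (s, p, o) in list(facts):
            if p not in transitive_props:
                continue
            for (s2, p2, o2) in list(facts):
                # (s p o) and (o p o2) together imply (s p o2)
                if p2 == p and s2 == o and (s, p, o2) not in facts:
                    facts.add((s, p, o2))
                    changed = True
    return facts


data = {
    ("ex:ancestorOf", RDF_TYPE, TRANSITIVE),  # statement written by one program
    ("ex:a", "ex:ancestorOf", "ex:b"),
    ("ex:b", "ex:ancestorOf", "ex:c"),
    ("ex:a", "ex:knows", "ex:b"),             # not declared transitive
    ("ex:b", "ex:knows", "ex:c"),
}
result = infer_transitive(data)
```

In this sketch a second program reading the same data derives the `ex:a ex:ancestorOf ex:c` triple but nothing new for `ex:knows`, which is the kind of agreement in behaviour the paragraph above describes.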
Now, no one can say that the people writing those two programs had the same interpretation of the spec, and really in theory shared a common thing as that denoted by the URI. But for that and several other URIs, the proof was in the eating. The programs worked. And will work, for lots of other people in the future. It is as though that bit of magic has happened.

When you and I write an ontology for marsupials, we don't worry about differences in what we mean by "subclass". We only worry about what we mean by "duck-billed platypus". When we have finished our ontology of marsupials, and a thousand experts have pored over it and written commentaries on it, then millions of school kids will happily refer to the class of marsupials using our URI. The arguments will have been done. From the standpoint of the school kid, the class is a well-defined concept, where linguistic processes have long since tended to an asymptote, and any misunderstandings can be dealt with.

> If you take the case of an identifier for pat hayes <phayes@ihmc.us>, for example, the non-logician would consider that it identified one person and get on with their lives

> The logician can say that also: it is the assertion that a single person exists who has that name. But (1) that is not the same as saying that the name - all by itself - "identifies" a single person (or, well, maybe it is: but if so, then other things said about URIs and resources are wrong) and (2) in fact, they don't assume that and get on with their lives. Sometimes they assume it indicates a person, sometimes a mailbox, sometimes a computer: it depends on the context.

With strings, yes, not with URIs. There are two reasons you are being confused.

1) Sloppiness. Human beings refer to things through the values of properties all the time ("ask fancy pants what he asked 411 for"), and figure out what people mean, in English but not in math.
2) Confusion with times when the design is to specifically and unambiguously use a name of one thing to indirectly point to something else. You give an email address of a person who is going to attend a conference. The mailbox isn't going to attend the conference, and everyone knows that there is an unambiguous traversal of a "contact:mailbox" arc involved. People are confused because a namespace (whatever that is) is indicated by giving the URI of a (maybe notional) namespace document which corresponds to that namespace. As the namespace is kinda abstract, and only the namespace document can be measured, this doesn't really matter.

> Which is fine, let me quickly add, provided some bull-in-the-china-shop authority doesnt keep insisting that all URIs must by fiat always identify a single resource. Then we get interminable arguments and discussions about what 'the' resource is in this very case, and the people who are insisting on this doctrine so firmly tend to be the ones who get exasperated earliest and tell us that it doesnt really matter what the "resource" actually *is*; apparently missing the irony of the fact that the only reason we are having this argument is because of this insane ruling that they are so insistent upon not budging from. Grrrr.

Grr indeed! To what extent must we settle what the resource is? That is probably the question which divides our positions. I would say that when we have more than one candidate and these candidates are incompatible, so that ambiguity would lead to inconsistency, then we must settle it.

Example 1. A dog bounds into the room. Tim says, "Here, Fido!" to the dog, and says "Pat, meet my dog, Fido" to Pat. Tim plays with the dog. Tim asks Pat, "Pat, would you please take Fido for a walk?" Pat takes the dog for a walk. The name seems to have been unambiguously associated with the same dog in both their minds.

Example 2. Two dogs bound into the room. Tim says, "Here, Fido!"
to the first dog, and says "Pat, meet my dog, Fido" to Pat. Tim plays with the second dog. Tim asks Pat, "Pat, would you please take Fido for a walk?" Pat has to ask which dog is Fido. The name was not unambiguously associated with the same dog in both their minds. Pat had to fix that before he could continue the conversation.

Example 3. A dog and a cat bound into the room. Tim says, "Here, Fido!" to the dog, and says "Pat, meet my dog, Fido" to Pat. Tim plays with the dog. Tim asks Pat, "Pat, would you please take Fido for a walk?" Pat takes the cat for a walk. "For me, in this context, 'Fido' denotes the cat.", he says as he leaves.

Which scenario is insane? The first two?

[...]

Pat:

>> That is, as we add information about it, that information should not be inconsistent.

> Right. MT helps you there by providing a crisp notion of consistency. It also gives you an important insight: if you know enough to uniquely identify the referent of a name, then *any* further information is either redundant or inconsistent. Basically, this follows from the observation that the only proper subset of a singleton set is the empty set.

> You can think of it denoting different things in different systems, but how are those things "different" apart from the fact they are in different systems?

> Well, how are they the same? That is, what gives us a licence to claim that rdfs:Class and owl:Class, for example, are the same class? (In fact, there is a good reason in this case to say they are not the same.)

They are not the same because owl:Class is a member of rdfs:Class but not of owl:Class, n'est-ce pas? For that reason I would say that it would be broken to use the same URI for the two classes.

> Maybe you have a more direct acquaintance with abstractions like the class of all classes than I do, but I sure wouldn't know how to decide things like this in general, and I *know* that no computable decision procedure could decide it for me.
But no one asked for a computable decision procedure. The corners of math can throw up lots of tricky things which make people nervous, but fortunately the bulk of semantic web traffic will be in terms of things like date, totalamountinusdollars, financial institution identifier, etc.

>> We say every owl:class is an rdfs:Class. That allows us to deduce things about some classes. Suppose we make other assertions about rdfs:classes, is it allowable for us to be able to make a contradiction? I would say not. Currently, different logical systems can deduce different things, but the important point is that they are talking about the same thing when they use the same URI.

> You need to be careful what you mean by 'same thing'. Sure, if reasoner A uses 'rdfs:Class' and reasoner B uses the same URI, then they ought to both be using the name in the same way, so that they can communicate. . . . . . . [12]

Yes indeed. I guess that is what I wanted you to say all along. For all B.

> Nobody is disagreeing with that. But that is not the same as saying that there must be a single thing that this URI is naming. Analogy: if we hold hands then we are walking the same way. But that does not mean there is only one way we can possibly walk. I think you mean the former, but you are saying the latter.

So you accept that everyone must treat the identifier in the same way, that perhaps we could say that for two people it must identify the same thing, but not that there is one thing which it identifies? Can we not show that the two conditions are the same? Suppose there was not a single thing denoted by the URI. Then there must be two distinct things denoted by the URI. Those things, to be distinct, must be such that if A uses one and B another as the referent of the URI in a message between them, A and B behave inappropriately and so the system is broken. We have honed this distinction down to a fraction of a hair's width now.

> [...]
> Perhaps 'identify' doesn't mean 'denote' or 'refer to'. What does it mean, then? Note that if we were to say that 'identify' means MORE than simply 'denote' or 'refer to' - if, say, it also has a connotation that the URI can be somehow used to retrieve some information about the referent - then the claim would become even more false.

> When one retrieves a document, one gets information which its publisher says, and one can believe or not. But using a term does (modulo social things such as fraud and engineering things such as broken cables) commit you to the term owner's definition of it, and the document they publish at its URI is taken by design to be information deemed shared by those using the term. That's the contract.

> Im happy with that contract, though with a slight hair-tingle at the use of the word 'definition'. But nothing in there says that URIs must uniquely identify resources: in fact, you didn't even use the words "resource" or "identify", which I am very happy to see were also missing from Tim Bray's down-to-the-wire summary of the essential core of things.

> [....]

>>> First, OWL is more than an RDF vocabulary: it is an RDF vocabulary with a particular semantics applied to it.

>> Like every RDF vocabulary. What is interesting about OWL is that for some of the vocabulary the properties of the Properties can be defined in math. But basically OWL isn't any different from the calendar event vocabulary. The only reason that an RDF calendar event has meaning is the semantics of that vocabulary.

> As you know, I disagree profoundly with you on this issue. The semantics of a calendar event described in RDF is given by the RDF vocabulary. It is axiomatized in RDF. You can write as much as you like about it and what you think it ought to mean: all that is merely commentary and does not change the meaning *of the RDF* one iota. That follows from the RDF specs themselves.
I don't follow. Imagining schemas and specs where appropriate, what does

    [] rdf:type cal:Event;
       cal:dtstart "2003-07-31T12:00:00Z";
       cal:end "2003-07-31T13:00:00Z";
       cal:participant [ contact:mailbox <phayes@ihmc.us> ].

mean, and why?

>> (For you, this may seem perturbing or to say that the logic itself, the thing you tend to define first, is actually only defined in the data language. But it works.

> No, it doesn't, which is why I insist on my point.

Does too.

__________________________________________

At this point I am horrified to find myself only a fraction of the way through the email conversation, so I will hit send and keep the rest for another day.

Tim.
Received on Thursday, 17 July 2003 23:34:55 UTC