- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Tue, 4 Mar 2003 06:57:32 -0800
- To: "ext pat hayes" <phayes@ai.uwf.edu>
- Cc: w3c-rdfcore-wg@w3.org
Pat, Person names are not globally unique. That's why we have identifiers such as social security numbers,so that we have globally and temporally unique names. URIs are intended to be globally and temporally unique names. If the RDF MT does not reflect that, then it is IMO fatally flawed. Let's discuss this in Boston... (hopefully you're attending the plenary) Patrick _____________Original message ____________ Subject: RE: designating datatypes Sender: ext pat hayes <phayes@ai.uwf.edu> Date: Tue, 4 Mar 2003 06:50:05 -0800 Patrick, your recent emails on this topic have made me think we have a major disconnect, so rather than respond point-by-point, I'm addressing the central issue first. I gather that you feel that RDF treats URirefs as names, in the sense that each URIref is the name of some thing, and that different interpretations, although perhaps they differ in what facts are true about those things, treat the names themselves are 'rigid', ie a given name denotes the same thing in every interpretation. (The term 'rigid' for this idea is common, so I will use it from now on, OK?) This is quite wrong. There is no basic assumption of rigidity in RDF, and indeed there isn't in most assertional languages. It is sometimes claimed that natural language names (certainly *proper* names, such as "Patrick Hayes" and "Delhi") are, or should be, rigid; but even this assumption raises a host of problems when one considers how language is actually used. Consider what it that you learn when you hear something new in natural language. I will tell you some facts about a guy I know. Imagine you get them one at a time, and ask yourself at what point you know who this guy actually is. 1. His name is 'Joe'. 2. He is a professor at University of Illinois. 3. His basement workshop is a joy to behold. 4. He is a neuro guy in the psychology department. 5. He does experiments with cats which involve doing delicate surgery on their eyeballs using loops of fine gold wire. 7. His cats love him but he keeps them permanently hungry 8. His surname is unusual and rare in the US phone directory 9. His surname is 'Malpeli'. Probably, 1+2+9 would be enough for you to locate him using Google. With a bit more work you might be able to do it just using 1+2+4+8, or even just the first three, depending on how good the department's webpage is. But if you just had, say, 1 through 7, the chances are that although you would know quite a lot about his guy, what you know could be accurately paraphrased as an existential: you know some guy *exists* whose name is "Joe", etc.. But you aren't able to access the actual guy himself. What this means is that *there are some interpretations of what you know* in which the description *doesn't* pick out my friend Joe, but in which the name "Joe" denotes someone (or even something) else. That is the model-theory way of saying that something isn't known, or that some information isn't available: there exist interpretations in which it is false. In other words: it could be false, for all you know. Its being false is consistent with what you know. Consider what it would mean to say that some name, say "Joe Smith" referred to the same thing in all interpretations. It would mean that it would be *logically impossible* for this name to be misunderstood: anyone (or any RDF reasoning engine) should be able to figure out, by pure logical thought from first principles, who it was that "Joe Malpeli" referred to, just from the name alone. All names would be tautologies. This is like the old magical idea of things having a 'true name' which, when known, gives the knower magical powers over the denotee. This is a real problem in Earthsea, but here in reality things don't work like that. Now, there is a case to be made out for 'proper names', which of course are primarily used to identify people and things. When you read my first sentence, you probably figured out immediately that "Joe" wasn't some arbitrary string, but was part of a proper name; and once you had the whole name, you know how to go about finding the referent of it, if you really need to find him. Unless you actually go to Illinois, or give Joe a phone call, you will still only really find out more *about* him, in fact, even if you have his website and his SS number and so on. But proper names are special because they are 'grounded', ie sufficiently uniquely attached to their intended referents (by social conventions, etc.) that they provide a quick and reliable way to identify referents. SS numbers are even better (in the USA) and phone numbers and email addresses and IPAs and all the other machinery provide many other examples of 'grounded' symbols which we rely on. But notice that this grounding relies on something external to the model theory: it has to, in fact. Even if you knew that "Joe" was a proper name, that wouldn't in itself allow you to draw any more valid inferences about Joe. The reason why we have to do something special for datatypes is that we are *assuming* that the datatypes are provided by some external source, but also assuming that *we* will use a URIref to refer to the particular datatype. Now, who gets to decide which datatype a given URIref refers to? Obviously, the external source, right? So in this case we are using the uriref as a proper name, in effect; but URIrefs aren't proper names; so all I want to do is to say explicitly that in this case, in datatyped interpretations, we *will* treat these URIrefs as rigid in this sense: we will insist that the provided name of the datatype will continue to be a name of that datatype when its used in RDF. We have been making this assumption all along, but implicitly; all I want to do is make it formal and precise. ..... > >I'm presuming that the MT would say > > I(uriref(x)) = x Well, yes, it could; but for a normal interpretation there wouldn't be any point in saying this since it follows from the definition of 'x' in those cases. That is, we start with the urirefs themselves and we interpret them using I, so the 'x' in your equation *is* I(<someuriref>), and then uriref(I(<someuriref>)) has to be <someuriref>, so the condition is vacuous. But datatypes are different precisely because they are defined outside RDF, by some external authority and that external authority gets to give them a name. (If they didn't, how could either of us refer to the datatype?) So how do we say, as part of our spec, that interpretations must honor the naming done by the external authority? Now this equation ISN'T vacuous: I(name(x)) = x It says that OUR interpretation (I) will agree with YOUR name-choice for the datatype (name(x)); we - RDF - agree to not re-interpret your name for your datatype. > >How is a URIref different from a name? Aren't the names of RDF URIrefs? Yes, they are. .... > >> >Sorry, but I consider the proposed change to violate the very >> >basis of using URIs to *denote* resources >> >> No, really, it doesn't. It just says that whoever invents the >> datatype also gets to say that some URI denotes (refers to) it, and >> we agree to go along with that. > >But that is *already* provided for! That's the way RDF works. Someone >says "the URI x denotes the resource y" and if that someone is the >"owner" of both the URI and the resource, then that URI is an >'official' URI for that resource. Er....no, that isn't how it works. At least, maybe it does in practice, but that isn't sanctioned by anything in the semantics (except good old 'social meaning', of course). Which is exactly why we need to actually say something extra to make it work this way in the case where we want the semantics to 'formally' require this. BTW, this is exactly the issue Ive been talking about as this larger issue that needs to be looked at in a broader context than just RDF. Right now there are no established general-purpose rules for giving things names on the Web. None. There are retrieval protocols (HTTP) and there are URNs, but there is no general account of how a URI reference actually gets to have a referent. >We don't need to posit that URI >as being *part of* the resource to infer that it is an 'official' URI. > >There is no need for this change, and I consider the change to be >a mistake regardless. That's two reasons not to make it. Well, there is a need to refer to the name of a datatype. I'll just assert that as editor. And I really don't think that this is a change. It will not affect anything that any other document says about datatypes, not change any test case or entailment. .... > >Naming a resource is completely orthogonal to its nature. > >There is not, nor ever should be, any requirement for the naming >of a resource by a URI to affect the nature of that resource. It is simply a fact that my name is "Patrick John Hayes". Knowing this about me is often very handy; and if you want to refer to me in text, its vital. Whether or not my 'nature' requires me to 'include' my name as 'part' of me is irrelevant to the fact that the best way to *refer to* me is often to use my name; and, more to the point here, to use my name to refer to something else is likely to cause trouble when communicating. >This seems such a fundamental principle that I'm boggled we are >even discussing this proposed change as reasonable! > >> >How do you know how "en"^^xsd:lang denotes the English language?! >> >> Because, presumably, some standards body has published a list of >> language tags and defined "en" to mean English, and we agree to go >> along with that. > >EXACTLY! > >The mechanisms for associating a given URIref with a particular >resource are *bootstrapping* mechanisms outside the scope of RDF, >or any layer based on RDF (including OWL). Right, but all Im wanting us to do is to say that we will conform to them. RDF does not get to say what name(d) actually is for some datatype d; but its not unreasonable to require, when someone tells us about some new datatype, that they tell us what name to use to refer to it with. If they don't, we will have to invent our own private name; and then how can they tell when we are referring to it? >We might be able to infer some of those mechanisms based on e.g. >examination of individual URIs (violating the opaqueness of those >URIs) but exactly *how* a given URIref is defined as denoting a >given resource is outside the scope of RDF. > >URIrefs are atomic. They just *are*. And they denote what they denote. They only denote *in a particular interpretation*. In a different one they might denote something else. If we want to lock down the denoting, then we need to add a semantic condition to do that, which is what the proposed equation does. >How they denote what they denote is not visible to the RDF layer. > >Come one. This is pretty fundamental stuff, no? Indeed. .... > >The name is the URIref, no? > >I just don't see why the MT needs some other name but the URIref. It doesn't: I mean the URIref. We can call it the 'datatype URIref'. I actually say its a URIref in any case. > >I think your foot has fallen through the floor of the atomic >foundation of RDF ;-) > >URIrefs are the only names you have, and the only names you need. > >Whether a given datatype (or any resource) has an 'official' URIref >to denote it should not affect the MT. If a URIref denotes a given >datatype, then that is the datatype you have. And the URIref denoting >it is the only name you need -- whether or not it is an 'official' >name. > >All D-interpretations dealing with the same datatype will have >the same interpretation. And whether a given datatype has one or >more names, the D-interpretation should be based on the datatype, >not the name denoting the datatype. BUt that doesn't work, which is what Peter noticed. The point being that an interpretation (lust like a datatype, in fact) is a mapping FROM a lexical space - urirefs - to a semantic space. Sometimes we need to refer to this mapping itself. If we can only talk about the semantic space, we can't make the urirefs mean what they are supposed to mean. > >What you seem to be suggesting really worries me. > >If I have ten different URIs denoting some person 'Bob', I would >expect that asking about the ex:father of Bob to have a consistent >interpretation, regardless of which URI I used to denote Bob in >the given query. Right. >And the MT shouldn't care about whether I used >an 'official' URI to denote Bob. All it should care about is that >I denoted Bob But how does it know that? You denoted Bob in all interpretations in which your URI - the one you happen to be using - denotes Bob. The only ways to ensure this are either to say so much, using that uriref, that Bob is the only logically possible denotee (this is usually impossible), or else to rely on some external 'grounding' principles which attach fixed meanings to enough of your vocabulary (say, to SS numbers or webpage addresses) so that you can use these to pin down your intended denotation. All Im saying is that in the case of datatypes, we allow the external authority to give us a pre-grounded uriref, and we will formally require that this uriref will indeed always be interpreted in this way. Then we and them can communicate and both be confident that these urirefs, at least, will have fixed denotations. >and Bob has a certain property ex:father. And of >course, likewise the 'FATHER' property could itself be denoted >by numerous URIs, and again, all that matters is that I am dealing >with a person resource Bob and a property resource FATHER and >interpretations are based on those resources, not the URIrefs >denoting them. > >Right? > >> >You seem to be suggesting that what >> >a given URI might denote can differ between different >> D-interpretations. >> >> Right, it can unless we say it can't. That's what I want to say. > >Well, try to say it without having to embed the URIref as part >of the Datatype. It seems to me that if this is a problem with >D-intepretations, then the same problem must exist for all >interpretations. It does, in a sense, but the D-interpretations are the only cases where we explicitly refer to an external naming authority, so the only cases where this issue arises in the spec. > >> >I thought one key quality of RDF was that the denotation of a given >> >URI never changes. It always means the same thing. >> >> Not at all: it can vary from interpretation to interpretation. > >Eh? What? I hope you're talking about some theoretical mathematical >possibility and not the reality of RDF. Nope, reality. >In RDF, a given URI always denotes the same thing. No, really, it does not. Absolutely, provably, not. For example, any consistent RDF graph can be satisfied by a Herbrand interpretation in which urirefs denote themselves, just like literals. >You could have >different Model Theories that had those URIs denoting different >things, sure, but I'm presuming that the official RDF MT is supposed >to ensure that in a valid RDF interpretation, a given URI always >denotes the same thing. Yes, *in a particular interpretation*. But if I send you an RDF graph, I don't send you an interpretation. You just get the syntax: you gotta interpret it yourself. > > >And if the >> >creators of a datatype give it a certain URI, then that's final. >> >That URI always denotes that specific datatype, whatever >> that datatype >> >is. Period. End of story. >> >> No, which is why we need to say this explicitly. > >Then it needs to be sated explicitly for all resources, not just >Datatypes. Well, but for most resources it would be vacuous, since they don't arrive as 'names' of anything in particular. > >Surely there is some foundational statement in the MT that expresses >the constraint that in RDF interpretations, a given URI always denotes >the same thing. In each interpretation it does. It does not however between interpretations. If you look at a uriref in isolation, RDF has no way to know what it is supposed to refer to. >??? > >If not, then I would consider that a bug in the MT. If its a bug, its a bug in the way the world is put together. Pray for guidance, but don't blame me. Pat -- --------------------------------------------------------------------- IHMC (850)434 8903 or (650)494 3973 home 40 South Alcaniz St. (850)202 4416 office Pensacola (850)202 4440 fax FL 32501 (850)291 0667 cell phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes s.pam@ai.uwf.edu for spam
Received on Tuesday, 4 March 2003 09:59:46 UTC