- From: David Booth <david@dbooth.org>
- Date: Sat, 09 Nov 2013 00:00:54 -0500
- To: Pat Hayes <phayes@ihmc.us>
- CC: Antoine Zimmermann <antoine.zimmermann@emse.fr>, www-archive <www-archive@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Ivan Herman <ivan@w3.org>, Sandro Hawke <sandro@w3.org>
Hi Pat, Thanks very much for your excellent summary. Detailed responses below. (And sorry it takes me so long to write these things up.) On 10/31/2013 02:32 AM, Pat Hayes wrote: > Hi David > > Rather than respond point-by-point, I will again try to summarize. > However, there are a few responses that are needed first: > >> ... at least in principle, anything that can be described in, say, >> English prose could instead be described in RDF. > > Most emphatically, no. Even if you substitute the most expressive > formal logic available (say, full higher-order modal tense logic) , > this would not be even remotely correct. RDF is so inexpressive that > it cannot manage something as simple as "Fathers are not mothers." > OWL cannot define the idea of an uncle, and full first-order logic > cannot define the idea of a natural number. I was assuming suitably expressive standard semantic extensions. But regardless, are you claiming that this would fundamentally change the argument? The context and purpose of this assumption is to bar any "then a miracle occurs" steps from the process of IRI definition. Again, a primary assumption is that for scalability, in the vast majority of cases IRI definitions must be provided using description -- not ostension. In principle, descriptions could either be formal (in RDF) or informal (in English prose, say). Are you claiming that English prose definitions can somehow avoid underdeterminism (that leads to divergence) where RDF descriptions cannot? For example, are you claiming that there exists a resource property whose value can be specified in English prose but whose value cannot be specified in RDF (assuming suitably expressive generic semantic extensions)? If so, how? > >> But AFAICT, the trend is inevitably *toward* mismatch as more >> statements are published, assuming that: (a) parties publish data >> independently (without knowledge of each other) > > But why would you assume this? 
Usually, if B is publishing data using > A's IRIs (the only case that is of interest here) then B will have > access to *some* published information which will help determine what > A's intentions were regarding A's intended meaning. For example, if > you use DBpedia IRIs then there are large pages of information > available, in multiple languages. The entire Semantic Web/linked-data > enterprise is predicated on the idea that IRIs both denote entities > and also provide links to sources of more information about those > entities (or, if you like, more information about what the IRI is > intended to denote.) So the idea of two RDF authors using the same > IRI but without any knowledge of what the owner of the IRI intended > it to refer to, is SW/LD-pathological. It sounds like you misunderstood what I meant. I was not talking about RDF authors not having access to IRI definitions. I am assuming that A's IRIs have published definitions that are provided in RDF. I am talking about the situation where A publishes IRI definitions, and B1, B2 etc. are other parties who independently use A's IRIs in publishing other data. The assumption is that B1, B2 etc. are all aware of A's IRI definitions, but are not aware of each other (or each other's data). When this occurs, AFAICT the trend will inevitably be toward mismatch as B1, B2, etc. publish more data, even if each of those datasets is *individually* fully compatible with A's IRI definitions. > >> If the problem is disagreement then yes, you would have to choose >> between the source graphs. But if the problem is divergence then >> you have to do some more work -- resource identity splitting -- but >> can still use both source graphs after splitting. > > Changing the IRIs in a graph gives you a different graph. So you > would not be using both source graphs, but some modification of the > source graphs. 
And you would be obliged to *not* use - that is, to > reject - at least one of the source graphs, when they are mutually > inconsistent. Are you just dotting i's and crossing t's here? The point is that the original graphs are *not* rejected, but you're right that they are not merged **as is** either. The graphs are used by systematically transforming one or both of them (by properly substituting IRIs -- a process very similar to skolemization) and then merging the result, so as to join and use the relevant information content of *both* graphs. This is fundamentally different from disagreement, in which relevant information content must be *discarded*. By "relevant information content" I'm talking about the information content of the graph other than the particular choice of IRIs that were assigned -- a kind of graph isomorphism (though more general than the notion of isomorphism that is defined in the RDF Semantics, which is only isomorphism over bnode relabeling rather than IRI relabeling). > >> ... RDF data does not generally describe the real world, it >> describes a particular *conceptualization* of the real world > > It describes the world *using* a conceptualization (is there any > other way to describe anything?). It does not (usually) describe the > conceptualization. Maybe we're just quibbling over words here, but I don't see how RDF data that models the world as flat could possibly be describing the real world, since the real world is *not* flat. In essence, unless RDF is 100% accurate then it isn't describing the real world, is it? It could just as well be claimed to be describing my left earlobe. > >> false graphs aren't very useful, because they entail everything > > Just as a technical point: *logically false* graphs – contradictions, > false in *every* interpretation – entail everything. Mere falsity > does not get you quodlibet. Interesting point. Do you think that has any impact on this analysis? If so, how? 
The context of that statement was just pointing out that false graphs aren't very useful. > > --------- > > There are two substantive points of disagreement between us, and one > complete mismatch (divergence?) where I fail to understand what you > are saying. Let me deal with the two points first. > > 1. The reality thesis: that the real world is one of the satisfying > interpretations, and data is (usually) about the real world. I find > this obvious, so obvious indeed that it should not even need to be > said. You apparently find it either mistaken or meaningless, and in > any case think it is misleading as a guide to intuition. I am not > sure how to persuade you to my way of thinking, but let me ask you: > if all this linked data is not about reality, what do you think it > *is* about? Informally (and imprecisely) we can say that published RDF data is about the real world. But in general I think it would be more precise to say that it is *directly* about a particular conceptualization of the world, and only *indirectly* about the real world, through an informal and unspecified correspondence between that conceptualization and the real world. For example, some RDF data may describe a flat-earth conceptualization of the world, and that flat-earth conceptualization corresponds in some way to the real world. RDF data does not *directly* describe the real world, because the real world is *not* flat. Specifically, such RDF would ascribe properties to the earth that the earth does not possess. > And why do we find it useful, if it does not provide us > with information about the actual world? It is useful because, when we feed that data into our applications, our applications produce the desired outputs. It doesn't matter what the data is about as long as it produces the desired outputs. I'm not saying that to be flippant, but to emphasize that the whole notion of the real world is completely irrelevant to machine processing. 
Machines just push the bits around, and if they come out with the answers that we want, we're happy. What really matters is the usefulness of the data. But I agree that, all other things being equal (which they aren't), conceptualizations are *much* more useful when they more closely resemble the real world. > Are the records of the > transactions in your bank account about your actual wealth? Would > that change if the bank started using RDF? > > Your objections to the idea include the observation cited above about > conceptualizations. Yes, of course data is stated *using* a > conceptualization, just like all assertions in every language or > formalism. But that does not make it any less about reality. It > really is a fact about the real world that Hilary and Tensing climbed > Everest in 1953; that we conceptualize the world here in terms of > people and mountains does not make this any less true. I am not sure > what the point of your "toucan" example is, but apparently the real > world can satisfy both the bird assertions and the website > assertions, by appropriate choice of an interpretation mapping. (If > the complete set of assertions is inconsistent then of course nothing > can satisfy it.) Your third point concerned approximations and > idealizations, such as the flat-earth geography of road maps. But > examples like this do not argue against the reality thesis. An > approximate or idealized description of X is still a description of > X. Bear in mind that if some RDF can be satisfied by an approximation > or simplification of the real world, then it can also be satisfied by > the more complicated real world, since one can add (an infinite > amount of) structure to an interpretation freely without making any > RDF triples false. (This is a consequence of RDF being a positive > logic without negation.) 
The map example is quite instructive, as > quite a lot of geolocation information (eg lat/long coordinates) is > in fact describing spherical space rather than flat space, even > though we project it onto flat surfaces. First of all, let me distinguish between two cases, to be sure that we are talking about the same case. One case is where the conceptualization is carefully crafted to acknowledge the fact that it is an approximation, so it might say, for example, something to the effect that "pi is greater than 3.1 and less than 3.2". I'm not talking about this case. The other case is where the conceptualization in effect says "pi equals 3.1". In other words, it asserts something that may be good enough for many applications, but in fact it turns out to be false when scrutinized in detail in the real world. This is the case that I'm talking about, and I used a flat earth conceptualization as an example, because it is useful for some applications but clearly wrong in terms of (directly) describing the real world. Furthermore, AFAICT this second case is where divergence inevitably leads (barring splitting, which mints new URIs). The reason is that divergence leads to a situation where one graph A in essence asserts "X P V" and another graph B asserts "X P NotV", which means that from some perspective, one of those graphs is *wrong*. And yet, within each of those graph's intended interpretations, the graph is *not* wrong, it is true. I.e., within each graph's *conceptualization* of the world, the graph's statements are consistent with that conceptualization. > > To say that some assertion is about the real world, or that it is > factual, is not to claim that it is in some metaphysical sense the > final truth or the definitive description, or that it is the last > word, or that its truth has to have ended science. It is just saying > that it is true. I think what you mean here is that there may be *additional* facts that are also true. Right? 
If you are telling me that there are various different notions of truth, then you'll have to explain more. > > You say: >> ... The "real world" interpretation is largely irrelevant -- both >> to the formal semantics and to understanding how the Semantic Web >> *actually* works. > > I strongly disagree. Many IRIs have fixed interpretations in the > actual world, I was wondering when you would bring that up. :) I partially agree, but I think the degree is far different. I would say that a *few* IRIs have fixed interpretations -- meanings that are fixed by ostension. But the vast majority are defined by description, and those by necessity are far less fixed. > determined by all kinds of social, technical and > linguistic conventions and meanings entirely outside RDF. We need to be careful about the assumptions that we make when stepping outside of RDF, to avoid "then a miracle occurs" steps whenever possible. My assumptions are: - A very few IRIs are defined by ostension. (This is one "then a miracle occurs" assumption.) - All other IRIs are defined by description. For simplicity, let's assume either English prose or RDF. - For the most part, English prose definitions do not fix the interpretations significantly more than RDF definitions, because: (1) machines cannot understand English prose, and the Semantic Web is intended for machine processing; and (2) an English prose definition could in principle be written instead as RDF (assuming suitable standard semantic extensions). I know you disagree with point #2 of that last assumption, and you may well be correct about that in a theoretical sense, but the question is: is there really a big difference in our ability to avoid unintended interpretations via English prose definitions versus RDF definitions? And if so, is that difference big enough to override the effect of point #1? 
If we assume that English prose descriptions fundamentally fix the interpretations substantially more than RDF descriptions, then it represents a second "then a miracle occurs" step in the process -- a step that surely is not scalable for the Semantic Web in the same way that RDF descriptions are scalable. > We still > want to be able to use RDF to describe these referents. For example, > I am a consultant on a project (http://www.imagesnippets.com/) to add > RDF markup to images. These RDF descriptions use IRIs which identify > (and in the RDF refer to) images, regions in the images, people and > places and colors and objects described in DBpedia and many other > real (no scare quotes) things in the real world. None of these > denotation mappings are specified by RDF descriptions, and most of > them could not be. Most – I would claim, virtually all – RDF linked > data uses IRIs like this to refer to real things. It is centrally > important that the formal semantics works with such identifying > IRIs. > > 'Edmund Hilary climbed Everest in 1953' says something true about the > actual, real, world. It expresses a fact. Just a mundane, simple bit > of data. So, how is this factuality of this fact related to > model-theory semantics? By the actual, real, world being one of the > satisfying interpretations of it. Because if the real world was not a > satisfying interpretation of this sentence, then it *couldn't > possibly* be true (in the real world.) > > But we can, if you like, simply agree to disagree about this, as it > has no direct bearing on the basic point we have been arguing about, > which is... > > 2. The idea of an IRI denoting something "in a graph". 
Your gloss > on this phase, as I now understand it from your email (the first time > you have explained your intended meaning) is as follows: you take all > the interpretations which satisfy the graph (and there will be > different such sets for different graphs, of course) and then you > ask, what does the IRI denote in those interpretations? And that is > what the IRI denotes "in the graph". (Do I have that right?) Yes, exactly right. > > But that does not define anything, because for any consistent graph > G, and any IRI U in that graph, there are interpretations which > satisfy G and in which U denotes things different from what it > denotes in other interpretations satisfying G. There is no graph > which 'pins down' the interpretations of the URIs which occur in it > in the way that your definition requires. (Here is a simple proof. > Let x be something which is not an IRI. The interpretation I with > universe {x} and IEXT(x)={<x,x>} and I(u)=x for every URI u, > satisfies G. The Herbrand interpretation H of G also satisfies G. But > H(U) = U =/= x = I(U), by construction. QED.) In fact, one can make a > stronger statement: truth in an interpretation does not depend on the > identity of the referents of IRIs *at all*, because one can take > *any* satisfying interpretation and produce another isomorphic one > with the identities permuted in any way one likes, as long as the > IEXT mappings are permuted to match. (In fact, this applies to *any* > axiomatizable, complete formal logic, no matter how expressive.) In a > nutshell: model theory does not determine reference. > > This should not be too surprising, actually, if you think about how > model theory is defined. The very definition of interpretation > presumes complete referential freedom: any IRI can denote anything. > And truth is determined solely by how those things stand in relations > to one another. 
The entire apparatus of model theory makes no > reference to the *actual identity* of the things in the universe > being described. So creating real constraints on reference - > attaching, as it were, a name to a thing - has to be done by other > means. In practice, we rely on notions of naming and reference > already in use in the larger world (as I did when using "Everest" to > refer to the highest mountain, and how ImageSnippets does when using > 'http://schema.org/Person' to refer to the class of human beings) and > sometimes on predefined mappings (as we do when fixing the referents > of literals using datatypes) and perhaps even by ostention (arguably, > http-range-14 can be seen as declaring HTTP GET/200 to be a form of > ostention.) And this all works quite nicely (a lot of the time) > because we can all (more or less) agree on what these referring names > actually refer to, at least well enough to transfer meanings > successfully by using them as referring names in sentences. > > So, as I believe I have said several times, phrases such as > "interpretation of an IRI in a graph" are not meaningful. It is not > that this is a different perspective on model theory, or an > alternative viewpoint. It is that it, quite literally, does not mean > anything. Yes and no. I fully agree with your proof, and with the fact that a Herbrand interpretation will always satisfy the graph, so let's not get hung up on that. But you have left out two key things. One is the existence of *some* IRIs that -- as you pointed out above -- *do* have their interpretations fixed, presumably by ostension. Those will (by the rules of entailment) cause the possible interpretations of *other* IRIs to be reduced also -- usually still not uniquely, but nonetheless reduced. A simple example is if <http://example/aa> has a fixed interpretation, and <http://example/aa> owl:sameAs <http://example/AA>, then by entailment <http://example/AA> also has a fixed interpretation. 
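The owl:sameAs example can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions, not a rendering of the RDF or OWL semantics: the three-element universe, the extra IRI <http://example/other>, and the function names are all hypothetical, and owl:sameAs is modelled simply as "both IRIs denote the same element". The sketch enumerates every mapping from IRIs to elements, keeps those that satisfy the one-triple graph, and then shows how fixing the denotation of one IRI narrows the possible denotations of the other.

```python
from itertools import product

# Hypothetical three-element universe of resources.
UNIVERSE = ["r1", "r2", "r3"]

AA_LOWER = "http://example/aa"
AA_UPPER = "http://example/AA"
OTHER = "http://example/other"  # hypothetical IRI that the graph never mentions
IRIS = [AA_LOWER, AA_UPPER, OTHER]

# The graph holds a single triple: <aa> owl:sameAs <AA>.
SAME_AS_PAIRS = [(AA_LOWER, AA_UPPER)]


def all_interpretations():
    """Yield every mapping from the IRIs to elements of the universe."""
    for values in product(UNIVERSE, repeat=len(IRIS)):
        yield dict(zip(IRIS, values))


def satisfies(interp):
    """Assumed reading of owl:sameAs: both IRIs denote the same element."""
    return all(interp[s] == interp[o] for s, o in SAME_AS_PAIRS)


def possible_denotations(iri, fixed=None):
    """Elements that `iri` denotes in some satisfying interpretation that
    also agrees with the externally fixed denotations in `fixed`."""
    fixed = fixed or {}
    return {
        interp[iri]
        for interp in all_interpretations()
        if satisfies(interp) and all(interp[k] == v for k, v in fixed.items())
    }
```

With nothing fixed, `possible_denotations(AA_UPPER)` covers the whole universe, which matches the point that the graph alone pins nothing down. Passing `fixed={AA_LOWER: "r1"}` leaves only `"r1"` for `AA_UPPER`, while `OTHER` stays unconstrained: reduced, but in general still not unique.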
The second is the fact that every application that consumes RDF has a preconceived set of *assumed* interpretations. That set of assumed interpretations represents the particular conceptualization of the world in which that application operates. For example, in the toucan example, one application uses a conceptualization of the world involving web pages, and in that conceptualization, <http://example/toucan> becomes fixed to a web page -- not the string "http://example/toucan" (as in a Herbrand interpretation), and not the bird either, but the web page. Similarly, in a different application whose conceptualization of the world involves birds (but not web pages), the interpretation of <http://example/toucan> becomes fixed to a bird. Finally, if we back up one step from the idea that each consuming application has a set of assumed interpretations, we are back to the idea that each graph has a set of intended interpretations, which represent the set of assumed interpretations that the graph author intended to support. The net effect is that it *is* meaningful to talk about the idea of an interpretation of an IRI in a graph. In fact, I would conjecture that that is how many people in the Semantic Web think of IRI meaning, whether they are conscious of it or not. It's really a very sensible and intuitive way to think of it, IMO, though as we know, people's intuitions differ. > > --------- > > Now the place where I fail to understand what it is you are saying. > > At the end of your email you list all the advantages of an "other > way" of approaching model theory. But as far as I can tell, this > "alternative" is simply standard model theory. I agree! That's what I've been saying all along. But AFAICT, based on your other comments, it does seem to involve a slightly different way of *thinking* about the standard model theory. > For example: > >> The other way to think of the RDF Semantics is in terms of >> *multiple* interpretations > > This is the only correct way. 
As I have said to you before, *of > course* we think in terms of multiple interpretations. That is the > entire point of defining the notion of interpretation. The very > definition of entailment refers to multiple interpretations. Well yes, but see your next statement. > >> , instead of attempting to assume or impose a single "real world" >> interpretation. > > Well, it is fine to assume that the real world is *one* > interpretation, but nobody has ever suggested "imposing" a single > interpretation. Certainly, nothing in the RDF Semantics document > speaks of anything like this. Uh . . . but that is *exactly* what this offending "intuitive introduction" did (before it was removed) and why I insisted that it needed to be scoped to an interpretation: [[ An RDF graph is true exactly when: 1. . . . etc. ]] And you argued: http://lists.w3.org/Archives/Public/www-archive/2013Oct/0021.html [[ To say that a graph (or any other assertion or sentence) is true, is to say that when it is interpreted *in the actual world*, its truth-value is true. ]] That is *exactly* the kind of implicit assumption that I think is misleading, and why I have been trying hard to explain how a "multiple interpretations" view of the semantics is a slightly different view from thinking of the semantics in terms of an "in the actual world" interpretation. The intuition seems to be different. > >> By this I mean, for example, that: >> >> - Two different graph authors may have different sets of intended >> interpretations in mind when they publish their RDF graphs, and the >> same URI may indeed denote different resources in those >> interpretations. > > Different sets of interpretations in mind, yes, of course (standard). > URIs denoting different things in different interpretations, yes of > course (standard). URIs denoting different things in different *sets* > of interpretations, yes, if we are talking about sets of > interpetations an author *has in mind*. 
But URIs denoting things in a > set of all interpretations which satisfy a given graph? No, for the > reasons described above. That idea is incoherent. I've addressed that above. > >> - The most accurate way to understand a graph is to interpret it in >> the way that the author intended it to be interpreted. Since we >> have no other reliable way of knowing what that might be, we can >> assume that the author's intended interpretations for a graph are a >> subset of the graph's **satisfying interpretations**. I.e., we >> take the graph's meaning at face value, rather than attempting to >> interpret it according to some hidden, assumed "real world" >> interpretation. > > Yes, this is exactly what the RDF model theoretic semantics presumes. > Asserting a graph effectively claims that interpretations must be > such as to make it true, i.e. to satisfy it. Each graph makes some > claims about how the world is structured, Yes. :) > and the claims made by > multiple graphs are connected by their common use of global IRIs. *If* the graphs are merged. If the graphs are not merged, then the RDF semantics says nothing about them using the same interpretations. Of course, any two graphs *may* be merged. But the point is that it is sometimes useful to look at the semantics of each graph *individually* (if the graphs are being used separately, by separate applications) just as it is also sometimes useful to look at the semantics of the merge (such as if those graphs are being used together). > >> Some benefits of looking at the formal semantics this way > > What "way" are you talking about? Look, *of course* each graph has a > set of satisfying interpretations, and asserting the graph is saying > that the world being described by the graph is one of those > satisfying interpretations. 
(Or if we want to give authors the > ability to be vague about exactly what they are talking about, then > the interpretations of whatever the author had in mind are a subset > of the satisfying interpretations.) And of course we should take a > graph at face value, as you put it, as saying exactly this. All this > is *exactly* what the current semantics itself says (or presumes). As > far as I can see, you are simply agreeing with standard > model-theoretic intuitions here. I am *entirely* agreeing with the model-theoretic semantics. But apparently, as has been evidenced in this discussion, our intuitions sometimes differ. > >> Is this making any more sense to you? > > No. I don't know what the "it", that is supposed to provide all these > advantages, actually is. If it is the idea that asserting a graph > amounts to saying that the intended interpretation is one of those > satisfying the graph, then this is what model theory says already. If > it is the idea that an IRI can refer to one thing in one graph and a > different thing in a different graph, then that is false (by > definition) Wrong. Stop right there. Please re-read that last sentence, and notice that it did *not* stipulate a particular interpretation. I.e., it was *not* scoped to a single interpretation. If you had instead said: "the idea that, **within a particular interpretation**, an IRI can refer to one thing in one graph and a different thing in a different graph, then that is false", then that would be perfectly true. But the whole point of my long explanation above about a graph's intended interpretations is that it is perfectly normal and natural to think of applying different interpretations to different graphs. Thus, when a single interpretation has not been stipulated, it is *wrong* to assume that a single interpretation is being applied. I accept the fact that your intuition may be different than mine -- and perhaps everyone else's also. 
But you *cannot* make unqualified statements like that, containing free variables, and expect everyone else to have the same intuition about what they mean as you do. If you want to make a statement that is only true in one interpretation, then you *must* stipulate that single interpretation. So, returning to what "it" is, a central theme here is the idea that different graphs have different sets of intended interpretations. Therefore, it is entirely normal and sensible to think and talk about the same IRI denoting different things in different graphs: an IRI may denote one thing in graph A's intended interpretations, and a different thing in graph B's intended interpretations. Notice that if I were to commit the same sin that you committed above (in making a statement with unbound variables) I would shorten that last sentence into: An IRI may denote one thing in graph A and a different thing in graph B. And the problem seems to be that when you read "An IRI may denote one thing in graph A and a different thing in graph B" *your* intuition causes you to read that sentence as "**Within a given interpretation**, an IRI may denote one thing in graph A and a different thing in graph B", and the sentence is obviously false by that reading. But *my* intuition causes me to read it as "An IRI may denote one thing in graph A's **intended interpretations** and a different thing in graph B's **intended interpretations**", and the sentence is obviously true by that reading. > but in any case would not provide all these claimed > advantages that 'it' is supposed to have, even if it could be made > somehow true. Please tell me exactly which ones you disagree with, and why. Every one of them is supported by an unbroken chain of reasoning, so if you're not reaching the same conclusions as me, then we need to identify what differing assumptions we're making along the way and why. > >> Have I explained myself in sufficient detail, or do you still think >> that "David . . . 
does not properly understand the intuitive >> foundations of semantics" and my points are mere "inanity", as you >> previously concluded? > > I regret if my usage here seemed impolite, but I do (still) find your > posts, including this one, to be a strange mixture of basic ideas > about model theory (re)stated as though they were somehow a new > insight or an alternative to the standard view (which I referred to > as "inanity") and strangely stubborn basic mistakes which do, I am > afraid, strongly suggest that you have not grokked the basic ideas of > model theory. I am kind of amazed that you *still* seem to think that I am making some kind of mistake in my understanding of model theory, since it seems to me that we have repeatedly confirmed that my understanding corresponds *exactly* with standard model-theoretic semantics. Precisely what basic ideas of model theory do you think I have not grokked? > >> And do you *still* think I merely need to go read a book on model >> theory, or have we now (I hope) got past that? If not, what >> aspects of model theory do you still think I misunderstand? > > Well, I guess, the basic idea of an interpretation. An RDF > interpretation, by definition, is a mapping from IRIs to referents. > It is not a mapping from IRIs-in-graphs or from IRI-occurrences or > from IRIs-in-a-context. Ergo, every interpretation treats all > occurrences of an IRI in the same way, as referring to the same > thing, regardless of which graphs the IRI happens to occur in. Yes. I don't know what I have to do to convince you that I understand that, but believe me, I do. > Therefore, the notion of what an IRI denotes "in a graph" is > meaningless. No, as already explained in some detail above, that does not follow. The idea that an IRI denotes something "in a graph" certainly *is* meaningful if you think of it in terms of the graph's **intended interpretations**. If you think of it that way, it is a very sensible notion indeed. 
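The two readings of "an IRI denotes something in a graph" can be set side by side in a short sketch. It is a toy model under stated assumptions: the graph names `graph_A` and `graph_B`, the string referents, and the helper functions are hypothetical, and each graph's intended interpretations are simply listed by hand rather than derived from any triples. Each interpretation is a plain mapping from IRIs to referents, so within any one interpretation the IRI has a single referent regardless of graph; the difference only appears when each graph is read against its own intended interpretations.

```python
TOUCAN = "http://example/toucan"

# Each interpretation maps IRIs to referents, so inside any single
# interpretation an IRI has exactly one referent, whatever graph it is in.
web_page_view = {TOUCAN: "a web page"}
bird_view = {TOUCAN: "a bird"}

# Hypothetical: the intended interpretations each graph author had in mind.
INTENDED = {
    "graph_A": [web_page_view],
    "graph_B": [bird_view],
}


def denotations(iri, graph_name):
    """Referents of `iri` across the intended interpretations of one graph."""
    return {interp[iri] for interp in INTENDED[graph_name]}


def denotes_same_thing(iri, graph_1, graph_2):
    """True when both graphs' intended interpretations agree on one referent."""
    d1 = denotations(iri, graph_1)
    d2 = denotations(iri, graph_2)
    return d1 == d2 and len(d1) == 1
```

Here `denotations(TOUCAN, "graph_A")` and `denotations(TOUCAN, "graph_B")` differ, which is the "intended interpretations" reading, while no single dictionary ever assigns the IRI two referents, which is the "within a given interpretation" reading.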
> This basic fact – and it is a very basic and > foundational point – still seems to elude you. I hope the above explanations have helped you to see why I keep pointing out that that conclusion is wrong. > To emphasize, this is > not a "perspective" which admits alternatives, it is simply a fact > about how interpretations are defined. The fact that an interpretation is (by definition) a mapping from IRIs to referents has not eluded me at all. I am fully aware of that. The perspective part comes in after that, when talking about a graph's intended interpretations. One perspective is to assume a single (real-world) interpretation by which all graphs are evaluated. Another perspective is to evaluate each graph by its intended interpretations. Both views are fully consistent with standard model theory, but AFAICT they seem to appeal to different intuitions. > >> The bottom line here is that some of the statements -- and >> intuition -- in the existing RDF drafts are just plain *wrong* and >> need to be corrected. In particular, the statement in RDF Concepts >> that says "IRIs have global scope: Two different appearances of an >> IRI denote the same resource" is just factually *wrong*. > > It is a presumption of the RDF data model. The semantics, in > particular, is based on it. I don't quite see how it can be factually > wrong, since RDF *defines* the notion of denotation. (If it had said > "identiifies" then it might be factually wrong, but it doesn't.) Hopefully my explanations above have by now clarified why this is wrong as stated, so I won't go over it again here. But if you disagree please let me know where and why, so that I can address it. Best, David
Received on Saturday, 9 November 2013 05:01:27 UTC