- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Mon, 2 Mar 2009 04:20:15 +0000
- To: Jonathan Rees <jar@creativecommons.org>, Michael Hausenblas <michael.hausenblas@deri.org>
- CC: Alan Ruttenberg <alanruttenberg@gmail.com>, AWWSW TF <public-awwsw@w3.org>
Hi Jonathan,

> I have a favorite account of "ambiguity" that comes from a
> completely different direction: model theory.

It is an excellent explanation, which permits me to (hopefully) shed
some light on how that view of ambiguity and the notions of
"splitting"[1], URI declarations[2] and
identity-as-a-two-step-mapping[3] relate.

>
> As I've said before model theory (as in: RDF semantics, OWL DL
> semantics, OWL Full semantics, etc.) explains "ambiguity" not
> as a problem in definition of terms, but of interpretation of
> theories. That is, you start with a set of logical axioms and
> some logic,

Yes. So let's assume, for example, that you are given a set of axioms
(in RDF) and some logic, and that is your starting point.

> consider the deductive closure (= theory), and then look for
> models of the theory. One way to find a model is by looking at
> rdfs:comment, rdfs:label, and other properties, which, however
> ill formed or blurry they may be, might constitute adequate
> hints to lead you to a model.

Right. But another way to do it is to *first* look for additional,
*implied* axioms, based on principles of semantic web architecture,
that can be added to the initial set you were given -- axioms that are
somehow associated with the URIs that are used in those initial axioms.
This is the idea behind URI declarations. For example, one might
"follow one's nose"[4] by dereferencing the URIs that were used in the
original set of axioms, to obtain additional axioms that would further
constrain the possible models that could fit the theory. This can be
done recursively, of course, to obtain the transitive closure, as any
URIs in the new axioms might also be dereferenced. Of course, after
this is done, the result is still a set of axioms, and one must still
find plausible models based on information that is not expressed
formally, such as rdfs:comment, rdfs:label, etc.
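As a rough illustration of the recursive dereferencing just described,
here is a minimal Python sketch. It is not the algorithm from [4]; it
only shows the transitive-closure idea. The FAKE_WEB table, the
example.org URIs, and the function names are all hypothetical, and a
dictionary lookup stands in for real HTTP dereferencing.

```python
from collections import deque

EX = "http://example.org/"

# Hypothetical stand-in for the web: each URI maps to the set of
# (subject, predicate, object) triples its declaration page would return.
FAKE_WEB = {
    EX + "A": {(EX + "A", "rdfs:label", "Jonathan")},
    EX + "B": {(EX + "B", "rdfs:label", "Gerald")},
    EX + "P": {
        (EX + "P", "rdfs:label", "has father"),
        (EX + "P", "rdfs:domain", EX + "Person"),
    },
    # This declaration points back at P, so the walk must cope with cycles.
    EX + "Person": {(EX + "Person", "rdfs:seeAlso", EX + "P")},
}


def dereference(uri):
    """Return the triples 'published' at uri (empty if nothing is there)."""
    return FAKE_WEB.get(uri, set())


def is_uri(term):
    """Treat only http:// strings as dereferenceable (a simplification)."""
    return isinstance(term, str) and term.startswith("http://")


def follow_your_nose(initial_axioms, deref=dereference):
    """Collect the initial axioms plus everything reachable by dereferencing."""
    axioms = set(initial_axioms)
    visited = set()
    queue = deque(t for triple in axioms for t in triple if is_uri(t))
    while queue:
        uri = queue.popleft()
        if uri in visited:
            continue  # each URI is dereferenced at most once
        visited.add(uri)
        for triple in deref(uri):
            axioms.add(triple)
            queue.extend(t for t in triple if is_uri(t))
    return axioms


initial = {(EX + "A", EX + "P", EX + "B")}
closure = follow_your_nose(initial)
```

The result is still just a (larger) set of axioms: the labels gathered
along the way remain informal hints that a reader must interpret.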
>
> Now suppose you're talking to someone else about a theory, and
> you realize that the model they have in mind is one that you
> wish you had ruled out when you put the theory together. You
> have a choice: You can start speaking to the person in natural
> language to attempt to steer them toward the model you had in
> mind; or you can add constraining axioms to the theory,

There's a third option: leave the existing theory as is (because
changing it may break applications that depend on it) but define a
*new* theory that adds more constraining axioms, such that every model
that is consistent with the new theory is guaranteed to be consistent
with the old theory (but not vice versa). This is the idea behind
"splitting"[1]. Furthermore, in order to help users find the most
appropriate theory to use for their applications, document (in RDF)
the relationship between the old and new theories. This is the idea
behind the s:isBroaderThan/s:isNarrowerThan predicates.[5]

> and agree with your interlocutor to consider the modified
> theory in place of the original. The latter approach, when it
> succeeds in converging, is called "knowledge representation"
> (or so Pat tells me), while the former is more in the direction
> of "controlled vocabulary". Choosing to communicate informally
> instead of formally is a sort of a failure of the method, but
> is often expedient. I think each method has its place, and
> typical RDF and OWL practice probably sits somewhere in
> between.

Yes, each has its place. Clearly the informal method does not scale
well, so it isn't a very general solution.

>
> For example: Suppose I have a logical theory with three symbols
> P, A, and B, and I say that P is a relation that holds between
> A and B (in RDF: A P B.). It is very easy to come up with
> models of this theory; too easy in fact. We could have 2 < 3,
> ice is-frozen-state-of water, etc. Very "ambiguous".
> Now I tell you, at the meta-level or in rdfs:comment, that P means
> has father, A means Jonathan, and B means Gerald. This doesn't say
> Jonathan who, or what exactly is meant by "father", so there
> are still many plausible models, and nothing has changed from a
> logical point of view, but now you will probably not be
> interested in considering models in which A is not someone
> named Jonathan, B is not someone named Gerald, or P is not the
> has-father relation (i.e. B is not the father of A). Instead
> you will look for real-world scenarios to which the logic might
> apply (i.e. that are models of the theory). There is
> "ambiguity" (multiple interpretations) but less of it.

Right. And a convenient *way* for you to tell me that "A means
Jonathan" is to put that information in a URI declaration page that is
accessible by dereferencing a URI that you have used for A. This is
especially convenient if you express "A means Jonathan" in RDF.

>
> In the model theoretic account ambiguity is simply the
> existence of multiple models,

Yes, that's the same kind of ambiguity I've been talking about.

> and in the model theoretic + hints account ambiguity is the
> existence of multiple plausible models, where plausibility is
> not an operational notion.

Yes, that's a good way to think of it.

> Interpretation ambiguity cannot always be isolated to
> individual terms, cannot always be detected, cannot always be
> proven (to everyone's satisfaction), and can never be
> eliminated. It is inherent in the framework because models
> can't be communicated. This is the reason we use formal
> theories - they *can* be communicated, and over centuries
> people have become pretty successful at articulating, agreeing
> on, and following the rules of the game.
>
> I steer toward a particular model as I add more axioms to the
> theory (last name, date of birth, clarification that a "father"
> is a "parent" but not a "mother", etc.
> etc.), because as the
> logical structure accumulates, accidental construction of
> unintended models becomes increasingly difficult. Pat tells me
> that there is some point in such an endeavor where it becomes
> so hard to interpret the logic incorrectly (at variance with
> intent) that one is justified in saying that "knowledge" has
> been "represented" logically.

Yes, if you're trying to reduce the ambiguity, that's what needs to be
done. But:

 - Reducing ambiguity is not always desirable, both because:
   (a) there may be existing applications that would break if
   additional axioms are added to reduce the ambiguity; and
   (b) adding axioms generally means adding complexity (just as
   modeling the earth as round is more complex than modeling it as
   flat).

 - Even when it *is* desirable to reduce ambiguity, one still must
   answer questions of where to find those additional axioms and how
   to choose between candidate sets of additional axioms.

It isn't good enough just to wave one's hands and say "somehow, more
axioms are added", as if it happens by magic. (Those were not your
words, of course -- I'm just illustrating.) There are multiple roles
involved in the process of creating and consuming RDF -- statement
authors, URI owners, and consuming applications, for example -- and it
is important to clarify what their various responsibilities and
expectations should be. Various approaches are possible, and some have
better characteristics than others. The point of [6] is to show that
the "URI declarations" approach has more desirable architectural
characteristics than the "competing definitions" approach.

>
> Anyhow this is my argument for forgetting about the metatheory
> (logical systems containing symbols such as "denotes",
> "Interpretation", "Model", "splitting", etc.), and focusing on a
> simple first-order logical model of a domain first. We have a
> perfectly good account of ambiguity of interpretation already.
> Attempting a theory of the metatheory will just push an
> unsolvable problem off to an even worse place.

But we shouldn't throw the baby out with the bath. Just because some
*parts* of this are unsolvable, that does not mean that it isn't
worthwhile improving the parts that *are* solvable. We cannot
eliminate the magic involved in relating theories and selecting
models, but we can reduce it.

1. Splitting:
http://dbooth.org/2007/splitting/

2. URI declaration:
http://dbooth.org/2007/uri-decl/

3. Identity as a two-step mapping:
http://dbooth.org/2007/uri-decl/20081126.htm#two-step

4. "Follow your nose" algorithm:
http://dbooth.org/2007/uri-decl/20081126.htm#nose

5. s:isBroaderThan/s:isNarrowerThan:
http://dbooth.org/2007/splitting/#isBroaderThan

6. Why URI Declarations? A comparison of architectural approaches:
http://dbooth.org/2008/irsw/

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software

Statements made herein represent the views of the author and do not
necessarily represent the official views of HP unless explicitly so
stated.
Received on Monday, 2 March 2009 04:21:34 UTC
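
The claim in the message that "splitting" yields a new theory whose
models are all models of the old theory (but not vice versa) can be
checked by brute force on a toy case. The Python sketch below
enumerates every interpretation of the symbols A, B, and P over a
two-element domain. The domain, the two extra axioms (A and B are
distinct; P is irreflexive), and all names are assumptions chosen for
illustration; none of them come from the message.

```python
from itertools import combinations, product

# Toy domain of two individuals (hypothetical; any small set would do).
DOMAIN = ("x", "y")
PAIRS = list(product(DOMAIN, repeat=2))


def all_relations():
    """Yield every binary relation over DOMAIN as a frozenset of pairs."""
    for size in range(len(PAIRS) + 1):
        for subset in combinations(PAIRS, size):
            yield frozenset(subset)


def interpretations():
    """Yield every assignment (I(A), I(B), I(P)) over DOMAIN."""
    return product(DOMAIN, DOMAIN, all_relations())


# Axioms are predicates over an interpretation (a, b, rel).
def a_p_b(interp):
    a, b, rel = interp
    return (a, b) in rel  # the original axiom: A P B


def a_differs_from_b(interp):
    a, b, _ = interp
    return a != b  # illustrative extra axiom


def p_irreflexive(interp):
    _, _, rel = interp
    return all((d, d) not in rel for d in DOMAIN)  # illustrative extra axiom


def models(theory):
    """Return the set of interpretations satisfying every axiom in theory."""
    return {i for i in interpretations() if all(axiom(i) for axiom in theory)}


OLD_THEORY = [a_p_b]
NEW_THEORY = OLD_THEORY + [a_differs_from_b, p_irreflexive]

old_models = models(OLD_THEORY)
new_models = models(NEW_THEORY)
```

Because NEW_THEORY only adds axioms, new_models is necessarily a
subset of old_models; here it is a proper subset, so the narrower
theory is less ambiguous while the old theory is left untouched.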