RE: Learning from other disciplines?

Hi Jonathan,

> I have a favorite account of "ambiguity" that comes from a
> completely different direction: model theory.

It is an excellent explanation, which permits me to (hopefully)
shed some light on how that view of ambiguity relates to the
notions of "splitting"[1], URI declarations[2] and
identity-as-a-two-step-mapping[3].

>
> As I've said before model theory (as in: RDF semantics, OWL DL
> semantics, OWL Full semantics, etc.) explains "ambiguity" not
> as a problem in definition of terms, but of interpretation of
> theories. That is, you start with a set of logical axioms and
> some logic,

Yes. So let's assume, for example, that you are given a set of
axioms (in RDF) and some logic, and that is your starting point.

> consider the deductive closure (= theory), and then look for
> models of the theory. One way to find a model is by looking at
> rdfs:comment, rdfs:label, and other properties, which, however
> ill formed or blurry they may be, might constitute adequate
> hints to lead you to a model.

Right. But another way to do it is to *first* look for
additional, *implied* axioms, based on principles of semantic web
architecture, that can be added to the initial set you were given
-- axioms that are somehow associated with the URIs that are used
in those initial axioms. This is the idea behind URI
declarations.

For example, one might "follow one's nose"[4] by dereferencing
the URIs that were used in the original set of axioms, to obtain
additional axioms that would further constrain the possible
models that could fit the theory. This can be done recursively,
of course, to obtain the transitive closure, as any URIs in the
new axioms might also be dereferenced.
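
To make the recursion concrete, here is a minimal sketch in Python.
It is only an illustration, not the algorithm as specified in [4]:
the example.org URIs, the FAKE_WEB table and the dereference()
helper are all hypothetical stand-ins for real HTTP lookups, and
triples are modeled as plain tuples of strings.

```python
from collections import deque

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Hypothetical stand-in for the web: each URI maps to the axioms
# (triples) that its declaration page would return when dereferenced.
FAKE_WEB = {
    EX + "A": {(EX + "A", "rdfs:label", "Jonathan")},
    EX + "B": {(EX + "B", "rdfs:label", "Gerald")},
    EX + "hasFather": {(EX + "hasFather", "rdfs:range", EX + "Person")},
    EX + "Person": {(EX + "Person", "rdfs:label", "Person")},
}


def dereference(uri):
    """Assumed helper: return the axioms published at a URI (none if unknown)."""
    return FAKE_WEB.get(uri, set())


def uris_in(triples):
    """Collect every term that looks like a dereferenceable URI."""
    return {term for triple in triples for term in triple
            if term.startswith("http://")}


def follow_your_nose(initial_axioms):
    """Add the axioms reachable by repeatedly dereferencing URIs."""
    axioms = set(initial_axioms)
    seen = set()                      # URIs already dereferenced
    queue = deque(uris_in(axioms))
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue                  # guards against cycles between pages
        seen.add(uri)
        new_axioms = dereference(uri) - axioms
        axioms |= new_axioms
        queue.extend(uris_in(new_axioms) - seen)
    return axioms


initial = {(EX + "A", EX + "hasFather", EX + "B")}
closure = follow_your_nose(initial)
```

Note that the Person axiom is only reached indirectly, via the
axioms obtained for hasFather, which is why the process has to be
repeated until no new URIs turn up.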

Of course after this is done, the result is still a set of
axioms, and one must still find plausible models based on
information that is not expressed formally, such as rdfs:comment,
rdfs:label, etc.

>
> Now suppose you're talking to someone else about a theory, and
> you realize that the model they have in mind is one that you
> wish you had ruled out when you put the theory together. You
> have a choice: You can start speaking to the person in natural
> language to attempt to steer them toward the model you had in
> mind; or you can add constraining axioms to the theory,

There's a third option: leave the existing theory as is (because
changing it may break applications that depend on it) but define
a *new* theory that adds more constraining axioms, such that
every model that is consistent with the new theory is guaranteed
to be consistent with the old theory (but not vice versa). This
is the idea behind "splitting"[1]. Furthermore, in order to
help users find the most appropriate theory to use for their
applications, document (in RDF) the relationship between the old
and new theories. This is the idea behind the
s:isBroaderThan/s:isNarrowerThan predicates.[5]
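
As a toy illustration of that guarantee, the sketch below
enumerates every interpretation of the symbols A, B and P over a
two-element domain and checks that the models of a new theory are
a proper subset of the models of the old one. Everything here is
an assumption made for the example: the old theory is just
"A P B", the added axiom ("P is irreflexive") is arbitrary, and
the ex: names in the final triple are hypothetical. Brute-force
enumeration only works for tiny finite domains, so this shows the
idea rather than a practical test.

```python
from itertools import combinations, product

DOMAIN = (0, 1)                          # tiny illustrative universe
PAIRS = list(product(DOMAIN, repeat=2))  # every possible (x, y) pair


def all_relations():
    """Every binary relation over DOMAIN, as a frozenset of pairs."""
    return [frozenset(c)
            for size in range(len(PAIRS) + 1)
            for c in combinations(PAIRS, size)]


def interpretations():
    """Every way of assigning A, B to elements and P to a relation."""
    return product(DOMAIN, DOMAIN, all_relations())


def old_theory(a, b, p):
    """The original axiom: A P B."""
    return (a, b) in p


def new_theory(a, b, p):
    """The old axiom plus one constraining axiom: P is irreflexive."""
    return old_theory(a, b, p) and all((x, x) not in p for x in DOMAIN)


def models(theory):
    """The set of interpretations that satisfy the theory."""
    return {i for i in interpretations() if theory(*i)}


OLD_MODELS = models(old_theory)
NEW_MODELS = models(new_theory)

# Every model of the new theory is a model of the old, but not vice versa.
is_narrower = NEW_MODELS < OLD_MODELS

# The relationship can then be documented as a triple (hypothetical names).
relationship = ("ex:newTheory", "s:isNarrowerThan", "ex:oldTheory")
```

The old theory is left untouched, so applications that depend on
it keep working, while the new theory admits fewer models.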

> and agree with your interlocutor to consider the modified
> theory in place of the original. The latter approach, when it
> succeeds in converging, is called "knowledge representation"
> (or so Pat tells me), while the former is more in the direction
> of "controlled vocabulary". Choosing to communicate informally
> instead of formally is a sort of a failure of the method, but
> is often expedient. I think each method has its place, and
> typical RDF and OWL practice probably sits somewhere in
> between.

Yes, each has its place. Clearly the informal method does not
scale well, so it isn't a very general solution.

>
> For example: Suppose I have a logical theory with three symbols
> P, A, and B, and I say that P is a relation that holds between
> A and B (in RDF: A P B.). It is very easy to come up with
> models of this theory; too easy in fact. We could have 2 < 3,
> ice is-frozen-state-of water, etc. Very "ambiguous". Now I tell
> you, at the meta-level or in rdfs:comment, that P means has
> father, A means Jonathan, and B means Gerald. This doesn't say
> Jonathan who, or what exactly is meant by "father", so there
> are still many plausible models, and nothing has changed from a
> logical point of view, but now you will probably not be
> interested in considering models in which A is not someone
> named Jonathan, B is not someone named Gerald, or P is not the
> has-father relation (i.e. B is not the father of A). Instead
> you will look for real-world scenarios to which the logic might
> apply (i.e. that are models of the theory). There is
> "ambiguity" (multiple interpretations) but less of it.

Right. And a convenient *way* for you to tell me that "A means
Jonathan" is to put that information in a URI declaration page
that is accessible by dereferencing a URI that you have used for
A. This is especially convenient if you express "A means
Jonathan" in RDF.

>
> In the model theoretic account ambiguity is simply the
> existence of multiple models,

Yes, that's the same kind of ambiguity I've been talking about.

> and in the model theoretic + hints account ambiguity is the
> existence of multiple plausible models, where plausibility is
> not an operational notion.

Yes, that's a good way to think of it.

> Interpretation ambiguity cannot always be isolated to
> individual terms, cannot always be detected, cannot always be
> proven (to everyone's satisfaction), and can never be
> eliminated. It is inherent in the framework because models
> can't be communicated. This is the reason we use formal
> theories - they *can* be communicated, and over centuries
> people have become pretty successful at articulating, agreeing
> on, and following the rules of the game.
>
> I steer toward a particular model as I add more axioms to the
> theory (last name, date of birth, clarification that a "father"
> is a "parent" but not a "mother", etc. etc.), because as the
> logical structure accumulates, accidental construction of
> unintended models becomes increasingly difficult. Pat tells me
> that there is some point in such an endeavor where it becomes
> so hard to interpret the logic incorrectly (at variance with
> intent) that one is justified in saying that "knowledge" has
> been "represented" logically.

Yes, if you're trying to reduce the ambiguity, that's what needs
to be done. But:

 - Reducing ambiguity is not always desirable, both because: (a)
there may be existing applications that would break if additional
axioms are added to reduce the ambiguity; and (b) adding axioms
generally means adding complexity (just as modeling the earth as
round is more complex than modeling it as flat).

 - Even when it *is* desirable to reduce ambiguity, one still must
answer questions of where to find those additional axioms and how
to choose between candidate sets of additional axioms. It isn't
good enough just to wave one's hands and say "somehow, more
axioms are added", as if it happens by magic. (Those were not
your words, of course -- I'm just illustrating.)

There are multiple roles involved in the process of creating and
consuming RDF -- statement authors, URI owners, and consuming
applications, for example -- and it is important to clarify what
their various responsibilities and expectations should be.
Various approaches are possible, and some have better
characteristics than others. The point of [6] is to show that the
"URI declarations" approach has more desirable architectural
characteristics than the "competing definitions" approach.

>
> Anyhow this is my argument for forgetting about the metatheory
> (logical systems containing symbols such as "denotes",
> "Interpretation", "Model", "splitting", etc.), and focusing on a
> simple first-order logical model of a domain first. We have a
> perfectly good account of ambiguity of interpretation already.
> Attempting a theory of the metatheory will just push an
> unsolvable problem off to an even worse place.

But we shouldn't throw the baby out with the bath. Just because
some *parts* of this are unsolvable, that does not mean that it
isn't worthwhile improving the parts that *are* solvable. We
cannot eliminate the magic involved in relating theories and
selecting models, but we can reduce it.


1. Splitting:
http://dbooth.org/2007/splitting/

2. URI declaration:
http://dbooth.org/2007/uri-decl/

3. Identity as a two-step mapping:
http://dbooth.org/2007/uri-decl/20081126.htm#two-step

4. "Follow your nose" algorithm:
http://dbooth.org/2007/uri-decl/20081126.htm#nose

5. s:isBroaderThan/s:isNarrowerThan:
http://dbooth.org/2007/splitting/#isBroaderThan

6. Why URI Declarations? A comparison of architectural
approaches:
http://dbooth.org/2008/irsw/



David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Statements made herein represent the views of the author and do not necessarily represent the official views of HP unless explicitly so stated.

Received on Monday, 2 March 2009 04:21:34 UTC