RE: On "in defense of Ambiguity" (was RE: Uniform access to descriptions) from Booth, David (HP Software - Boston) on 2008-03-29 (www-tag@w3.org from March 2008)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Sat, 29 Mar 2008 03:24:04 +0000
To: Harry Halpin <hhalpin@ibiblio.org>, Pat Hayes <phayes@ihmc.us>
CC: "www-tag@w3.org WG" <www-tag@w3.org>
Message-ID: <184112FE564ADF4F8F9C3FA01AE50009FCF1DCD228@G1W0486.americas.hpqcorp.net>

Hi Harry & Pat,

> http://www.ibiblio.org/hhalpin/homepage/publications/indefenseofambiguity.html

This paper quite well presents a line of thinking about Web architecture and Semantic Web architecture, based on certain assumptions, and brings it to its logical conclusion: that ambiguity of reference is inescapable, and we might as well therefore embrace it. It also makes a step toward addressing the problem by proposing a predicate, eg:hasDescription. Although I think the paper makes a key assumption that is invalid, thus undermining the main conclusion, and I think the proposed predicate is currently flawed, I nonetheless think the paper is a very useful read, as it does a good job of describing a view of the problem that can act as a good springboard for further discussion.

Detailed comments follow, but first a general comment: I think it is useful to distinguish between classic, document-oriented Web architecture and Semantic Web architecture, and the paper often doesn't make this distinction clear. Semantic Web architecture is layered (non-destructively) on top of classic Web architecture: all of the responsibilities and benefits of classic Web architecture apply in Semantic Web architecture, but not vice versa. The identity issue is relevant only to Semantic Web architecture, but aspects of the issue are caused by the way Semantic Web architecture leverages features of classic Web architecture.

1. The abstract says: "Reference is by nature ambiguous in any language. So any attempts by Web architecture to make reference completely unambiguous will fail on the Web."

That is only partly true. For human languages, and references to real world things, it is true. But it is *not* true in the more constrained world of computer languages and machine processing, and the implicit assumption that it is true for Semantic Web architecture is, IMO, a critical invalid assumption in the line of thinking that the paper presents.

In a programming language, the referent of an identifier x can be *completely* unambiguous - corresponding to a specific memory location within an abstract machine or model. The fact that this memory location is in turn interpreted by its human users as a stand-in for a real-world entity (such as a person or a bank account) is a *separate* issue from the need for the referent of that identifier to be unambiguous within the abstract machine.

In Semantic Web architecture, as in programming languages, the mapping from a name to a real-world entity is indirect, via the model defined by that architecture or language. It is a two-step mapping:

Step 1: Mapping from a name to something in that model, e.g., the name "x" to a particular set of assertions, or a particular memory location.

Step 2: Mapping from the thing in the model to a real-world entity, e.g., interpreting those assertions, or that memory location, as a stand-in for an actual bank account or a particular person.

The Semantic Web architecture (or the programming language) only specifies the first step of this mapping. This step can and *must* be unambiguous for the Semantic Web to reach its full potential. The second step in this mapping, as you correctly point out, can never be completely unambiguous. Fortunately, the goal of the Semantic Web is *not* to create a better human or machine language for referring to real world things. It is to facilitate useful machine processing. Thus, although the second step is obviously important to people, it is *not* something that the Semantic Web architecture can directly specify or control. The best that the Semantic Web architecture can do is to specify the first step in a way that is unambiguous and best enables useful machine processing.
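To make the two-step mapping concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the URI http://example.org/acct#x, the ex: terms, and the names DECLARATIONS, step1 and step2 are all hypothetical and come from no specification. Step 1 is modeled as a plain lookup (one name, exactly one set of assertions), while step 2 is whatever interpretation a given user brings to those assertions.

```python
# Hypothetical illustration of the two-step mapping; none of these names
# or URIs come from any specification.

ACCOUNT_URI = "http://example.org/acct#x"  # made-up URI

# Step 1: name -> something in the model (here, a fixed set of assertions).
DECLARATIONS = {
    ACCOUNT_URI: frozenset({
        (ACCOUNT_URI, "rdf:type", "ex:BankAccount"),
        (ACCOUNT_URI, "ex:accountNumber", "12345"),
    }),
}

def step1(uri):
    """Return the set of assertions bound to uri: one name, one result."""
    return DECLARATIONS[uri]

# Step 2: thing in the model -> real-world entity. This part is supplied
# by each user; the architecture has no control over it.
def step2(assertions, interpreter):
    """Apply a user's own interpretation to the assertions."""
    return interpreter(assertions)

def alice(assertions):
    return "the joint account at the Boston branch"

def bob(assertions):
    return "some account numbered 12345"

core = step1(ACCOUNT_URI)
print(step1(ACCOUNT_URI) == core)   # step 1 gives the same answer every time
print(step2(core, alice))           # step 2 depends on who is interpreting
print(step2(core, bob))
```

Both users start from exactly the same assertions; they differ only in step 2, which is the part the architecture cannot specify.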

2. The "THE IDENTITY CRISIS ON THE WEB" section mentions: '. . . on the Web one can have a URI for the "Eiffel Tower in itself," such as http://www.example.singandich.org/EiffelTower.'

That's not quite how the Semantic Web architecture works. You and I and others may *interpret* that URI as referring to the Eiffel Tower itself, but the architecture cannot guarantee or even specify such an interpretation. It can only specify the first of the two steps in getting from that URI to that interpretation: the mapping from the URI to a particular set of assertions or (for a built-in) a particular chunk of code.

3. The "REFERENCE AND ACCESS" section nicely illustrates the difference between reference and access.

4. The "REFERENCE AND ACCESS" section says: "Kripke's account of unambiguous names can then be transposed to the Web with a few minor variations (1980). In this story, a URI is like a proper name and baptism is given by the registration of the domain name, which gives a legally binding owner to a URI."

Not quite. The purchase of a domain name is *not* the act of baptism for a URI; it merely gives the domain owner the *right* to baptize URIs minted within that domain. The actual baptism of a URI is typically done by the act of publishing a follow-your-nose document via that URI -- a URI declaration -- as described at
http://dbooth.org/2007/uri-decl/#precise-def-uri-decl

5. The "REFERENCE AND ACCESS" section says: "So, if one got a URI like http://www.example.singandich.org/EiffelTower and one wanted to know what the URI referred to, one could use a service such as whois to look up the owner of the URI, and then simply call them and ask them what the URI referred to."

Right, but if the URI owner publishes a URI declaration that is accessible via the URI (using the usual follow-your-nose algorithm), then one would only need to look at the published URI declaration instead of actually calling the domain owner. We must bear in mind that this addresses only the first step in the problem of determining the referent of the URI, but this first step is the only one that the architecture can specify.

6. The "REFERENCE AND ACCESS" section says: "Under this assumption, if a user were given the URI http://www.example.singandich.org/EiffelTower then the user would be part of the community of the Web and the user is then forced to buy into the owner's claim that the URI refers to the Eiffel Tower. This argument is trivially not true on the Web. The owner cannot communicate via telepathy what the URI refers to. In a decentralized system such as the Web, a user of the URI can usually tell what a URI is supposed to refer to by accessing Web pages through the URI, and Web pages are another form of description for things and so subject to ambiguity."

The problem with this line of thinking is that it has not distinguished between step 1 ambiguity and step 2 ambiguity, as described in comment #1 above, and thus it throws the baby out with the bath. Step 1 is all that matters to the SWeb architecture, and it *can* be unambiguous. You are totally correct that step 2 cannot be unambiguous. However, step 2 is outside of the SWeb architecture's control.

7. The "REFERENCE IS INHERENTLY AMBIGUOUS" section says: "In contrast, reference to natural entities is inherently ambiguous. In this manner, reference on the Web is the same as reference off the Web. . . . . Web architecture does not determine what any names, including URIs, refer to. It only determines what they access."

That is only true for classic, document-oriented Web architecture. For Semantic Web architecture it is only partly true. Semantic Web architecture *does* determine step 1 of what a URI refers to, though as explained in comment #1 it does not (and cannot) determine step 2.

8. The "REFERENCE IS INHERENTLY AMBIGUOUS" section has several philosophical paragraphs citing Korzybski, Frege, Goedel, Luntley, Quine, Davidson and Kripke. It is hard to assess the relevance of these to SWeb architecture because no distinction has been made between steps 1 and 2 of the referent mapping problem. However, I *think* these paragraphs pertain to step 2, and thus fall outside of SWeb architecture.

9. The "REFERENCE IS INHERENTLY AMBIGUOUS" section says: 'One disturbing result of this is that we may need different versions of the concept of a "person."'

This is a very important observation, and I'm very glad you made it (again).

10. The "REFERENCE IS INHERENTLY AMBIGUOUS" section says: 'What makes this kind of consideration particularly acute is the view that URIs should be global and eternal in scope. This makes things worse. It means that if there is any possibility of some such ontological distinction being made by anyone, anywhere, at any time, then in order to avoid ambiguity of reference, the URI must make all the potentially disambiguating decisions ahead of time. This is of course manifestly impossible, because there is always the possibility of some new distinction being made later. It is impossible to achieve unambiguous universal reference of names by using descriptions. So we should not set out to attempt unambiguous reference, nor pose it as a goal or a "good practice."'

No. This fallacy is rooted in the lack of distinction between steps 1 and 2 of the referent mapping problem. The need for increasing ontological distinctions merely means that a single URI may not be enough to suit all needs: it may be necessary to mint another, related URI when some new "distinction" is needed later, as will be described here (though the document isn't done yet):
http://dbooth.org/2007/splitting/

In SWeb architecture, the need for a new "distinction" boils down to the need for additional core assertions in the URI declaration. (See
http://dbooth.org/2007/uri-decl/ .)
The fact that step 2 of the referent mapping cannot be unambiguous does *not* imply that good practice guidelines are pointless. Rather, it means that the SWeb architecture can *tolerate* the ambiguity that is inherent in step 2, and the goal of "uniqueness" can only be evaluated relative to a particular range of expected applications. If a particular distinction is very likely to be needed, and it isn't otherwise costly to include, then as a good practice such a distinction *should* be made. For example, one *could* mint a single URI that denotes any of the three "David Booth"s who currently work for HP, but distinguishing more finely and minting a separate URI for each of those "David Booth"s is likely to be much more useful to many more applications.

11. Regarding the "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section, I like much of this section -- the first half or so -- though I have some quibbles with latter parts (noted separately).

12. The "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section argues that if http://www.example.singandich.org/EiffelTower 303-redirects to http://www.tour-eiffel.fr/ , which in turn provides a 200 OK response, then we learn that http://www.tour-eiffel.fr/ denotes an information resource, but "We know nothing about the original URI, http://www.example.singandich.org/EiffelTower" because "the 303 status code can not possibly tell us that the resource redirected from was used for referring".

This seems to reflect a misunderstanding of how the 303 redirect is intended to work in Semantic Web architecture. True, the 303 redirect by *itself* says nothing about the nature of the URI http://www.example.singandich.org/EiffelTower. (For brevity, let me call this URI uTower.) What it does is establish a chain of authority: it indicates that the owner of uTower *wants* you to look at the document at http://www.tour-eiffel.fr/ . (I'll call this second URI uDoc.) In classic (document-oriented) Web architecture this may not much matter to you. But in Semantic Web architecture it is critical, because it tells you that the owner of uTower has in some sense "endorsed" statements made at uDoc. In this example, uDoc (http://www.tour-eiffel.fr/) does not appear to make any statements involving uTower -- at least it didn't when I tried it in my browser -- so nothing useful is learned about what uTower is intended to denote. But if uDoc had instead served a document containing RDF assertions about the resource that uTower denotes, then in Semantic Web architecture this "endorsement" crucially tells you that the owner of uTower has delegated authority to the owner of uDoc to perform a speech act -- in this case the act is the publication of such RDF assertions via a 303-redirect from uTower -- that *creates* the (indirect) association between URI uTower and the Eiffel Tower. Again, as explained in comment #1, this association is necessarily indirect: it is a two-step mapping, first (unambiguously) from the URI to a set of assertions, and second (ambiguously) from those assertions to the actual Eiffel Tower.
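The chain of authority from uTower to uDoc can be sketched roughly in Python as follows. This is a simulation under stated assumptions, not a real HTTP client: the two dictionaries stand in for the Web, the triples in the second case and the ex: terms are made up for illustration, and the function name follow_your_nose is my own.

```python
# Simulated Web: URI -> (status code, payload). For a 303 the payload is the
# redirect target; for a 200 it is a list of (subject, predicate, object)
# triples found in the served document. All triples below are hypothetical.

U_TOWER = "http://www.example.singandich.org/EiffelTower"
U_DOC = "http://www.tour-eiffel.fr/"

# Case A: uDoc serves a page that makes no statements involving uTower.
WEB_AS_OBSERVED = {
    U_TOWER: (303, U_DOC),
    U_DOC: (200, []),
}

# Case B (hypothetical): uDoc serves RDF assertions about what uTower denotes.
WEB_WITH_RDF = {
    U_TOWER: (303, U_DOC),
    U_DOC: (200, [
        (U_TOWER, "rdf:type", "ex:Tower"),
        (U_TOWER, "ex:locatedIn", "ex:Paris"),
        (U_DOC, "rdf:type", "ex:WebPage"),
    ]),
}

def follow_your_nose(uri, web, max_hops=5):
    """Follow 303 redirects from uri and return (final_uri, endorsed), where
    endorsed is the list of served triples whose subject is the original uri."""
    current = uri
    for _ in range(max_hops):
        status, payload = web.get(current, (404, None))
        if status == 303:
            current = payload          # the owner of uri points us onward
        elif status == 200:
            return current, [t for t in payload if t[0] == uri]
        else:
            break                      # any other status: nothing learned
    return current, []

print(follow_your_nose(U_TOWER, WEB_AS_OBSERVED))
print(follow_your_nose(U_TOWER, WEB_WITH_RDF))
```

In case A the redirect is followed but nothing is learned about uTower; in case B the same redirect yields assertions that the owner of uTower has endorsed. The 303 itself is identical in both cases, which is the point: it carries the endorsement, not the description.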

13. The "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section says: "In practice, web architecture does not determine what any names, including URIs, refer to. It only determines what they access. The relationship of reference is determined by the users of the URI."

That's correct for classic (document-oriented) Web architecture, but again only half true for Semantic Web architecture. True, step 2 in determining the referent of a name (or URI) is determined by users -- not by Semantic Web architecture. But step 1 can and *must* be specified by Semantic Web architecture, and as I've pointed out in
http://lists.w3.org/Archives/Public/www-tag/2008Mar/0084.html
for the Semantic Web to be most successful, step 1 must *not* be determined by users; it must be determined by the URI owner.

14. The "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section says: "Pragmatically, there are problems with the TAG's suggested redirection. It uses a distinction in how a text is delivered (an HTTP code) to disambiguate the accessible Web page itself; a category mistake analogous to requiring the postman dance a jig when delivering an official letter."

As I explained in comment #12, it isn't the 303 redirect itself that disambiguates, it is the information served via that 303 redirect that disambiguates, assuming its owner chooses to serve disambiguating information.

Furthermore this "contrived" use of 303 is *not* a category mistake. A contrived act like dancing a jig is *exactly* what a performative speech act or baptism is all about: some distinguishable act that, by social convention, is given special significance beyond its base semantics. In Semantic Web architecture this special significance is the *creation* of the association between a URI and (indirectly) a resource.

15. The "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section says: "[This use of 303] produces harmful effects by misusing HTTP codes for an alien purpose".

I don't think this is a "misuse", since the Semantic Web use of 303 conforms 100% to Web architecture and the HTTP protocol. Rather, it is *layering* Semantic Web architecture on top of classic (document-oriented) Web architecture.

16. The "IF AMBIGUITY IS INEVITABLE, LET'S MAKE LEMONADE" section says: "using hash URIs has the exact same problem as 303 redirection, since it doesn't normatively define any sort of relationship between the two URIs".

This seems to reflect the same misunderstanding that I explained in comment #12 above regarding 303 redirects.

17. The "DISTINGUISHING BETWEEN REFERENCE AND ACCESS ON THE WEB" section suggests: "One could state that URIs only refer to accessible things just when the accessible thing is actually assigned that name; and assigning a name is done only by an explicit naming convention".

I think that would be a big mistake, because it would mean that there would be no authoritative URI declarations for the billions of existing Web pages that don't bother to explicitly declare their own URIs. This would mean that authors of SWeb statements either would not be able to refer to those pages by their URIs or they would risk causing URI collision, by potentially assuming differing definitions, as described in
http://lists.w3.org/Archives/Public/www-tag/2008Mar/0084.html
It seems natural to me that successful access (an HTTP 200 response) should be treated as an implicit declaration of the URI.

18. The "DISTINGUISHING BETWEEN REFERENCE AND ACCESS ON THE WEB" section proposes a predicate ex:describedBy (and its inverse ex:refersTo), but ex:describedBy sounds very much like rdfs:isDefinedBy. How would it differ? Why couldn't rdfs:isDefinedBy serve the intended purpose of ex:describedBy?

Also, consider the N3 statement suggested in the paper:

<http://www.example.tourism.org/EiffelTower#>
ex:describedBy <http://www.tour-eiffel.fr/> .

This statement says nothing whatsoever about the URI http://www.example.tourism.org/EiffelTower# or its intended usage. It cannot. It is a statement about the resource *denoted* by that URI. To make this point more evident, let's assume that <http://fribjam.example#tweedledee> happens to be owl:sameAs <http://www.example.tourism.org/EiffelTower#> and consider the following statement:

<http://fribjam.example#tweedledee>
ex:describedBy <http://www.tour-eiffel.fr/> .

Clearly this second statement says nothing about the URI http://www.example.tourism.org/EiffelTower#, yet it has the *exact* same semantics as the first statement!

The problem here is that notions of reference and access have to do with the relationship between a name and its usage, or the thing it denotes. And it is not possible to talk about that relationship without talking about the name itself. This is why one of the three components of a URI declaration is the URI itself, as discussed here:
http://dbooth.org/2007/uri-decl/#precise-def-uri-decl
and why the decl:hasDeclaration predicate has a domain of xsd:anyURI in
http://esw.w3.org/topic/AwwswDboothsRules
(see line 229 of rules.n3), and why the range of the dbooth:declares predicate is xsd:anyURI in
http://dbooth.org/2007/uri-decl/#declares
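
The point about the two N3 statements above can be sketched roughly in Python. This is a toy model under my own assumptions: the DENOTES table, the two helper functions, and the use of plain Python objects as stand-ins for referents are hypothetical, and the declaration triple is only loosely modeled on decl:hasDeclaration. The idea is that an ordinary statement is interpreted by replacing each URI with what it denotes, so two owl:sameAs URIs yield the same statement, whereas a statement about the name must carry the URI itself as a literal.

```python
# Toy denotation model. The URIs are the ones used in the N3 examples; the
# objects standing in for their referents are purely illustrative.

EIFFEL = "http://www.example.tourism.org/EiffelTower#"
TWEEDLEDEE = "http://fribjam.example#tweedledee"
DOC = "http://www.tour-eiffel.fr/"

thing = object()   # whatever both URIs denote (assumed owl:sameAs)
page = object()    # whatever the document URI denotes
DENOTES = {EIFFEL: thing, TWEEDLEDEE: thing, DOC: page}

def meaning(subject_uri, predicate, object_uri):
    """Interpret an ordinary triple: each URI is replaced by its referent."""
    return (DENOTES[subject_uri], predicate, DENOTES[object_uri])

def declaration_meaning(uri_literal, predicate, object_uri):
    """Interpret a triple whose subject is the URI itself, kept as a literal."""
    return (uri_literal, predicate, DENOTES[object_uri])

s1 = meaning(EIFFEL, "ex:describedBy", DOC)
s2 = meaning(TWEEDLEDEE, "ex:describedBy", DOC)
print(s1 == s2)    # the two statements are indistinguishable

d1 = declaration_meaning(EIFFEL, "decl:hasDeclaration", DOC)
d2 = declaration_meaning(TWEEDLEDEE, "decl:hasDeclaration", DOC)
print(d1 == d2)    # statements about the names remain distinct
```

Once the URIs have been replaced by their referents, nothing in s1 or s2 records which name was used, so neither can say anything about the name. Only the second form, with the URI kept as a literal, can.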

Best regards,

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.

Received on Saturday, 29 March 2008 03:26:07 UTC