- From: John Black <JohnBlack@kashori.com>
- Date: Tue, 12 Jun 2007 16:53:05 -0400
- To: "Tim Berners-Lee" <timbl@w3.org>
- Cc: "M.David Peterson" <m.david@xmlhacker.com>, "r.j.koppes" <rikkert@rikkertkoppes.com>, "Yuzhong Qu" <yzqu@seu.edu.cn>, "Sandro Hawke" <sandro@w3.org>, <semantic-web@w3.org>, <swick@w3.org>, <phayes@ihmc.us>
Tim Berners-Lee wrote
> On 2007-06-11, at 13:53, John Black wrote:
>> Tim Berners-Lee wrote
>>> On 2007-06-09, at 21:22, M. David Peterson wrote:
>>>> On Sat, 09 Jun 2007 07:13:52 -0600, Tim Berners-Lee <timbl@w3.org>
>>>> wrote:
>>>>
>>>>> No. It cannot identify both a document and a person.
>>>>
>>>> Tim: With all due respect... WTF?
>>>
>>>
>>> I am using 'identify' in the strict sense of 'denote'.
>>> The semantic web is like a logic language in which URIs are symbols.
>>
>> Do you believe that by claiming to use the strict, logical sense of the
>> word 'denote' you thereby cause or require such denotations to be
>> absolute and unambiguous? Where do you think denotations (or
>> identifications) come from?
>
> The architecture is that each URI is owned. With HTTP URIs, this
> happens through the domain name system and often delegation within a
> domain. Unlike a word, a URI has an owner. The owner attempts to make
> enough information available that the URI can be used by others without
> ambiguity in practical situations.
>

But what about dbpedia.org? Who owns those URI? And that is one of the
most exciting sites around. If "owned" at all, it is owned by a
community that cooperatively decides the denotations of the URI.

> For example, W3C owns http://www.w3.org/People/Berners-Lee/card#i and has
> delegated to me the right to say what that URI stands for. To use it for
> something else is an error.
>
>> In my opinion to denote (or to identify) is a verb, something that is
>> done by the users of a symbol. After all, symbols (URI) are not agents,
>> they don't wake up and choose to denote this or that.
>
> They have owners which create them for a specific purpose.
>
>> Nor do I think denotation is an attribute or property of a symbol,
>> somehow built in or attached when the symbol is first conceived. It is
>> more like a dance. I use a symbol to denote something expecting you to
>> interpret it to denote the same thing.
>> And this coordination, this synchrony of interpretation by both
>> sender and receiver, is not always easy. It requires real effort to
>> sustain it. The minter of a URI cannot make it happen by declaration,
>> nor can a research group or a standards body just decree it so.
>
> In many cases, the URI is defined by connection to already well-defined
> sets of things. In other cases, such as the terms in the OWL ontology,
> there was a huge amount of effort and discussion involved, and the
> current term is supported by a lot of ongoing tutorials and so on. No
> one said it was easy. But it is a different architecture from the dance
> associated with natural language words.
>
> It is different by design. The semantic web is an engineered system, not
> an observation of nature.
>
>> The reason this matters is that since it requires this effort to create
>> a denotation/identification in the first place, it is far more sensible,
>> to me at least, to expect that the final disambiguation of a symbol be
>> accomplished in the same way, by coordinated effort of the parties using
>> the symbol, not by declaration of the W3C specifications that all URIs
>> be absolutely unambiguous.
>> This seems to me to be, as my grandfather used to say, a vain task.
>
> Your grandfather would perhaps have suggested that an attempt to define
> the meaning of common words, as the Académie Française is set up to do
> were a 'vain task'. Many would agree. But given that his water came to
> him through pipes connected, possibly, by half-inch British Standard
> pipe-thread connections, and he rode on rails set a certain distance
> apart by some committee, and his TV came for better or worse in 525 or
> 625 lines as decided by other committees, he may have respected that the
> creation of standards is a very valuable function, and an essential to
> progress.
>
> When people meet to define W3C specifications they are not doing it out
> of vanity.
> They are performing coordinated effort of the parties who
> would like to be able to use the symbol. They are, in general, users and
> representatives of users of the symbol. They come together to allow
> those who follow them to use it. They often work long hours, receiving
> inadequate recognition for either products shipped or papers published,
> the conventional metrics of performance, so I would not call it vanity.
>
> Note also that W3C (IETF, etc) specs have achieved a lot, made a lot of
> interoperable systems, and formed with each layer a foundation for
> building new layers. So I would not say that the work has been in vain
> either.
>

Of course. I agree these organizations have achieved a lot, the people
in them have worked hard, for little compensation, and specifications
are good and necessary. What I meant to express was my belief that it is
virtually impossible for anyone to prevent ambiguity in the URI created
by the wide open public once these specifications have been released.
And furthermore, that the real problem is that of creating a denotation
for a URI at all. Once that problem is solved, I think such issues as
http-range-14, 303 redirects, etc. are unnecessary. As such, I think
they are a distraction.

Here are a few examples of the kind of ambiguity that I am referring to.
I assume for this purpose that it is in fact possible to create a
denotation of a URI by publishing information about it at the location
that will be returned through HTTP.

A scenario: I make a statement at one point in time using your URI while
the "information" about the URI says one thing. For simplicity, let's
say I copy and cache this denoting/identifying information with my
statements using it. Later you change the information at that URI and I
make another statement using the same URI. If the denotation of a URI is
given by the information retrievable from that URI, then it has become
ambiguous.
It denotes one thing in statements made before the change, and another
thing in statements made after the change. This would be ambiguity due
to denotation-drift over time.

Another scenario: A scientist discovers a new molecule. He names it with
a URI that he owns. It is denoted by a descriptive account of some of
the measurable properties that the scientist has discovered. Now another
scientist becomes interested in it and discovers what he believes are
additional properties. He wants to publish his results about the
molecule. But the first scientist owns the URI, which has now become
commonly used in the field, and doesn't want to add the new properties.
Perhaps he thinks he has discovered properties which are incompatible
and publishes those. So the second scientist publishes additional data
about the molecule on his own web site using the original URI that
everyone is using. Factions develop around each differing denotation. It
is now ambiguous because the factions cannot agree on its denotation.

Another scenario: If the denotation of a URI is given by the information
that can be referenced when it is used in HTTP, then the URIs found
there must be further denoted by the statements that can be referenced
by those URI and so on. Now this process would likely be very
computationally expensive in many cases and would likely be cut short in
practical cases. If clients differed in the depth to which they computed
the closure of a URI it would create ambiguity. One client has a
reasoner that computes the ontological closure of a URI to level 100 and
another client has one that computes the closure to level 200. Ambiguity
arises here because the denotation/identification of one client's view
of the URI is more detailed than the other.

Another scenario: If a URI has embedded in it one or more natural
language words, then the denotations of those words may affect the
overall denotation of the URI. In the worst case, those natural language
words may be ambiguous themselves.
To prevent this you could forbid the use of natural language words in
URI, but this would remove much of the actual semantics that exists on
the semantic web. For as far as I can tell, we are so far mostly riding
on the back of natural language to supply the denotations of URI. This
is ambiguity due to the symbolic theft of natural language terms used in
URI. See my post,
http://kashori.com/2006/11/semantic-web-and-symbolic-theft.html
for more on this idea.

> Tim
Received on Tuesday, 12 June 2007 20:53:49 UTC