Re: Determining what a URI identifies

This reminds me: what I have missed in the arch. document is an exposition 
and detailed clarification of URI opacity principles.  As Tim points out in 
his note, URIs are not in general as opaque as, say, UUIDs.  The very 
fact that RFC 2396 calls out not only schemes but hierarchies suggests that 
hierarchy is intended as more than a means of assigning the names...one 
assumes that at certain points in the chain the hierarchical name may, in 
fact, be navigated based on its structure (e.g. to find the resource during 
retrieval).  On the other hand, the general principle of opacity is surely 
important.

So, a suggestion:  add to the arch. document a clear exposition of what 
really is implied by the opacity principle, including more suggestive 
guidelines as to when it might or might not be appropriate to either use 
structural means to construct a URI (e.g. build it up a piece at a time as 
a user constructs a reference in a hierarchical space) or to base 
processing on its internal structure.  Thanks!
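
To make the "build it up a piece at a time" case concrete, here is a
minimal sketch in Python.  It assumes a hierarchical (http-style) URI and
leans on relative reference resolution rather than string surgery.  The
helper name descend() and the example.org URIs are my own inventions for
illustration, not anything taken from the arch. document or RFC 2396.

```python
from urllib.parse import urljoin, quote


def descend(base: str, segment: str) -> str:
    """Append one path segment to a hierarchical URI.

    Hypothetical helper: it relies only on the generic hierarchical
    syntax (relative reference resolution), not on any scheme-specific
    knowledge.  It assumes `segment` is a plain name, not "." or "..".
    """
    # Treat the base as a "directory" so the new segment is appended
    # instead of replacing the last segment.
    if not base.endswith("/"):
        base += "/"
    # Percent-encode the segment so "/", "?", "#" and ":" inside it
    # cannot change the structure of the resulting URI.
    return urljoin(base, quote(segment, safe=""))


uri = "http://example.org/docs"
for step in ["2002", "tag notes", "opacity"]:
    uri = descend(uri, step)

print(uri)  # http://example.org/docs/2002/tag%20notes/opacity
```

Note that this is the party constructing the reference using structure
the generic syntax gives it; it says nothing about whether a third party
may later pick the result apart again.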

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------







Tim Berners-Lee <timbl@w3.org>
Sent by: www-tag-request@w3.org
12/11/2002 10:53 PM

 
        To:     Paul Prescod <paul@prescod.net>
        cc:     www-tag@w3.org, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: Determining what a URI identifies




On Monday, Nov 4, 2002, at 15:21 US/Eastern, Paul Prescod wrote:

> Tim Berners-Lee wrote:
>> ...
>> So when RDF talks about  <#joe> as having
>> a contact:mailbox of  <mailto:joe@example.com>
>> an RDF processor which is aware of the URI spec
>> and the spec of mailto:  knows that the object
>> is an email mailbox according to the email specs.
>
> You want the processor to infer the type of an object from the syntax 
> of its URI? What happened to opacity of URIs?
>

Good point!

It's a great general rule not to look inside a URI when you can
do what you need to without doing so.

However, a URI gradually reveals more and more information
as more and more specifications are applied, one after the other,
in the chain of normative reference which starts with the URI spec.

The principle of opacity suggests that an application should not
put constraints (or interpretations) on the stuff in a URI, or it
will limit the other specs the URI can be used with and thus not
leverage the whole power of the web.

It's important, for example, not to assume that anything whose
URI ends "html" is an HTML document - it may have an md5
URI - or some as yet uninvented URI scheme, or the server may
use the last few characters for something else.
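
As a rough sketch of that point in Python: decide whether something is
HTML from the media type the server declares, never from the tail of
the URI.  The function name is_html() and the example URIs and header
values are assumptions made up for illustration; no network access is
involved, the Content-Type value is simply passed in.

```python
from email.message import Message


def is_html(content_type_header: str) -> bool:
    """Return True if a Content-Type header value declares text/html.

    Hypothetical helper: the URI is deliberately not a parameter,
    because its spelling is not evidence of the representation type.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and
    # lower-cases the type/subtype pair.
    return msg.get_content_type() == "text/html"


# Illustrative (made-up) responses: URI paired with the declared type.
responses = {
    "http://example.org/report.html": "application/pdf",
    "http://example.org/view?id=7": "text/html; charset=utf-8",
}

for uri, declared in responses.items():
    print(uri, is_html(declared))
```

The first URI ends in "html" but is not HTML; the second has no such
ending but is.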

So the principle of opacity does not say "the characters in a URI
are quite arbitrary, put anything there."  You can't use
a mailto: URI to identify anything other than an RFC822 mailbox
because the specs say how mailboxes are identified by these
address@domain.things and how these can be used as
URIs with the "mailto:" prefix.
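
A small Python sketch of what following the chain of specs might look
like: the generic URI syntax yields only the scheme, and only a
processor that also knows the mailto: spec goes on to read the rest as a
mailbox.  The function name mailbox_of() is hypothetical, and the sketch
ignores mailto: features such as multiple recipients and header fields.

```python
from typing import Optional
from urllib.parse import urlsplit, unquote


def mailbox_of(uri: str) -> Optional[str]:
    """Return the mailbox a mailto: URI identifies, else None.

    Hypothetical helper.  Step 1 uses only the generic URI syntax to
    find the scheme.  Step 2 applies mailto-specific knowledge, and
    only when the scheme says that spec is the one in force.
    """
    parts = urlsplit(uri)
    if parts.scheme != "mailto":
        # No spec in the chain licenses reading a mailbox out of this
        # URI, so treat the rest of it as opaque.
        return None
    return unquote(parts.path)


print(mailbox_of("mailto:joe@example.com"))              # joe@example.com
print(mailbox_of("http://example.com/joe@example.com"))  # None
```

The second URI contains the same characters, but nothing in the http
spec says they name a mailbox, so the processor draws no conclusion.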

Tim

>  Paul Prescod

Received on Thursday, 12 December 2002 14:41:30 UTC