Re: URI Opacity Principle (was: Re: use of fragments as names is irresponsible)

Hi Noah.

FWIW - my working definition of URI opacity is:

RFC2396 specifies components of a URI. It also describes an algorithm for
combining relative URIs with base URIs, using information in the path
component. Likewise, the definition of various URI schemes and URI
fragments may specify processing models and additional syntax constraints
for some of those components (especially the authority and fragment
components).
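
As a rough illustration of the generic part only, here is a minimal sketch using Python's standard urllib.parse module. The example URI, its query and fragment, and the relative references are made up for illustration; nothing here relies on anything beyond the generic syntax and the relative-resolution algorithm.

```python
from urllib.parse import urlsplit, urljoin

# Hypothetical example URI; only the generic components are inspected.
uri = "http://example.org/root/sub1/sub2/mydoc.html?x=1#sec2"
parts = urlsplit(uri)
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

# Relative resolution relies on the hierarchy in the base URI's path.
base = "http://example.org/root/sub1/sub2/mydoc.html"
resolved_sibling = urljoin(base, "other.html")
resolved_parent = urljoin(base, "../index.html")

print(components)
print(resolved_sibling)
print(resolved_parent)
```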

Beyond this, the party that mints the URI (i.e., the authority, if the URI
has an authority component) may assign additional syntax constraints and
even semantics to components of the URI. They may or may not document
these elsewhere; for example, an HTML form may document how to
construct the query component of a particular URI.
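
The form case can be sketched as follows, again in Python. The action URI, the field names, and the helper build_form_uri are all hypothetical; the point is only that the client builds the query component because the form, published by the minting party, told it how.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

def build_form_uri(action, fields):
    """Build a URI from a form's action URI and its submitted fields.

    The query structure is used only because the form (supplied by the
    party that minted the URI) documents it.
    """
    scheme, authority, path, _query, _fragment = urlsplit(action)
    return urlunsplit((scheme, authority, path, urlencode(fields), ""))

# Hypothetical form: action URI and field names are assumptions.
form_uri = build_form_uri(
    "http://example.org/search",
    {"q": "uri opacity", "lang": "en"},
)
print(form_uri)
```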

To preserve the authority's ability to do this, specifications and other
third parties MUST NOT specify or require any particular additional
structure or syntactic constraint, or add any semantics, to URIs, UNLESS
they are specifying part of the generic dereferencing process.

Additionally, Web specifications and other third parties (e.g., proxies)
SHOULD NOT infer additional metadata or other information from the
structure or syntax of a URI without explicit information to that effect
from the party that minted the URI.
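
One way to read that last rule, as a sketch only: a third party takes the media type from what the origin explicitly declares and never from a suffix in the path. The function name media_type and the header dictionaries are assumptions made for illustration.

```python
def media_type(uri, response_headers):
    """Return the declared media type, or None if none was declared.

    The uri argument is deliberately ignored: a trailing ".html" is not
    treated as evidence about the representation.
    """
    value = response_headers.get("Content-Type")
    if value is None:
        return None
    # Drop parameters such as charset and normalise case.
    return value.split(";", 1)[0].strip().lower()

# Hypothetical responses for the same URI.
uri = "http://example.org/root/sub1/sub2/mydoc.html"
declared = media_type(uri, {"Content-Type": "image/png"})
undeclared = media_type(uri, {})
print(declared, undeclared)
```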



----- Original Message -----
From: <noah_mendelsohn@us.ibm.com>
To: "Sandro Hawke" <sandro@w3.org>
Cc: "Roy T. Fielding" <fielding@apache.org>; <www-tag@w3.org>
Sent: Tuesday, January 14, 2003 7:37 PM
Subject: URI Opacity Principle (was: Re: use of fragments as names is irresponsible)


>
> Roy Fielding writes:
>
> >> Somewhere along the line the W3C got hooked on the
> >> notion that URIs are opaque and hierarchy is
> >> meaningless.  That is bogus, as evidenced
> >> by every decent information site on the web today.
>
> As I think I've suggested once or twice, the TAG would do the community a
> service IMO if it would clarify the degree to which URI's are indeed to be
> viewed as opaque.   When may their substructure be either inspected or
> built up incrementally, and when are they to be treated as "black boxes"?
>
> Tim BL provides one exposition of the opacity principle at [1].  In
> general, I find that many correspondents on this list and others both
> oversimplify and confuse the issues, and one can make the case that the
> principle has not in fact been stated sufficiently clearly (or, per Roy's
> note, perhaps it is at times a false goal).  Surely it is confusing to
> hear on the one hand that URIs are opaque, while on the other RFC 2396 [2]
> goes to some length to provide hierarchical substructure as a special
> case.  There is surely a sense in which:
>
>         uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
>
> is more inherently opaque (assuming the "-"s are truly just for
> readability) than:
>
>         http://example.org/root/sub1/sub2/mydoc.html
>
> If URI's were really intended to be opaque, why not make most every URI a
> uuid?
>
> My impression is that the server implementing an HTTP resource should
> indeed have the ability to process based on the substructure of the URI.
> Surely it is appropriate for the server to map the HTTP example above to
> file system sub-directories should it choose to do so? (Though of course,
> that's not required or visible from the outside.)  Is it OK for a client
> to help you build up the URI incrementally, one piece at a time?  IE does.
> Is it OK to build a query string in a URI from a form?  Pretty surely, and
> Tim says so.  Is it OK for a client history list to gather all the URI's
> that seem to be from example.org and group them?  Common practice suggests
> "yes".  Can proxies cache based on the substructure of the URI?  I would
> think that's desirable (but don't try it with the uuid: scheme).  Is it OK
> to start guessing MIME types of representations from that .html at the
> end?  True believers seem to say no (and I guess I'm one of them).  Is it
> OK to assume that the .html URI above references a Web page as opposed to
> some human being associated with a web page as opposed to some human being
> who has nothing to do with a web page?  Seems to be the fodder for lots of
> rambling on this list.  When is it appropriate for a client or other agent
> to inspect the scheme as a means of determining a retrieval strategy?  I'm
> still somewhat confused as to what folks such as Roy think on this one,
> since we often hear that HTTP: need not identify resources to be retrieved
> with HTTP (or maybe I've misunderstood).
>
> Anyway, I don't claim to have the answers, but it's very much the sort of
> question I would expect the architecture document to help settle.  Just
> saying "URI's should be opaque" seems too simplistic, and thus confusing.
> I'm glad Roy's note has brought it up.  Thank you!
>
> Noah
>
> [1] http://www.w3.org/DesignIssues/Axioms.html#opaque
> [2] http://www.ietf.org/rfc/rfc2396.txt
>
> ------------------------------------------------------------------
> Noah Mendelsohn                              Voice: 1-617-693-4036
> IBM Corporation                                Fax: 1-617-693-8676
> One Rogers Street
> Cambridge, MA 02142
> ------------------------------------------------------------------
>
>

Received on Wednesday, 15 January 2003 18:05:48 UTC