- From: Graham Klyne <GK@NineByNine.org>
- Date: Tue, 30 Jul 2002 12:01:00 +0100
- To: www-tag@w3.org
These are some observations; I'm currently open regarding the conclusion. Roughly, my view is that there's no overwhelming technical reason to force one kind of usage over another, but there may be practical reasons to prefer a particular convention.

1. On specialization of URI schemes

There is a widely held view that, in general, a URI can identify anything. We can't work out what a URI identifies by peeking at its scheme identifier.

I don't think this position contradicts the idea that certain *specific* URI schemes are more limited in their scope of identification. For example, the tel: URI scheme [http://www.ietf.org/rfc/rfc2806.txt] is pretty clearly intended to be used for identifying telephone terminals. One can argue that it's possible to use a tel: URI to identify, say, a Unicorn called Ulysses, but I can't see that this is really helpful.

A URI scheme defines, among other things, a naming authority structure - rules that determine who gets to allocate names and any constraints upon such allocations. It seems quite reasonable to me to say that a given scheme X has name allocation rules that have the effect of constraining the kinds of things that can be named using X. For example, the tel: scheme identifiers are clearly bound to numbers serviced by a telephone network; the 'global' form of telephone number defers to the E.164 international standard telephone numbering plan for its naming authority.

So, I submit, the general principle of identification not being constrained by URI scheme doesn't exclude the possibility that certain specific URI schemes restrict what is named. And conversely, the existence of URI schemes with identification constraints doesn't weaken the principle that a scheme may, in general, be used to identify anything. Can there be any reasonable constraints on what uuid: or urn: may identify?

2. What can we say about http: URIs?

The naming authority is based on network retrieval.
In particular "The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host..." [RFC2616].

[[
3.2.2 http URL

   The "http" scheme is used to locate network resources via the HTTP protocol. This section defines the scheme-specific syntax and semantics for http URLs.

   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

   If the port is empty or not given, port 80 is assumed. The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path (section 5.1.2).
   ...
]]

Similar words are in RFC2068, which is cited in the IANA URI scheme registry as the defining document for the http: scheme.

These words seem to support TimBL's view (which under the circumstances isn't an overwhelming argument in itself, but is suggestive).

3. What does a representation represent?

Much of this debate seems to depend on whether one can regard (say) a JPEG picture of a car as being a representation of the car, or a representation of a document describing a car. It seems to me that either view is sustainable, and maybe the two can even coexist (which is probably just as well, because I expect folks will be doing both for a while).

For example, on my own web site I currently use URIs of the form http://id.ninebynine.org/ to identify abstract concepts related to my own experimental developments. I place documents at those URIs that are intended to explain (more or less) what I intend the identifiers to denote -- and at this time, I mean the URIs to denote abstract concepts, not the documents.

So how can I talk about the documents that describe the identifiers? In my case, that's easy: as it happens, the URIs http://www.ninebynine.org/ident/... retrieve exactly the same set of documents.
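As a rough illustration of the RFC 2616 http_URL rule quoted above, and of the two URI forms just mentioned, here is a minimal Python sketch that splits an http: URI into host, port (defaulting to 80) and abs_path. The function name `split_http_url` and the path segment `example` are hypothetical, invented for this sketch; the real paths under the two hosts are not given here. It leans on the standard library's `urllib.parse.urlsplit` rather than implementing the RFC grammar itself.

```python
from urllib.parse import urlsplit


def split_http_url(uri):
    """Split an http: URI into (host, port, abs_path, query).

    Rough reading of RFC 2616 section 3.2.2: an empty or missing port
    means 80. Treating a missing abs_path as "/" is an assumption made
    here, matching how the Request-URI is formed.
    """
    parts = urlsplit(uri)
    if parts.scheme.lower() != "http":
        raise ValueError("not an http: URI: %r" % uri)
    if not parts.hostname:
        raise ValueError("http: URI has no host: %r" % uri)
    port = parts.port if parts.port is not None else 80
    abs_path = parts.path or "/"
    query = parts.query or None
    return parts.hostname, port, abs_path, query


# Hypothetical path "example": the two forms name different servers and
# paths, so they are distinct identifiers even if the same document is
# served for both.
concept_form = split_http_url("http://id.ninebynine.org/example")
document_form = split_http_url("http://www.ninebynine.org/ident/example")
print(concept_form)
print(document_form)
```

The split only tells us where a request would be sent. Nothing in the syntax says whether either URI denotes a concept or a document; that remains a matter of convention.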
So a possibility here is that the first form of URI directly references the abstractions described by the web pages, and the second form directly identifies the documents themselves. I'm not claiming this is a Good Idea, just a possibility. And I suspect it's a possibility we have to live with.

4. Questions

The questions I then ask myself are:

Would it be helpful for the community at large to have a preferred convention for the interpretation of what http and similar URIs identify?
- I think that the answer is probably "yes" -- if only so we don't end up repeating this debate over the next decade or so.

What approach is most helpful?
- I'm pretty agnostic, but I am leaning toward the idea that an HTTP URI directly identifies a document, rather than what the document describes. I think there are other ways to capture the indirect reference (e.g. fragment IDs; one proposal is at http://www.ninebynine.org/wip/RDF-basics/2002-07-29/Overview.htm#xtocid103660).

Does this need to be set in stone for the web to survive and grow?
- I hope not; I don't think so. I think there's already a diversity of usage and we somehow need to accommodate that.

#g

-------------------
Graham Klyne
<GK@NineByNine.org>
Received on Tuesday, 30 July 2002 07:30:22 UTC