- From: Graham Klyne <GK@NineByNine.org>
- Date: Tue, 30 Jul 2002 12:01:00 +0100
- To: www-tag@w3.org
These are some observations; I'm currently undecided about the
conclusion. Roughly, my view is that there's no overwhelming technical
reason to force one kind of usage over another, but there may be practical
reasons to prefer a particular convention.
1. On specialization of URI schemes
There is a widely held view that, in general, a URI can identify
anything. We can't work out what a URI identifies by peeking at its scheme
identifier. I don't think this position contradicts the idea
that certain *specific* URI schemes are more limited in their scope of
identification. For example, the tel: URI scheme
[http://www.ietf.org/rfc/rfc2806.txt] is pretty clearly intended to be used
for identifying telephone terminals. One can argue that it's possible to
use a tel: URI to identify, say, a unicorn called Ulysses, but I can't see
that this is really helpful.
A URI scheme defines, among other things, a naming authority structure -
rules that determine who gets to allocate names and any constraints upon
such allocations. It seems quite reasonable to me to say that a given
scheme X has name allocation rules that have the effect of constraining the
kinds of things that can be named using X. For example, the tel: scheme
identifiers are clearly bound to numbers serviced by a telephone
network; the 'global' form of telephone number defers to the E.164
international standard telephone numbering plan for its naming authority.
So, I submit, the general principle of identification not being constrained
by URI scheme doesn't preclude certain specific URI schemes from
restricting what is named. And conversely, the existence of URI schemes with
identification constraints doesn't weaken the principle that a scheme may,
in general, be used to identify anything. Can there be any reasonable
constraints on what uuid: or urn: may identify?
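The tel: example above can be made concrete with a minimal sketch, assuming only two rules: the global form starts with "+", and E.164 allows at most 15 digits. The helper name `is_global_tel_uri` and the separator set are illustrative assumptions; this is not the full RFC 2806 grammar.

```python
# Characters treated as visual separators inside a telephone number
# (an assumption for this sketch, not the complete RFC 2806 grammar).
VISUAL_SEPARATORS = "-.()"
# E.164 limits an international number to 15 digits.
E164_MAX_DIGITS = 15


def is_global_tel_uri(uri: str) -> bool:
    """Return True if uri looks like a global-form tel: URI.

    Hypothetical helper: it checks only the scheme, the leading "+",
    and the digit count.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "tel":
        return False
    number = rest.split(";", 1)[0]  # drop any ";parameter" suffix
    if not number.startswith("+"):
        return False
    digits = "".join(ch for ch in number[1:] if ch not in VISUAL_SEPARATORS)
    return digits.isascii() and digits.isdigit() and len(digits) <= E164_MAX_DIGITS
```

The point is only that the scheme's allocation rules narrow what a well-formed name can be; nothing comparable could be written for uuid: or urn: in general.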
2. What can we say about http: URIs?
The naming authority is based on network retrieval.
In particular "The semantics are that the identified resource is located at
the server listening for TCP connections on that port of that host..."
[RFC2616].
[[
3.2.2 http URL

   The "http" scheme is used to locate network resources via the HTTP
   protocol. This section defines the scheme-specific syntax and
   semantics for http URLs.

   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

   If the port is empty or not given, port 80 is assumed. The semantics
   are that the identified resource is located at the server listening
   for TCP connections on that port of that host, and the Request-URI
   for the resource is abs_path (section 5.1.2). ...
]]
Similar words are in RFC2068, which is cited in the IANA URI scheme
registry as the defining document for the http: scheme. These words seem
to support TimBL's view (which under the circumstances isn't an
overwhelming argument in itself, but is suggestive).
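The "located at" semantics in the quoted RFC text can be sketched as a function from an http: URL to the place a client would connect and what it would ask for. This is a minimal sketch, assuming Python's standard `urllib.parse`; the name `locate` is hypothetical, and including the query in the returned Request-URI is a choice made for this sketch.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # "If the port is empty or not given, port 80 is assumed."


def locate(http_url: str) -> tuple[str, int, str]:
    """Return (host, port, request_uri) for an http: URL.

    Hypothetical helper illustrating the RFC 2616 section 3.2.2 semantics.
    """
    parts = urlsplit(http_url)
    if parts.scheme != "http" or parts.hostname is None:
        raise ValueError(f"not an http: URL with a host: {http_url!r}")
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    # An empty abs_path is sent as "/" in the request.
    request_uri = parts.path or "/"
    if parts.query:
        request_uri += "?" + parts.query
    return parts.hostname, port, request_uri
```

For instance, `locate("http://example.org:8080")` gives a host, the explicit port, and "/" as the Request-URI. The function says where to retrieve something, which is exactly the sense in which the naming authority is based on network retrieval.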
3. What does a representation represent?
Much of this debate seems to depend on whether one can regard (say) a JPEG
picture of a car as being a representation of the car, or a representation
of a document describing a car.
It seems to me that either view is sustainable, and maybe the two can even coexist
(which is probably just as well because I expect folks will be doing both
for a while). For example, on my own web site I currently use URIs of the
form http://id.ninebynine.org/ to identify abstract concepts related to my
own experimental developments. I place documents at those URIs that are
intended to explain (more or less) what I intend the identifiers to denote
-- and at this time, I mean the URIs to denote abstract concepts, not the
documents. So how can I talk about the documents that describe the
identifiers? In my case, that's easy: as it happens, the URIs
http://www.ninebynine.org/ident/... retrieve exactly the same set of
documents. So a possibility here is that the first form of URI directly
references the abstractions described by the web pages, and the second form
can directly identify the documents themselves. I'm not claiming this is a
Good Idea, just a possibility. And I suspect it's a possibility we have to
live with.
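The two-prefix arrangement just described can be sketched as follows. This is only an illustration: the function names are hypothetical, and the assumption that a local name under one prefix maps to the same local name under the other is mine, since the text says only that both forms retrieve the same set of documents.

```python
# The two URI forms mentioned in the text.
CONCEPT_PREFIX = "http://id.ninebynine.org/"
DOCUMENT_PREFIX = "http://www.ninebynine.org/ident/"


def denotes(uri: str) -> str:
    """Say what kind of thing a URI is taken to denote under this convention."""
    if uri.startswith(CONCEPT_PREFIX):
        return "abstract concept"
    if uri.startswith(DOCUMENT_PREFIX):
        return "document"
    return "unknown"


def describing_document(concept_uri: str) -> str:
    """Map a concept URI to the URI of the document that describes it.

    Assumes (hypothetically) that the local name is the same under
    both prefixes.
    """
    if not concept_uri.startswith(CONCEPT_PREFIX):
        raise ValueError(f"not a concept URI: {concept_uri!r}")
    return DOCUMENT_PREFIX + concept_uri[len(CONCEPT_PREFIX):]
```

With a made-up local name, `describing_document("http://id.ninebynine.org/some-concept")` yields the corresponding `/ident/` URI, so the concept and the document describing it each have their own name even though both retrieve the same bytes.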
4. Questions
The questions I then ask myself are:
Would it be helpful for the community at large to have a preferred
convention for the interpretation of what http and similar URIs identify?
- I think that the answer is probably "yes" -- if only so we don't end up
repeating this debate over the next decade or so.
What approach is most helpful?
- I'm pretty agnostic, but I am leaning toward the idea that an HTTP URI
directly identifies a document, rather than what the document describes. I
think there are other ways to capture the indirect reference (e.g. fragment
IDs; one proposal is at
http://www.ninebynine.org/wip/RDF-basics/2002-07-29/Overview.htm#xtocid103660).
Does this need to be set in stone for the web to survive and grow?
- I hope not; I don't think so. I think there's already a diversity of
usage and we somehow need to accommodate that.
#g
-------------------
Graham Klyne
<GK@NineByNine.org>
Received on Tuesday, 30 July 2002 07:30:22 UTC