Re: Determining what a URI identifies

Sorry for the length, and perhaps I'm repeating myself, but:

both
1) "The authority over a URI determines which resource it identifies."
and
2) "what a URI identifies is determined principally by *use*."

are flawed models, unworkable in practice. In model (1)
('the authority determines') you make the meaning of identification
dependent on some act ("determining") by some actor ("the authority")
where the nature of the act and the identity of the actor are
not defined. It means that what a URI identifies could change
if the 'authority' changes its mind, and users of the URI would
have no reliable way of determining what the authority might or might
not have declared.

The second model (identification depends on use) is even worse
in this regard, because the nature of what constitutes a use
and how one determines the identification are undefined. If
uses are inconsistent, which use holds? Can a URI which has
a clear meaning lose its meaning if it is suddenly used widely
to mean something else?


To repeat: it is hopeless to address the question of
"what a URI identifies" as if it were a question of discovery,
something that we should set out to find, like the melting point
of some metal. We don't need the answer to the question
"what do people mean when they utter a URI in natural language"
(perhaps an interesting question, but irrelevant to the purpose).
What we need to establish is the answer to
"What do W3C specifications mean when they designate the use
 of a URI for an identifier".

This is a question that cannot be answered by asking everyone
you know what they think a URI means. This question can only be
settled by some architecture group coming to a good design.
Design (1) and Design (2) are bad designs.

A simpler design is:

What a URI identifies is defined by the scheme. The definition
of a URI scheme must include a clear definition of what strings
that start with that scheme identify. URIs that start with
"http" identify resources that are accessed via the HTTP protocol,
using the simple meaning that
   http://host.example.org/path
identifies the network resource that one connects to by speaking
the HTTP protocol to host "host.example.org" with path component
"/path".
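
As a rough illustration of that reading (a sketch only, not part of
any specification), the Python below uses the standard library's
urllib.parse.urlsplit to pull out the host and path that, under this
design, say what an "http" URI identifies. The function name
identify_http and the shape of the dictionary it returns are made up
for this example.

```python
from urllib.parse import urlsplit


def identify_http(uri: str) -> dict:
    """Hypothetical helper: describe the network resource an "http" URI
    identifies under the scheme-defined reading, i.e. the host one
    speaks HTTP to and the path component one asks for."""
    parts = urlsplit(uri)
    # Only the "http" scheme is covered here; in this design every other
    # scheme would supply its own definition of identification.
    if parts.scheme.lower() != "http":
        raise ValueError("not an http URI: " + uri)
    if not parts.hostname:
        raise ValueError("http URI has no host: " + uri)
    return {
        "protocol": "HTTP",
        "host": parts.hostname,
        "path": parts.path,
    }


resource = identify_http("http://host.example.org/path")
print(resource["host"], resource["path"])
```

Nothing in the function consults an authority's declarations or
surveys how the URI is used; the answer comes from the string and
the scheme definition alone.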

That's simple. You can then go on to say -- given this
definition of identification -- that URIs are used in some
contexts to indicate something other than the resource that the
URI identifies. "Indication" is a different act than "Identification".
TimBL can use "http://www.w3.org/Consortium" to indicate the
W3C organization if he likes, and you can make up languages
in which that is true. But it doesn't change the nature of
what is actually identified; what's identified is the network
resource; and use varies.

What you learn by doing Google searches for links is how people
use URIs in HTML to indicate subjects; that has nothing to do
with what the URI identifies.  The only thing that 'use' determines
is, well, 'use'. You can look at how people use http://www.w3.org
to see "how is http://www.w3.org used", but not what it means.
For that, you actually need to establish a clear and simple
design.

Larry
-- 
http://larry.masinter.net

Received on Monday, 4 November 2002 01:03:32 UTC