Re: Clarifying what a URL identifies (Four Uses of a URL)

At some risk (since I'll claim to be arguing from
authority, too), I thought I would chime in.

I think the problem is the overly broad use of
'identifier', not the broad use of 'resource'.
My suggestion is to define the 'identifies' function
narrowly and add another function ('indicates') for
semantic web and denotational uses.

Progress was made on 'URI equivalence' by accepting that
there were multiple equivalence relationships, that they
were different, that different applications might
have different needs, and that we would discuss the
different equivalence relationships.

To make progress on URI identification, accept that
there are multiple functions; say that the 'identifies'
function maps a URI to a Resource, and that there
are 'indicates' functions which map from a URI
to a 'concept'.  Since there are many (more than 4)
contexts, write 'indicates(context)' for the function
for a given context, i.e.:

   identifies(URI) -> Resource
   indicates(context) (URI)  -> Concept

Note that the range of 'indicates(context)' is
not a 'Resource', but something broader and even
less well defined.
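
As a rough illustration of the two signatures above, here is a
minimal Python sketch. The Resource and Concept classes, the
context names, and the lookup tables are all invented for the
example; nothing here is meant as a real registry.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Resource:
    """Range of 'identifies': the thing you actually connect to."""
    description: str


@dataclass(frozen=True)
class Concept:
    """Range of 'indicates': broader and less well defined."""
    description: str


# Hypothetical tables, made up purely for illustration.
_IDENTIFIES = {
    "http://www.w3.org": Resource("whatever answers HTTP at www.w3.org"),
}
_INDICATES = {
    ("rdf-assertion", "http://www.w3.org"): Concept("the World Wide Web Consortium"),
    ("inline-in-text", "http://www.w3.org"): Concept("a web page as seen when cited"),
}


def identifies(uri: str) -> Resource:
    """identifies(URI) -> Resource"""
    return _IDENTIFIES[uri]


def indicates(context: str) -> Callable[[str], Concept]:
    """indicates(context) (URI) -> Concept: one function per context."""
    def in_context(uri: str) -> Concept:
        return _INDICATES[(context, uri)]
    return in_context
```

With these toy tables, identifies("http://www.w3.org") always
yields the same Resource, while indicates("rdf-assertion") and
indicates("inline-in-text") yield different Concepts for the
same URI.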

In this formalism, you can say that
web browsers and most of the "working web" use
the 'identifies' function; it controls what happens
when you click on a link (the browser connects
to the resource identified), or chase down a
pointer in LDAP.

However, other applications ('XML namespace names',
'RDF assertions', 'inline in text') use different
'indicates' functions instead. Their range is
not necessarily a 'resource'. In this formalism,
a 'namespace' need not be a 'Resource'.

I don't know how many 'contexts' there are, but
surely there are more than four. Perhaps each
RDF assertion carries its own context, for that
matter.  While it is desirable that
  indicates(context) (URI) == indicates2(context) (identifies(URI))
where 'indicates2(context)' maps a Resource to a Concept,
i.e., that what a URI indicates (for a given context)
depends only on the resource that the URI identifies,
this is not the case in many situations. For the
purpose of "indication", the URI is not opaque.
(I think this is a good definition of what it
means for a URI to be opaque, by the way: that
what the URI indicates in the application context does
not depend on anything about the URI other than the
resource it identifies.)
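
The opacity condition can be checked mechanically over a finite
sample of URIs: 'indicates' factors through 'identifies' exactly
when two URIs that identify the same resource never indicate
different things. The sketch below assumes toy stand-ins for
both functions; the URIs, the "resource-1" label, and the two
example contexts are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable


def is_opaque(
    indicates_in_context: Callable[[str], Hashable],
    identifies: Callable[[str], Hashable],
    uris: Iterable[str],
) -> bool:
    """True if, over the sample, what a URI indicates depends only
    on the resource it identifies (so some indicates2 exists)."""
    seen: Dict[Hashable, Hashable] = {}
    for uri in uris:
        resource = identifies(uri)
        concept = indicates_in_context(uri)
        if resource in seen and seen[resource] != concept:
            return False  # same resource, different indication
        seen[resource] = concept
    return True


# Toy assumption: both spellings identify the same resource.
SAME_RESOURCE = {
    "http://example.org/ns": "resource-1",
    "http://EXAMPLE.org/ns": "resource-1",
}


def identifies(uri: str) -> str:
    return SAME_RESOURCE[uri]


def browser_indicates(uri: str) -> str:
    # A browser-like context: only the identified resource matters.
    return identifies(uri)


def namespace_indicates(uri: str) -> str:
    # A namespace-like context: the URI string itself is what counts.
    return uri
```

Under these assumptions, is_opaque(browser_indicates, identifies,
SAME_RESOURCE) holds, while is_opaque(namespace_indicates,
identifies, SAME_RESOURCE) does not, because the two spellings
are told apart even though they identify one resource.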

The definition of a URI scheme should define the
'identifies' function; it cannot easily define
any 'indicates' functions. For 'http', the 'identifies'
function winds up being "whatever you connect to
by sending HTTP messages to the server designated by
the host:port of the URI, using the path of the URI,
at the time that the 'identifies' function is invoked
by an interpreter."
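
To make the 'http' case concrete, here is a small sketch using
only the Python standard library. The function names
http_target and identifies_http are my own, and treating a GET
response as "whatever you connect to" is a simplifying
assumption, not a definition taken from any specification.

```python
import http.client
from typing import Tuple
from urllib.parse import urlsplit


def http_target(uri: str) -> Tuple[str, int, str]:
    """Split an http URI into (host, port, request target)."""
    parts = urlsplit(uri)
    if parts.scheme != "http" or parts.hostname is None:
        raise ValueError("not an absolute http URI: %r" % uri)
    port = parts.port if parts.port is not None else 80  # default http port
    target = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        target += "?" + parts.query
    return parts.hostname, port, target


def identifies_http(uri: str, timeout: float = 10.0) -> http.client.HTTPResponse:
    """Contact host:port with the URI's path, at the time of the call.

    Calling this twice may reach different things: the answer is
    tied to the moment of invocation.
    """
    host, port, target = http_target(uri)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    conn.request("GET", target)
    return conn.getresponse()
```

For example, http_target("http://www.w3.org") gives
("www.w3.org", 80, "/"); identifies_http then performs the
actual connection, so its result depends on when it is called.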

In this model, 'identifies' is construed narrowly.
But the range of 'indicates' can be quite broad.

The fact that a process might subsequently
link to Dan's car or the state of a light switch
is interesting, but there is no need to try to
promote those things (which may or may not be
Resources in the traditional sense) into resources.

Note also that the domain of the 'indicates' function
is broader than the domain of the 'identifies' function.
'identifies' accepts only absolute URIs, with no fragment
identifiers. 'indicates' also works with URI references
that have fragment identifiers.
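
The difference in domains can be sketched as a simple check
with the standard urllib.parse module. The function names are
hypothetical, and "has a scheme" is used here as a rough
stand-in for "is an absolute URI".

```python
from typing import Tuple
from urllib.parse import urldefrag, urlsplit


def in_identifies_domain(uri_reference: str) -> bool:
    """Accept only absolute URIs that carry no fragment identifier."""
    has_scheme = bool(urlsplit(uri_reference).scheme)
    has_fragment = "#" in uri_reference
    return has_scheme and not has_fragment


def split_for_indicates(uri_reference: str) -> Tuple[str, str]:
    """'indicates' may look at both parts: the URI and its fragment."""
    base, fragment = urldefrag(uri_reference)
    return base, fragment
```

So "http://www.w3.org/" is in the domain of both functions,
while "http://www.w3.org/#top" and "../pics/logo.png" are only
in the domain of 'indicates'.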

RDF uses an 'indicates' function. When I use
http://www.w3.org to talk about the World Wide Web
Consortium or the web server or a web page at
a particular point in time -- in each case, this
is a different context for the 'indicates'
function.

Does this introduction of terminology help separate
out the different uses of URIs?

Larry
-- 
http://larry.masinter.net

Received on Friday, 24 January 2003 01:50:30 UTC