Re: RE : URI: Name or Network Location? from Patrick Stickler on 2004-01-26 (www-rdf-interest@w3.org from January 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 26 Jan 2004 15:45:52 +0200
To: <info@oilit.com>
Cc: <www-rdf-interest@w3.org>
Message-Id: <F549D318-5005-11D8-B6E6-000A95EAFCEA@nokia.com>
On Jan 26, 2004, at 13:52, ext Neil McNaughton wrote:

>
> I know that this is 'the original dumb question' but could someone
> take a step back from this tantalizing thread and explain what is at
> issue here -

(OK, you asked for it... ;-)

Well, the original thread was about whether an http: URI denotes
(names) an explicit location or address in the web information space,
or whether it names any arbitrary resource, and the location or
access of any representations of that resource may be determined
by the structure of that URI (however the web authority sees fit).

An offshoot thread was about whether it is necessary, or useful,
to have URIs via which, by design, it is not possible to locate
or access any representations of the resources they name.

In either case, to "dereference" or "resolve" a URI is to obtain
data in some manner which is associated with that URI. The URI is
like a key, and there is some function which, when given that key,
returns a value. The issue of the original thread was whether the
key itself names some location/address from/by which the value is
obtained (like the index of a linear array) or whether it has no
inherent relation to the location/address from/by which the value
is obtained (like the key of an associative array).
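
(A minimal sketch of the two views in Python, for illustration only. The
URIs, stored values, and function names are made up and do not come
from the thread.)

```python
# "Linear array" view: the key is itself a position in the store.
slots = ["<html>home</html>", "<html>about</html>"]

def resolve_by_location(index: int) -> str:
    # The key says exactly where the value lives.
    return slots[index]

# "Associative array" view: the key is an opaque name; where the value
# comes from is the resolver's own business.
representations = {
    "http://example.org/home": "<html>home</html>",
    "http://example.org/about": "<html>about</html>",
}

def resolve_by_name(uri: str) -> str:
    # The lookup could be backed by a file, a database, or a generator
    # without the key having to change.
    return representations[uri]
```

In the first function, moving a value means changing its key. In the
second, the storage behind the lookup can change while the key stays
the same.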

Per the first/original thread, I take the view that a URI denotes
(identifies) some arbitrary resource independent of any actual or
logical location or address in any information space, and that the
mapping from the identity of that resource to any representation or
instance of that resource is not inherently governed by the
lexical construction of that URI (even though, for certain
protocols, we may utilize and exploit a particular lexical
construction to aid in the resolution process).

(i.e., I take the latter "associative array" view)

This decoupling of the URI from how that URI might be used
to access representations of the named resource is important
(to me at least) because it provides a consistent, shared
foundation via which the web and semantic web can interoperate,
and with a minimal number of explicit URIs.

Given that for any particular resource, we may in fact have to
deal with several other related resources in order to perform
certain operations -- i.e. a resource, its representation,
a physical location/access point for the representation -- we
have to ask ourselves which of those three is the most essential,
or basic, for systems and applications to interoperate in terms
of that particular resource. The fact that most of the above
sentences focus on the resource should give a hint ;-)

It is (IMO) far more essential to keep explicit the identity of the
resource in question than the distinct identity of any of its
representations or any particular storage location/container
where those representations might reside. Generally, web applications
do not need to know the names of representations or of actual
locations in order to function, but can leave such information
implicit "behind the scenes" of each web server.

Thus, if we have a URI that denotes (names) some particular
resource, a browser can execute a GET request to obtain some
representation of the resource and display the results to a
user, and there is no need for the *browser* itself to know explicitly
which representation is being displayed or where it was located
in the web's global information space. Even if, using content
negotiation, the browser affects the selection of one particular
representation over another, it still doesn't need to know the
explicit identity of those representations or where/how they
were located/stored/generated/converted/etc. Now, somewhere, some
software needs to know the identities of the different representations
and where they reside (or how they are generated) but insofar as
the interaction between the browser and the web authority of
the URI is concerned, it is (usually) irrelevant.

If/when it is necessary to know the identity of particular
representations, that identity *should* be provided by the
web server in the response header -- so that, e.g., if one
wishes to make statements about that particular representation
(which is a resource in its own right) its identity is available.
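
(A rough sketch of the exchange described in the two paragraphs above,
in Python. The `VARIANTS` table, the URIs, and the `get` function are
hypothetical. The message does not name a specific header; HTTP's
Content-Location is assumed here as one header that can carry the
identity of the representation served. The negotiation is deliberately
naive.)

```python
from dataclasses import dataclass

@dataclass
class Response:
    content_type: str
    body: str
    content_location: str  # identity of the representation actually served

# Hypothetical server-side table:
# resource URI -> {media type: (representation URI, body)}
VARIANTS = {
    "http://example.org/people/alice": {
        "text/html": ("http://example.org/people/alice.html", "<p>Alice</p>"),
        "text/plain": ("http://example.org/people/alice.txt", "Alice"),
    }
}

def get(uri: str, accept: str = "text/html") -> Response:
    """Return some representation of the resource named by `uri`."""
    variants = VARIANTS[uri]
    # Naive negotiation: exact match on a single media type,
    # otherwise fall back to the first variant listed.
    media_type = accept if accept in variants else next(iter(variants))
    representation_uri, body = variants[media_type]
    return Response(media_type, body, representation_uri)
```

The client only ever supplies the resource URI. The representation's
own URI comes back in the response, so it is there if someone wants to
make statements about that representation, and can be ignored
otherwise.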

And, because this "primary" URI being used to access
representations is the URI that denotes the resource
(not any representation or location), we can use that same
URI to describe the resource in question using semantic
web machinery. Thus, given a single URI, we can either obtain
representations of the resource, or reason about the resource.
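
(To illustrate with a small Python sketch: the URI, the property names,
and the data are invented, and plain tuples stand in for RDF
statements.)

```python
RESOURCE = "http://example.org/people/alice"

# Web side: the URI resolves to some representation of the resource.
representations = {RESOURCE: "<p>Alice</p>"}

# Semantic web side: the very same URI is the subject of statements
# (subject, predicate, object) about the resource itself.
statements = [
    (RESOURCE, "http://example.org/terms/name", "Alice"),
    (RESOURCE, "http://example.org/terms/homeCity", "Helsinki"),
]

def representation_of(uri: str) -> str:
    return representations[uri]

def describe(uri: str) -> dict:
    # Collect every property/value pair stated about this URI.
    return {p: o for s, p, o in statements if s == uri}
```

Both functions take the same key. No second URI is needed to tell "the
thing described" apart from "the thing fetched".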

Now, if the "primary" URI for web-accessible resources actually
named a logical location in the web information space, and that
location would contain a representation of a resource, then we'd
*have* to explicitly define and use two additional URIs, to denote
the representation and the resource of which it is a representation,
in order to unify the web and the semantic web.

(an alternative is for web-resolvable URIs treated as locations to
only contain digital resources, and preclude use of web-resolvable
URIs denoting non-digital resources -- which would be a tremendous
loss of utility, given the alternatives)

So, it's not really that treating http: URIs as web locations/addresses
*couldn't* work, or that such a treatment would preclude
interoperability between the traditional web and the semantic web, but
that such a view is much more cumbersome to work with, if we are to
have a smooth and efficient intersection of the web and semantic web.

Adopting the view that URIs name arbitrary resources (rather than
particular locations or addresses) and embracing the principle of URI
opacity (that the lexical nature of the URI does not inherently govern
the resolution of that URI by any server) provides a wonderful
abstraction that enables us to use the same URI both to describe a
resource and to access representations/descriptions of that resource,
such that the majority of web and semantic web agents have no need
to worry about the explicit identity of particular representations
or locations unless they need/want to (in which case, the model
can be recursively applied as required).

I hope the above has not served to confuse you further ;-)

Patrick



>> -----Original Message-----
>> From: www-rdf-interest-request@w3.org
>> [mailto:www-rdf-interest-request@w3.org] On Behalf Of Hammond, Tony (ELSLON)
>> Sent: Monday, January 26, 2004 12:33 PM
>> To: 'Patrick Stickler'
>> Cc: www-rdf-interest@w3.org
>> Subject: RE: URI: Name or Network Location?
>>
>>
>>> I simply can't fathom any real benefit to having a URI
>>> which, by definition, cannot be used to access such knowledge.
>>
>> The reason is to keep the barrier to entry as low as possible. By
>> explicitly excluding dereference we have devised a very simple,
>> focussed registration mechanism which requires almost zero
>> maintenance and is consistent across the whole INFO namespace with a
>> predictable behaviour (i.e. disclosure of identity). This is a
>> baseline service - think of it as something like the Model T.
>>
>> I agree that it would be useful to have resource representations
>> sitting out there on some network endpoint - but that is just way too
>> expensive for the namespaces we are interested in fostering. There
>> are no (human) resources available to maintain such an undertaking.
>> The conclusion is that we either go this zero-resolution route or we
>> accept that many of these namespaces will continue not to be
>> represented on the Web. Which means that we will continue to be
>> frustrated by not being able to 'talk' about well-known public
>> information assets in Web description technologies.
>>
>> Tony
>
>

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Monday, 26 January 2004 08:45:52 UTC