Re: Disambiguation; keeping the "U" in "URI"

Mark Baker writes:
> On Wed, Apr 24, 2002 at 04:43:06PM -0400, Sandro Hawke wrote:
> > Clicking on a link in a hypertext system means "tell me more about
> > this thing".  The "thing" is identified to the user in your example as
> > "IBM".  The web tells you more about things by fetching and displaying
> > web pages containing natural-language information about them.  Nowhere
> > in the system is the thing itself formally identified, just the place
> > where you can get some information.
> 
> As JonB responded to TimBL on www-tag, please tell me what it buys
> me to make this distinction.  Because I can't find any spec that
> tells me I can't do this; indeed, RFC 2396 says that anything with
> identity can be assigned a URI.

TimBL and I are talking about hypertext and HTTP URIs, not URIs in
general. 

 "mailto" URIs do not denote information repositories, they denote
mailboxes.  (I don't have a good answer for what a hypertext
system should do with a mailto URI; I feel like I've gotten a mild
electric shock every time my browser pops up a compose-mail window
when I thought I was going to get some information about a person.)

The right RFC here is 2616 [1], which defines

 resource
      A network data object or service that can be identified by a URI,
      as defined in section 3.2. Resources may be available in multiple
      representations (e.g. multiple languages, data formats, size, and
      resolutions) or vary in other ways.

I don't much care for that definition; I find "information repository"
much more accurate than "network data object or service", but in any
case it certainly does not include people.

In HTTP, when you say "GET foo", it should be read as "GET the
contents currently stored in foo".  

> And also, as I mentioned with my
> IBM/Google example, people *do* make assertions about IBM the
> company by making assertions about http://www.ibm.com.

In your example, people made assertions about the company called "IBM"
with the address-for-more-information being http://www.ibm.com.  That
helps identify the company in the same way any additional information
helps disambiguate terms.

I was in a meeting this morning (of people very steeped in technical
Web culture) arguing about which URIs to use for certain publications,
and the language consistently used terms like "location".  You put a
document at a location, or at two locations.  You change what's at one
location, but not the other.  Etc.

HTTP URIs are web addresses; they name abstract locations from which
information can be obtained, serialized into documents according to
content negotiation.
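
A minimal sketch of that view, in Python.  The Location class, its
method names, and the two serializers are hypothetical illustrations,
not anything taken from RFC 2616: a location holds some content, and
a GET serializes whatever is currently stored there into the first
media type the client will accept.

```python
class Location:
    """Hypothetical model of a web address as an abstract location."""

    def __init__(self, content):
        self.content = content  # whatever is currently stored here
        # Assumed serializers; a real server would have its own set.
        self.serializers = {
            "text/plain": lambda c: c["title"] + "\n" + c["body"],
            "text/html": lambda c: "<h1>%s</h1><p>%s</p>" % (c["title"], c["body"]),
        }

    def get(self, accept):
        """Return (media_type, document) for the first acceptable type."""
        for media_type in accept:
            if media_type in self.serializers:
                return media_type, self.serializers[media_type](self.content)
        raise LookupError("406 Not Acceptable")

    def put(self, content):
        """Replace what is stored; the location itself stays the same."""
        self.content = content


loc = Location({"title": "Example", "body": "Hello"})
html = loc.get(["text/html"])
text = loc.get(["application/pdf", "text/plain"])
```

The same location yields two different documents here, and after a
put() it would yield different ones again, which is why I read a URI
as naming the location rather than any one document.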

<... calming down ...>

While I think this is by far the most logical, consistent, and widely
held view, it doesn't work too well with
   <> dc:creator [whatever].
unless you read dc:creator to be talking about the content retrieved
from the thing which is its RDF subject, which is kind of a stretch.

What happens when the document at some location is replaced by one
with a different creator?  A whole raft of dc:creator triples become
incorrect.   Sigh.

I want to stop arguing about this, but I'm trying to write code today
which uses HTTP to learn things.  What can it learn about
<http://www.w3.org/> ?  It can learn that a GET operation it tried
returned several different byte streams under different circumstances;
each of those was a serialization in some language of some content.
Maybe some of those serializations included assertions about the author
of <http://www.w3.org/>, and others said <http://www.w3.org/> was born
in Geneva in 1991.  That's unpleasantly mind-bending, but it
probably won't break anything.  It sure would be nice if we could all
be consistent about it, though.
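
For concreteness, here is roughly the kind of probing I mean, as a
Python sketch using only the standard urllib.request module.  The
function names and the shape of the observation records are made up
for illustration, and observe() needs network access to run.

```python
import urllib.request


def build_requests(uri, media_types):
    """One GET request per media type, differing only in the Accept header."""
    return [urllib.request.Request(uri, headers={"Accept": mt})
            for mt in media_types]


def observe(uri, media_types):
    """Perform the GETs and record what came back (requires network).

    Each record describes one retrieval under one set of circumstances;
    it says nothing directly about the thing the URI identifies.
    """
    observations = []
    for req in build_requests(uri, media_types):
        with urllib.request.urlopen(req) as resp:
            observations.append({
                "uri": uri,
                "asked_for": req.get_header("Accept"),
                "got": resp.headers.get("Content-Type"),
                "body": resp.read(),
            })
    return observations
```

Calling observe("http://www.w3.org/", ["text/html", "application/rdf+xml"])
would give one record per byte stream.  All the code can safely
conclude is that those streams came from that location at that time.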

       -- sandro

[1] http://ietf.org/rfc/rfc2616

Received on Thursday, 25 April 2002 12:03:50 UTC