Re: mid and cid URLs

Keith Moore (moore@cs.utk.edu)
Wed, 22 Nov 1995 15:38:46 -0500


Message-Id: <199511222038.PAA04211@wilma.cs.utk.edu>
From: Keith Moore <moore@cs.utk.edu>
To: asg@severn.wash.inmet.com (Al Gilman)
Cc: moore@cs.utk.edu (Keith Moore), ietf-types@uninett.no, uri@bunyip.com
Subject: Re: mid and cid URLs 
In-Reply-To: Your message of "Wed, 22 Nov 1995 10:23:14 EST."
             <9511221523.AA13116@severn.wash.inmet.com> 
Date: Wed, 22 Nov 1995 15:38:46 -0500

> You are ignoring incumbent capability in the host file system.  I
> don't know a web browser that doesn't support the file: access
> method.  And I do know of HTML users who use this method
> exclusively on their home PC which has no mail or HTTP while
> drafting Web Page bundles before uploading them to the server.
> 
> The tools you want to strike a deal with as helpers already know how
> to deal with sets of files in the local file system.  It is the least
> common denominator.  
>
> If the MIME tool will just get on with it and do file= disposition right,
> the helpers are already set up to deal with the product.  The version
> that involves extra work is the Content-ID scheme.

Ahem.

Perhaps you have forgotten that I was one of those who *proposed*
using the Content-Disposition header and relative URLs to handle
links from HTTP and other documents.

I am certainly *not* ignoring "incumbent capability in the host file
system".  On the other hand, I am also aware of the capabilities of
the installed base of MIME mail readers.



>   How did we get here?  From my vague recollection:
>   
>   + some people saw the need for intra-message references in MIME and
>     proposed a mechanism for it using content-ids and message/external-body
>     (thus involving minimal changes to MIME)
> 
> Someone else saw the need for inter-object references across archives
> and messages and created the URI language.

There is no "URI language" in practice.  URLs exist, and there's
general agreement about how they work.  URNs exist in some people's
minds, but there's not yet a shared understanding of what they're for
and how they work.  As for URIs, they're no better defined than
( URNs \/ URLs ), and probably somewhat fuzzier.

>   + someone else saw the need for a URL message/external-body access-type,
>     (thus bringing about greater integration between email and the web)
>   
>   + someone noticed that if content-id URLs were defined, they could fit
>     into the URL access-type (similarly for message-ids)
>   
>   + other people noticed that some subset of {message-ids, content-ids, and 
>     article-ids from netnews} look the same and suggested that they all
>     be handled by the same syntax or mechanism.
> 
> This was a reduction to the Message-ID case of the concept of uniform
> identifiers for information resources, to be used across a wide variety
> of Internet archive and message modes.
>   
>   + eventually people started talking about how to make these look like
>     traditional URLs, or how to make URNs look like message-ids.
> 
> On the URI side, this was the realization that there were resource 
> identifiers in the mail and MIME usage that hadn't been integrated 
> into the Uniform syntax by dealing with file: ftp: mailto: Gopher: 
> and http: in the initial burst.

There's a reason for this.  For all of { file, ftp, mailto, gopher,
http, and even news }, there is a well-understood, largely
machine-independent, way to access a resource named by such a URL.
THIS IS NOT THE CASE for content-id or message-id.

In other words, the initial web architects were smart enough to tackle
a part of the problem that they could solve.

>   + others saw that these things are really a kind of URN, and 
>     proposed them as a URN scheme
>   
>   It might not seem so, but there's a large gap in implementation
>   difficulty between the first view of the world and the last ones.
>   
>   I'll now informally state Keith's Design Principle of URN Interoperability:
>   
>   	You should be able to type any kind of "URN" into any blank
>   	labeled "URN" and get a reasonable result.
>   
> You can actually broaden that to URI.

No I can't.  First of all, there's no widely accepted definition for
what a URI is (at least, no definition that's not dependent on other
undefined and poorly understood terms).  Second, it's very clear to me
that there are contexts in which a URL is appropriate but a URN is
not, or vice versa -- for much the same reasons that IP addresses
are not freely interchangeable with domain names.

Now if you're merely arguing that we should have blanks (perhaps
labeled "URI") which accept either URNs or URLs -- yes, I expect we
will have those someday.

>   But ... just because a content-id or a message-id has some of the
>   characteristics of a URN, doesn't mean we can derive much additional
>   benefit from calling it a URN.  The extra benefit from URNs over
>   message-ids will be from a resolution infrastructure, not from a
>   unified syntax.
>   
> At the level of the core <foo2i56hr4g@node..path> unique ID,
> whether you call it URx or URy is immaterial.

No it is not.  If we're developing a capability that allows references
between MIME body parts within a specific, narrow, well-defined
context (such as multipart/related), it's best not to confuse it with
a capability that allows objects to reference arbitrary other objects
by name.
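
To make the narrow case concrete, here is a rough Python sketch (not
anything proposed in this thread) of resolving a content-id reference
strictly within one message.  It assumes a "cid:" URL form that carries
the Content-ID without its angle brackets; the function name
resolve_cid, the sample message, and the example.invalid ids are all
made up for illustration.

```python
from email.message import EmailMessage
from urllib.parse import unquote


def resolve_cid(message, url):
    """Return the body part of `message` whose Content-ID matches a cid: URL.

    Returns None when no part matches. The lookup never leaves `message`.
    """
    scheme, _, rest = url.partition(":")
    if scheme.lower() != "cid" or not rest:
        raise ValueError("not a cid: reference: %r" % url)
    # Assumption: the URL holds the id without the enclosing angle brackets.
    wanted = "<" + unquote(rest) + ">"
    for part in message.walk():
        if part.get("Content-ID", "").strip() == wanted:
            return part
    return None  # nothing outside this message is consulted


# A hypothetical two-part multipart/related message: an HTML page plus an image.
msg = EmailMessage()
msg["Subject"] = "page bundle"
msg.set_content("<html><img src='cid:logo.1@example.invalid'></html>",
                subtype="html")
msg.add_related(b"GIF89a", maintype="image", subtype="gif",
                cid="<logo.1@example.invalid>")
```

The search space is the enclosing message and nothing else, which is
why this is tractable.  Resolving the same identifier with no message
in hand would need some lookup service that the sketch does not (and
cannot) supply.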

> I think that when one considers decorating the unique ID with
> tips as to alternatives of access methods -- newsgroups, archives
> -- that the realization that this ID is not really an URL helps
> us to accept the different flavor of the composite citation with
> its attributes attached.

Perhaps it's best to think of it this way:

1. content-ids and message-ids aren't URLs.

(they aren't necessarily URNs, either...though they have some of the
desirable characteristics of URNs, and they might someday be
incorporated into a URN name space.)

2. notwithstanding #1, it may be useful to extend the URL notation to
allow references to content-ids and/or message-ids.

Keith