Re: mid and cid URLs

Al Gilman (asg@severn.wash.inmet.com)
Fri, 24 Nov 1995 17:06:04 -0500 (EST)


From: asg@severn.wash.inmet.com (Al Gilman)
Message-Id: <9511242206.AA29205@severn.wash.inmet.com>
Subject: Re: mid and cid URLs 
To: Harald.T.Alvestrand@uninett.no
Date: Fri, 24 Nov 1995 17:06:04 -0500 (EST)
Cc: connolly@beach.w3.org, asg@severn.wash.inmet.com, moore@cs.utk.edu,
In-Reply-To: <199511232058.VAA05032@dale.uninett.no> from "Harald.T.Alvestrand@uninett.no" at Nov 23, 95 09:58:43 pm

To follow up on what Harald.T.Alvestrand@uninett.no said ...
  Headers inside URLs???????????????????????
  Ron, where are your URCs when I need them????????

Harald,

Yes, I absolutely want to be able to touch RFC 822 header fields
from within URIs.  This is part of the implementation of a larger
goal.  The approach to this goal, including the URI syntax for
header fields, is a coordinated package of reforms in URI and
header usage to create a more integrated web across the Internet
transport modes.

MISSION GOAL:

The top-level capability sought is that resources identified with
FTP, Gopher, and HTTP transport, and those identified with SMTP
and NNTP transport, should be able to cite one another freely
without regard to the native transport mode of the resource
cited.

STRATEGY:

A bundle of low-impact upgrades to present URI and header usage
will accomplish this if we achieve the following:

     standard syntax for citing all manner of URIs in 822 headers
     such as "In-reply-to:" and "References:"

     standard syntax for citing mail and News objects which have
     message-ID and Content-ID values in contexts where an
     URL might otherwise go.

To web together the Internet corpus of resources, we need not one
but _two_ coordinated, interoperating syntactic forms.

OBJECTIVES:

The following specific capabilities are sought in near-term
pursuit of the above mission goal.

     mailto: URLs able to nominate but not dictate header values
          for resulting RFC 822 message

     mid: Citations in HTML and MIME headers able to quote existing
          resource header values for retrieval assistance

     common syntax for header-in-URI embedding for the above two.

     In-reply-to, References, etc. headers refined, not extended.
          is: *( msg-id | phrase )
          to_be: *( msg-id | cite | subject-phrase )
               ; where "cite" is the URI-embedding syntax

--*** POSSIBLE DETAILS ***-----------------------------------
While the syntax here has received some thought, it is not perfect
and is primarily offered as a vehicle to get to examples of what
we need to be able to do.  Alternate syntax with the same capability
would be fine.  But some semantic definitions are required first
to clarify the strategy.

DEFINITIONS:

Citations:

     The existing usage
          href="URL"
     in HTML text is a resource citation.

     The existing usage
          In-reply-to: <addr-spec>
     in an RFC 822 header is a resource citation.

     The existing usage
          Location: "URL"
     in HTTP is a resource citation.

Resource citations are not comparable with resource locations.
Citations are _uses_ of resource identifiers; some of
these identifiers use identification-by-location i.e. URLs.

Uniform Resource Identifiers (URIs):

     A class of text expressions which includes URLs as defined
     in RFC 17xx, that are:
          used in resource citations (see above);
          syntactically distinguishable one from another.

The _definition_ of what is or is not a URI is not based on the
list of subtypes it currently admits.  That list presently only
contains URL schemes.  We are about to add a "mid:" scheme and
possibly a "cid:" scheme to this unified syntax for uniform use.
But it doesn't matter whether we care to call this new form a
URI, URN, or URC.  The question that determines "Is this a URI
or not" is the second-listed property above: that we can tell
them from URLs and from each other.  This _rule_ defines the URI
_class_ and not some list of recognized conforming subtypes.
This is the intent of RFC 1630.

URI FORMS:

mid scheme:
     msg-id = "<" addr-spec ">"         ; per RFC 822
     mid-URI = "mid:" [ msg-cite *( "/" msg-cite ) ]
               [ "?" content-cite ]
               *( ";" header-equation )
     msg-cite = addr-spec               ; as used in News: URL
     content-cite = msg-cite            ; internally similar
     header-equation = field-name "=" field-body
     field-name, field-body             ; per RFC 822

Notes:
     a citation of a Content-ID value is contextually
     distinguishable from a citation of a Message-ID value
     because the content-cite always comes after the
     "?" punctuation mark, and a msg-cite never does.

mailto scheme:
     mailto-URI = "mailto:" mailbox               ; as now
                    *( ";" header-equation )      ; growth
                    *( "/" body-line )            ; growth
     mailbox = addr-spec                ; per RFC 822
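
The mailto form above, and the goal that a URL may nominate but not
dictate header values, can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions: the function names
parse_mailto and draft_headers are hypothetical, and the sketch assumes
that a literal "/" or ";" inside a value is percent-encoded so that
bare ones always act as separators.

```python
from urllib.parse import unquote


def parse_mailto(uri):
    """Split an extended mailto: URI into mailbox, nominated headers, body lines."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "mailto":
        raise ValueError("not a mailto: URI: %r" % uri)

    # Assumption: literal "/" and ";" inside values are percent-encoded,
    # so a bare "/" starts a body-line and a bare ";" starts a header-equation.
    head, *body_parts = rest.split("/")
    mailbox, *equations = head.split(";")

    nominated = {}
    for equation in equations:
        equation = equation.strip()
        if not equation:
            continue  # tolerate a trailing ";"
        name, equals, body = equation.partition("=")
        if not equals:
            raise ValueError("malformed header equation: %r" % equation)
        # Drop literal surrounding quotes first, then decode; quotes that
        # were written as %22 therefore survive as part of the value.
        nominated[name.strip().lower()] = unquote(body.strip().strip('"'))

    body_lines = [unquote(part.strip().strip('"')) for part in body_parts]
    return unquote(mailbox), nominated, body_lines


def draft_headers(uri, user_choices=None):
    """Merge nominated headers with the sender's own choices; the sender wins."""
    mailbox, nominated, _ = parse_mailto(uri)
    headers = {"to": mailbox}
    headers.update(nominated)           # values the URL nominates
    headers.update(user_choices or {})  # the sender may override any of them
    return headers
```

The last update in draft_headers is the design point: whatever the URL
suggests is only a starting value, and the composing user's entries
replace it.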

Examples:

simple reference to a NewsPost:
mid:199511241230.HAA18030@list.cren.net

more complicated reference to the same NewsPost:

mid:199511241230.HAA18030@list.cren.net;subject="What%20a%20body%20part%20is";Newsgroups=info.ietf.smtp

simple reference to a MIME part without specifying enclosing
message:

mid:?AA42R2HY45J4UOI@dale.uninett.no

Reference to a widely available FAQ:

mid:internet-services%2Faccess-via-email_814453424@rtfm.mit.edu;
  newsgroups="alt.internet.services,alt.online-service,
  alt.bbs.internet,alt.answers,comp.mail.misc,comp.answers,
  news.newusers.questions,news.answers";
  Location="ftp://rtfm.mit.edu/pub/usenet
  /internet-services/access-via-email,
  mailto:listserv@ubvm.cc.buffalo.edu/"GET%20INTERNET%20
  BY-EMAIL%20NETTRAIN%20F=MAIL"
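
To make the mid: examples above concrete, here is a rough Python sketch
of how a reader might take such a citation apart.  It is not a
reference implementation.  The name parse_mid is hypothetical, and the
sketch assumes what the examples suggest: the message cites and the "?"
content cite are each optional, a literal "/" inside a cite is written
%2F, and a literal ";" inside a header value is percent-encoded.

```python
from urllib.parse import unquote


def parse_mid(uri):
    """Split a mid: URI of the proposed form into cites and header equations."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "mid":
        raise ValueError("not a mid: URI: %r" % uri)

    # Everything after the first ";" is a list of header equations.
    # Assumption: a literal ";" inside a value is percent-encoded.
    cite_part, *equations = rest.split(";")

    # "?" separates Message-ID cites (before) from a Content-ID cite (after).
    msg_part, qmark, content = cite_part.partition("?")

    # Assumption: a literal "/" inside a cite is written %2F, so a bare
    # "/" always separates one msg-cite from the next.
    msg_cites = [unquote(m) for m in msg_part.split("/") if m]

    headers = {}
    for equation in equations:
        equation = equation.strip()
        if not equation:
            continue  # tolerate a trailing ";"
        name, equals, body = equation.partition("=")
        if not equals:
            raise ValueError("malformed header equation: %r" % equation)
        # RFC 822 field names are case-insensitive, so fold to lower case.
        headers[name.strip().lower()] = unquote(body.strip().strip('"'))

    return {
        "msg_cites": msg_cites,
        "content_cite": unquote(content) if qmark else None,
        "headers": headers,
    }
```

Because the content cite is only ever read from the text after "?", a
Content-ID can never be mistaken for a Message-ID, which is the
distinction the Notes under the mid scheme rely on.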

simple mail opportunity:
<a href="mailto:Harald.T.Alvestrand@uninett.no">

more complicated mail opportunity:
<a href="mailto:Harald.T.Alvestrand@uninett.no;subject=Re:%20What%20a%20body%20part%20is;references=%22mid:9511201444.aa22175@paris.ics.uci.edu%3Bfrom=Roy.T.Fielding%22">

HEADER FORMS:

field     = field-name ":" [ field-body ] CRLF    ; per RFC 822

field-name = [field-class-modifier "-"] field-class-name
                                                  ; new refinement
Examples:
     "ID" is a field-class-name
     "Message" is a field-class-modifier
     "Content" is a field-class-modifier
     "Message-ID" is a field-name
     "Content-ID" is a field-name
Note: the syntax for arguments belongs to the root "ID:" class
     "From:" is a field-class-name
     "Resent" is a field-class-modifier
     "Resent-From:" is a field-name

citation-superclass
     = ID-field | References-field | Referrer-field
          | In-reply-to-field | ...               ; extendable
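
One possible reading of the field-name refinement, sketched in Python.
The grammar alone does not say where to split a name such as
"In-reply-to", so this sketch assumes a small table of known class
names drawn from the examples in this message.  The table and the
function names split_field_name and in_citation_superclass are
illustrative assumptions, not part of any registry.

```python
# Class names taken from the examples in this message; the sets are an
# assumption made for the sketch, not a registry.
FIELD_CLASS_NAMES = {"id", "from", "references", "referrer", "in-reply-to"}
CITATION_SUPERCLASS = {"id", "references", "referrer", "in-reply-to"}


def split_field_name(field_name):
    """Return (modifier, class_name); modifier is None when there is none."""
    name = field_name.strip().rstrip(":")
    if name.lower() in FIELD_CLASS_NAMES:
        return None, name
    # Try each "-" as the boundary, leftmost first, so that the longest
    # known class-name suffix wins.
    pos = name.find("-")
    while pos != -1:
        modifier, class_name = name[:pos], name[pos + 1:]
        if class_name.lower() in FIELD_CLASS_NAMES:
            return modifier, class_name
        pos = name.find("-", pos + 1)
    return None, name  # unknown field: treat the whole name as its class


def in_citation_superclass(field_name):
    """True when the field's class is one of the citation classes."""
    _, class_name = split_field_name(field_name)
    return class_name.lower() in CITATION_SUPERCLASS
```

Under these assumptions "Message-ID" and "Content-ID" both fall in the
citation superclass through their shared "ID" class, while
"Resent-From:" does not.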

For fields in the citation superclass, the contents are
limited to citation field values as follows:

citation-field-value
     = *( (msg-id | cite | subject-phrase) LWSP)

msg-id                                       ; per RFC 822 as is
cite = [URI-class ":"] URI                   ; can elide "URL:" prefix
subject-phrase = phrase                 ; per RFC 822 syntax
                                        ; bears connotation that match to
                                        ; subject field content takes
                                        ; precedence over other matches
URI-class = "URI" | "URL" | "URN" | ...      ; extendable
                                             ; case-insensitive

URI       = URL | mid-URI | ...              ; extendable  
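
As a rough illustration of the citation-field-value grammar above, the
following Python sketch sorts the contents of a citation field into
msg-id, cite, and subject-phrase items.  The regular expressions are
simplifications chosen for the sketch, not the full RFC 822 lexical
rules, and the name classify_citations is hypothetical.

```python
import re

# A token is an angle-bracketed msg-id, a quoted string, or a run of
# non-space characters.  This is a simplification of RFC 822 lexing.
TOKEN = re.compile(r'<[^>]*>|"[^"]*"|\S+')

# cite = [URI-class ":"] URI, where the URI starts with "scheme:".
CITE = re.compile(
    r'^(?:(URI|URL|URN):)?([A-Za-z][A-Za-z0-9+.-]*:\S+)$', re.IGNORECASE
)


def classify_citations(field_body):
    """Return a list of (kind, text) pairs for a citation field body."""
    items = []
    words = []  # consecutive plain words accumulate into one subject-phrase

    def flush_words():
        if words:
            items.append(("subject-phrase", " ".join(words)))
            words.clear()

    for token in TOKEN.findall(field_body):
        cite = CITE.match(token)
        if token.startswith("<") and token.endswith(">"):
            flush_words()
            items.append(("msg-id", token))
        elif cite:
            flush_words()
            items.append(("cite", cite.group(2)))  # URI-class prefix elided
        else:
            words.append(token)
    flush_words()
    return items
```

A word such as "Re:" has nothing after its colon, so it does not match
the cite pattern and stays part of the subject phrase.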

Note that the "References:" header in RFC 822 is _not_ restricted to
msg-id codes at present, and in fact for mail coming out of a VMS
environment, tools such as Hypermail do recognize subject threading
as well as msg-id threading.  So the change on the 822 side is minuscule.

The URI vocabulary is known to need a "mid:" scheme.

Changes like these will complete a real Uniform Resource Identification
practice that will further energize the Internet.

Al Gilman