Re: A content-id URL scheme

Ed Levinson (elevinso@Accurate.COM)
Mon, 06 Feb 1995 11:17:03 -0500


Message-Id: <9502061617.AA10921@Accurate.COM>
To: "Roy T. Fielding" <fielding@avron.ics.uci.edu>
Cc: uri@bunyip.com
Subject: Re: A content-id URL scheme 
In-Reply-To: Your message of "Fri, 03 Feb 1995 23:38:14 PST."
             <9502032338.aa03153@paris.ics.uci.edu> 
Date: Mon, 06 Feb 1995 11:17:03 -0500
From: Ed Levinson <elevinso@Accurate.COM>

Roy,

Yeah, I like your suggestion on wording.  Does this work better?

     The Uniform Resource Locator (URL) scheme, "cid", allows
     individual entities within a multipart message body to
     refer to one another by their content-id labels.

I could see using msg-ids to refer to other messages; would you like to
take that up?  CIDs would be really useful now, in my opinion, in the
mimesgml work, and I don't see an application for msg-id.

> ... cid URLs should be capable of referencing
> any possible Content-ID.  Any characters that are not allowed in a URL can
> be escaped using the %hex encoding method.

I agree, using escapes makes sense.  Why not use the general syntax from
RFC 1521/822 and have the following?

	cidurl   = "cid" ":" cid-spec
	cid-spec = addr-spec		; RFC 822, globally unique

I think that's what you have as the "correct BNF".
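
To make the escaping concrete, here is a minimal sketch of mapping a
Content-ID onto a cid URL and back.  It is not from the draft: the
function names are made up for illustration, and it assumes that only
the RFC 1738 unreserved characters stay literal while every other
character (including quotes, spaces, and brackets) is %hex-escaped.

```python
import string
from urllib.parse import unquote

# Assumption: characters kept literal are the RFC 1738 "unreserved" set
# (alpha | digit | safe | extra); everything else is %hex-escaped.
UNRESERVED = set(string.ascii_letters + string.digits + "$-_.+" + "!*'(),")


def _escape(part: str) -> str:
    """%hex-escape every byte of `part` that is not an unreserved character."""
    out = []
    for byte in part.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in UNRESERVED else "%{:02X}".format(byte))
    return "".join(out)


def content_id_to_cid_url(content_id: str) -> str:
    """Hypothetical helper: turn '<local@domain>' into 'cid:local@domain'."""
    cid = content_id.strip()
    if cid.startswith("<") and cid.endswith(">"):
        cid = cid[1:-1]  # drop the angle brackets used in the header field
    # Split on the LAST "@" so an "@" inside a quoted local-part is escaped.
    local, sep, domain = cid.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain: %r" % content_id)
    return "cid:" + _escape(local) + "@" + _escape(domain)


def cid_url_to_content_id(url: str) -> str:
    """Hypothetical inverse: turn 'cid:local@domain' back into '<local@domain>'."""
    if not url.startswith("cid:"):
        raise ValueError("not a cid URL: %r" % url)
    # Any "@" inside the local-part was escaped, so the first literal "@"
    # is the separator.
    local, sep, domain = url[len("cid:"):].partition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain: %r" % url)
    return "<" + unquote(local) + "@" + unquote(domain) + ">"
```

With this sketch a plain Content-ID such as <foo4*foo1@bar.net> passes
through unchanged apart from the "cid:" prefix, while a quoted-string
local-part or a domain-literal picks up %22, %5B, and %5D escapes.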

Finally, your comment on the Security section makes sense as well.

Thanks.../Ed

On Fri, 03 Feb 1995 23:38:14 PST "Roy T. Fielding" wrote:
> In draft-levinson-cid-00.txt, Ed writes:
> 
> >        Abstract
> > 
> >        The Uniform Resource Locator (URL) scheme, "cid", allows
> >        compound or aggregate objects in a multipart mail message to
> >        refer to one another by their body part labels.
> 
> It should also allow objects external to the multipart mail message to
> refer to body-parts of that multipart mail message.  In general, I think
> it would be better to use the MIME terms of "entity", "body", and "body-part"
> to refer to things inside a multipart message, instead of "object".
> I encountered this with the HTTP spec.
> 
> > ...
> >        A cid URL takes the form
> > 
> >                cidurl     = "cid" ":" cid-spec
> > 
> >        where cid-spec is a restricted form of "addr-spec" as
> >        defined in [RFC822].  The purpose of the restriction is to
> >        eliminate special characters from the cid URL.  Such
> >        characters can be problematical in many environments (e.g.,
> >        HTML and SGML) in which the cid URL may be used.  Cid URLs
> >        are a subset of MIME content-IDs and RFC822 message-IDs
> 
> I think this is backwards --> cid URLs should be capable of referencing
> any possible Content-ID.  Any characters that are not allowed in a URL can
> be escaped using the %hex encoding method.
> 
> Thus, the correct BNF would be
> 
>    ----------------------------------------------------------------------
>    cidurl      = "cid" ":" cid-spec
> 
>    cid-spec    =  local-part "@" domain        ; globally unique
> 
>    local-part  =  word *("." word)
>    word        =  atom | quoted-string
>    atom        =  1*<any okCHAR except specials, SPACE and CTLs>
> 
>    quoted-string = "%22" *(qtext|quoted-pair) "%22"
>    qtext       =  <any okCHAR excepting "%22",
>                    "%5C" & CR, and including
>                    linear-white-space>
>    quoted-pair =  "%5C" okCHAR
> 
>    domain      =  sub-domain *("." sub-domain)
>    sub-domain  =  domain-ref | domain-literal
>    domain-ref  =  atom
>    domain-literal =  "[" *(dtext | quoted-pair) "]"
>    dtext       =  <any okCHAR excluding "%5B",         ; => may be folded
>                    "%5D", "%5C" & "%0D", & including
>                    linear-white-space>
> 
>    specials    =  "(" | ")" | "%3C" | "%3E" | "%40"  ; Must be in quoted-
>                |  "," | ";" | ":" | "%5C" | "%22"    ;  string, to use
>                |  "." | "%5B" | "%5D"                ;  within a word.
>    linear-white-space =  1*([CRLF] LWSP-char)        ; semantics = SPACE
>                                                      ; CRLF => folding
>    CRLF        =  CR LF
>    LWSP-char   =  SPACE / HTAB                       ; semantics = SPACE
>    SPACE       =  "%20"
>    HTAB        =  "%09"
>    LF          =  "%0A"
>    CR          =  "%0D"
>    CTL         =  <any hex-escaped ASCII control     ; %00 - %1F
>                    character and DEL>                ; %7F
> 
> 
>    okCHAR      =  uchar | ";" | "/" | "?" | ":" | "&" | "="
>    uchar       =  <as defined in RFC 1738>
>    ----------------------------------------------------------------------
> 
> BUT, I think we could more easily live with just the superset:
> 
>    ----------------------------------------------------------------------
>    cidurl      = "cid" ":" cid-spec
> 
>    cid-spec    =  local-part "@" domain        ; globally unique
> 
>    local-part  =  1*cidchar
>    domain      =  1*cidchar
> 
>    cidchar     =  uchar | ";" | "/" | "?" | ":" | "&" | "="
>    uchar       =  <as defined in RFC 1738>
>    ----------------------------------------------------------------------
> 
> > ...
> >        3. Security
> > 
> >        Security issues are not addressed in this memorandum.
> 
> It would be more accurate to say that there are no security issues, since
> this section is meant to "address" the security issues.
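
For reference, the simpler "superset" grammar quoted above (local-part
and domain each being 1*cidchar) can be checked with a short regular
expression.  This is only a sketch under assumptions: the name
is_cid_url is invented here, uchar is taken to be the RFC 1738
unreserved characters plus %hex escapes, and the scheme is matched as
the literal lowercase "cid" exactly as the grammar writes it.

```python
import re

# Assumption: uchar = RFC 1738 unreserved characters or a %hex escape.
_UNRESERVED = r"A-Za-z0-9$\-_.+!*'(),"
# cidchar = uchar | ";" | "/" | "?" | ":" | "&" | "="
_CIDCHAR = rf"(?:[{_UNRESERVED};/?:&=]|%[0-9A-Fa-f]{{2}})"
# cidurl = "cid" ":" local-part "@" domain, each side 1*cidchar
_CID_URL = re.compile(rf"cid:({_CIDCHAR}+)@({_CIDCHAR}+)")


def is_cid_url(candidate: str) -> bool:
    """Return True if `candidate` matches the superset cid URL grammar."""
    return _CID_URL.fullmatch(candidate) is not None
```

Because "@" is not itself a cidchar, a matching URL has exactly one
literal "@"; any other "@" would have to appear as %40.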