Re: Comments on draft-fielding-uri-rfc2396bis-07

On Tue November 2 2004 02:55, Roy T. Fielding wrote:

> Which is simply saying that the reserved character has the same
> meaning as the data character in mailto because '@' is not allowed
> as data in the mailto syntax (i.e., it is only allowed to be the
> reserved delimiter between mailbox name and mailbox host).
> 
> The fact that the generic syntax considers them to be two different
> URIs does not prevent the mailto scheme from declaring they are the
> same, since that is something scheme specs can do.

The current mailto URI specification (RFC 2368) does
not do so.

There are a couple of issues regarding the way URIs in
general are specified, the mailto URI specification, and
changes introduced in the subject draft.

One is that the draft states that URI producers should
encode reserved characters (such as the gen-delim '@'),
yet the draft's own mailto URI example leaves '@'
unencoded.  That appears to be an inconsistency in the
draft, probably easily remedied.

Another, slightly more subtle issue arises from the
combination of the change in specification, the multiple
ways in which message address fields can be represented
in mailto URIs, and the way URIs are parsed into
(scheme/authority/path/query/fragment) components.  A
mailto URI can specify primary recipient
addresses via the initial mailbox list, explicitly via a
specified To field, or a combination of the two (RFC 2368
section 2). For example:

mailto:a%40b.com?to=c%40d.edu&to=e%40f.gov

corresponds to a message with a single header field

To: a@b.com, c@d.edu, e@f.gov

Parsing the URI yields a path component
   a%40b.com
and a query component
   to=c%40d.edu&to=e%40f.gov
(there is no authority or fragment).
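
As a rough illustration of that split and of how the recipients
combine, here is a sketch in Python.  It assumes the standard
urllib.parse module as the generic parser; mailto_recipients is a
hypothetical helper written for this message, not part of any
library, and it ignores every header field other than "to".

```python
from urllib.parse import urlsplit, unquote

def mailto_recipients(uri):
    """Hypothetical helper: collect primary recipients from a mailto URI."""
    parts = urlsplit(uri)
    # Addresses in the path are comma-separated; decode each one after splitting.
    recipients = [unquote(a) for a in parts.path.split(",") if a]
    # Each "to" field in the query contributes further recipients.
    for field in parts.query.split("&"):
        name, _, value = field.partition("=")
        if name.lower() == "to":
            recipients.extend(unquote(a) for a in value.split(",") if a)
    return recipients

uri = "mailto:a%40b.com?to=c%40d.edu&to=e%40f.gov"
parts = urlsplit(uri)
print(parts.path)    # a%40b.com
print(parts.query)   # to=c%40d.edu&to=e%40f.gov
print("To: " + ", ".join(mailto_recipients(uri)))  # To: a@b.com, c@d.edu, e@f.gov
```

Splitting on "," and "&" before percent-decoding keeps an encoded
delimiter inside an address from being mistaken for a separator.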

RFC 2396 gives sets of reserved characters for the various
URI components, and it so happens that in RFC 2396, '@'
is reserved in the query component (RFC 2396 section 3.4)
but not in the path component (RFC 2396 section 3.3), so
under RFC 2396 rules the above URI could be written as:

mailto:a@b.com?to=c%40d.edu&to=e%40f.gov
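
A producer following the RFC 2396 per-component rules might look
roughly like the sketch below.  The two reserved sets are
transcribed from RFC 2396 sections 3.3 and 3.4; build_mailto_2396
is a hypothetical name, and the sketch only considers '@', not the
other characters a real producer would have to handle.

```python
from urllib.parse import quote

# Reserved characters per component, per RFC 2396 sections 3.3 and 3.4.
PATH_RESERVED_2396 = set("/;=?")
QUERY_RESERVED_2396 = set(";/?:@&=+,$")

def build_mailto_2396(first, extra_to):
    """Hypothetical producer: encode '@' only where RFC 2396 reserves it."""
    # '@' stays literal in a component unless that component reserves it.
    path_safe = "" if "@" in PATH_RESERVED_2396 else "@"
    query_safe = "" if "@" in QUERY_RESERVED_2396 else "@"
    path = quote(first, safe=path_safe)
    query = "&".join("to=" + quote(addr, safe=query_safe) for addr in extra_to)
    return "mailto:" + path + ("?" + query if query else "")

print(build_mailto_2396("a@b.com", ["c@d.edu", "e@f.gov"]))
# mailto:a@b.com?to=c%40d.edu&to=e%40f.gov
```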

The subject draft, however, does not list reserved
characters for the individual URI components, and
reserves '@' generally, so it appears that the draft would
impose a change on mailto URIs, requiring '@' to be
encoded as "%40" even in the path component.
However, that is not reflected in the mailto URI examples
in the draft, nor is there any mention of the effect on
mailto URIs in draft Appendix D (changes from RFC 2396).
It's not at all clear whether that is in fact the intent or if
it merely results from the very different wording and
specification of "reserved" characters between RFC 2396
and the draft.

To further confuse matters, the draft explicitly lists
'@' as a pchar alternative.  pchar is used via segment
and the path-* productions in the path component AND
directly in the query component, implying that NO '@'
need be encoded in either place, even though the
character is reserved via gen-delims and the draft states
that all reserved characters should be encoded.
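
The conflict can be made concrete with a small sketch.  The
character sets below are my reading of the draft's gen-delims,
sub-delims, unreserved, and pchar productions and should be checked
against the draft text; the two function names are hypothetical and
simply stand for the two policies a producer might infer.

```python
import string
from urllib.parse import quote

# Character sets as I read them in the draft's ABNF (an assumption to verify).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS
# Literal characters pchar admits (pct-encoded triplets aside).
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | set(":@")

print("@" in RESERVED)        # True: reserved via gen-delims
print("@" in PCHAR_LITERALS)  # True: also allowed literally by pchar

def encode_all_reserved(addr):
    """Policy A: percent-encode every reserved character."""
    return quote(addr, safe="")

def encode_only_non_pchar(addr):
    """Policy B: percent-encode only what pchar does not admit literally."""
    safe = "".join(sorted(PCHAR_LITERALS - UNRESERVED))
    return quote(addr, safe=safe)

print("mailto:" + encode_all_reserved("a@b.com"))    # mailto:a%40b.com
print("mailto:" + encode_only_non_pchar("a@b.com"))  # mailto:a@b.com
```

Both policies can be defended from the draft text, and they yield
different URIs for the same mailbox.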

The bottom line is that it is not at all clear whether
producers of mailto URIs should or should not encode
'@' as "%40", and that lack of clarity for implementors
does not bode well for interoperability.

Received on Thursday, 4 November 2004 16:05:43 UTC