- From: Bruce Lilly <blilly@erols.com>
- Date: Fri, 8 Apr 2005 14:55:19 -0400
- To: uri@w3.org
On Wed February 16 2005 10:20, Internet-Drafts@ietf.org wrote:

> A New Internet-Draft is available from the on-line Internet-Drafts directories.
>
>
> Title : The mailto URI scheme
> Author(s) : M. Duerst, L. Masinter
> Filename : draft-duerst-mailto-bis-00.txt
> Pages : 13
> Date : 2005-2-15

Comments:

The Abstract states "for designating electronic mail addresses", the section 1 text states "the Internet mailing address of an individual or service", section 3 says "an internet resource", and the reality seems to be specification of a prototype internet message (RFCs 822, 2822) as alluded to briefly in draft section 8. Claims regarding the purpose of a mailto URI should be consistent.

Section 1 claims that "a previous version of the mailto URI scheme had severe limitations for non-ASCII characters", which is untrue; RFC 2047 mechanisms (as amended by errata and RFC 2231) provide not only for non-ASCII text but also for language tagging as required by RFC 2277 for text. The UTF-8 scheme presented is claimed as "more straightforward and consistent internationalization", but it is not backwards compatible with existing implementations and fails to provide any mechanism for language tagging as required by BCP 18. When foisted upon existing mailto URI parsers, illegal message content will be generated, causing loss of interoperability due to the lack of backwards compatibility of that provision in the draft under discussion.

Section 2 ABNF uses "urlc", which is not defined anywhere. Note that per http://www.ietf.org/ID-Checklist.html, all ABNF is supposed to be checked for such errors. The text implies that "mailbox" and "address" per RFC 2822 are equivalent, whereas they are defined quite differently in that RFC; moreover, the field body of an RFC 2822 To field is an address-list, which is not mentioned in the draft under discussion. Text states that "reserved" characters must be encoded, but does not give a list of "reserved" characters or a reference.
RFC 3986 (listed as a normative reference, but not specifically mentioned w.r.t. "reserved") defines URI reserved characters as:

   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="

The draft text specifically mentions "parentheses, comma, and the percent sign" as common in mailbox syntax; parentheses and comma are forbidden in a mailbox (they are RFC 822/2822 "specials"), percent is not "reserved" (but has other issues in URIs) and is rather uncommon in mailboxes, and the required '@' character which appears in every address-list is not mentioned and is not encoded in the examples. And square brackets ('[' and ']') are explicitly used in domain literals which may be used in the domain of a mailbox. Colon appears in the RFC 822/2822 syntax of addresses which are named groups, and appears in the route portion of RFC 822 route-addrs. Forward slashes appear in X.400 derived mailboxes, and '!' can appear in local-parts (RFC 976). Finally, '<' and '>' are specials explicitly used in RFC 822/2822 angle-addrs (which may appear in mailboxes and addresses); while these are not "reserved", they may not appear (unencoded) in URIs. I believe that the '@' "reserved" character issue w.r.t. encoding has recently been discussed at length w.r.t. RFC 2368.

Percent-encoding is recommended for non-ASCII octets, but that is incompatible with existing mailto URI-to-message prototype implementations, and will result in illegal and incompatible content in the resulting message prototypes. There is some wishy-washy wording about "wish to maximize interoperability"; the simple fact is that the proposed change is not backwards compatible, full stop. The topic is carried to ridiculous extremes by requiring developers to implement something which is nowhere defined (paragraph labeled "3."; especially see the last sentence in that paragraph).
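
To make the "reserved" character point concrete, here is a minimal Python sketch. The character sets are copied from the RFC 3986 ABNF quoted above; the helper names `reserved_in` and `encode_strictly` are hypothetical and appear in neither the draft nor any RFC. Passing safe="" to the standard library's urllib.parse.quote is just one conservative way to percent-encode every character outside the RFC 3986 "unreserved" set; it is not an encoding rule taken from the draft.

```python
from urllib.parse import quote

# Character sets copied from the RFC 3986 ABNF for "reserved".
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
RESERVED = GEN_DELIMS + SUB_DELIMS


def reserved_in(text):
    """Return the RFC 3986 reserved characters found in text,
    in order of first appearance (hypothetical helper)."""
    seen = []
    for ch in text:
        if ch in RESERVED and ch not in seen:
            seen.append(ch)
    return seen


def encode_strictly(text):
    """Percent-encode everything except RFC 3986 unreserved
    characters. An assumption for illustration only, not a rule
    stated by the draft."""
    return quote(text, safe="")


if __name__ == "__main__":
    # A mailbox with a domain literal: '@', '[' and ']' are all reserved.
    mailbox = "user@[192.0.2.1]"
    print(reserved_in(mailbox))
    print(encode_strictly(mailbox))
    # '<' and '>' are not "reserved", yet still may not appear unencoded.
    print(reserved_in("<a!b@example.org>"))
    print(encode_strictly("<"))
```

Even the simplest mailbox contains the reserved '@', which is why the absence of any mention of it in the draft text is notable.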
Non-standard terminology which is inconsistent with standard terminology as defined and used in normative references (esp. RFC 2822) appears in the draft (except, curiously, in the second paragraph of draft section 3, which does use standard terminology). E.g. instead of "header name", the standard term is "header field name" or "field name" (RFC 2822 section 2.2).

The draft uses "body" in the same syntax as would be used for a header field name, but lacks any indication of how a generator or parser is supposed to differentiate message body from a header field named "Body", nor is there a message header field name registration template (BCP 90) reserving the header field name "Body".

Message header field names are comprised of printing characters excluding colon, and can therefore include characters such as '?', '=', and '&'. The draft does not specifically discuss how those or "reserved" characters are to be handled when they appear within a header field name (as opposed to parts of a mailto URI intended to be part of a field body or message body).

The draft seems to have a number of formatting/content anomalies: idnits reports:

  * The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement .
  * There are 52 instances of too long lines in the document, the longest one being 5 characters in excess of 72.
  [...]
  - Line 140 has weird spacing: '..." hname is...'
  - Line 161 has weird spacing: '...hvalues encod...'

There are also 3 empty lines following the formfeed after the last page (nothing is supposed to follow that formfeed character).

The examples at the end of section 2 do not meet syntax requirements; in particular the address-lists do not meet RFC 2822 syntax requirements as specified at the beginning of draft section 2 ("addr1", for example, is not a valid RFC 2822 address (or mailbox)).

Specification revisions, such as those proposed in the draft under discussion, should ideally be designed in a backwards compatible fashion.
When that is not possible, a "flag day" for universal change from the "old" to "new" format may be specified. Flag days are highly undesirable due to the disruption caused. The draft does something much worse; it requires a non-specific, poorly defined flag day: "once it is well deployed in software" (draft section 6). No mechanism is defined for determining precisely when that flag day is to take place.

The examples in section 7.1 are bracketed with less-than and greater-than symbols, unlike the examples in earlier draft sections. The examples fail to percent-encode "reserved" characters as required by earlier provisions in the draft.

Section 7.2 compounds inconsistency by returning to unbracketed examples. The first example in 7.2 will result in illegal content with existing, deployed mailto URI handlers. The second and third examples fail to percent-encode "reserved" characters. The fourth example will also result in illegal content with existing, deployed, mailto URI handlers; moreover, the draft implies that header fields which are NOT specified in the mailto URI are magically generated (Content-Type and Content-Transfer-Encoding fields are presented as having resulted from the example, but are nowhere specified in that example). It is unclear how the supposed determination of media type was made; for all I know, the content might have been intended by the mailto URI generator as describing a message body with media type image/png. The Subject field in the message prototype shows a charset specified, but the mailto URI specifies no such charset, and there is no indication of language. It is unclear how the Content-Transfer-Encoding field was created out of thin air, nor why quoted-printable (vs. base64) encoding was specified. The remaining examples in the section have similar issues.

Draft section 8 contains the incomprehensible text "of what is will be sent".
Section 8 also states that "MIME header[ field]s" are inappropriate, despite the fact that earlier examples use them (apparently generated from thin air). That text also mentions "Apparently-To", but there is no such message header field (RFC 4021). The same section mentions "SMTP 'Form' address", but it is unclear what that is supposed to mean (perhaps the SMTP envelope return path as specified as the SMTP MAIL FROM command argument, which is used for delivery notifications?). The last sentence of that section says "[RFC3490], and also apply". And what?

The IANA Considerations section has no mention of registration of a message header field name "Body" (see above).

There is no indication in the draft announcement, the draft heading, or in the draft Abstract of the intended status sought for this draft. The substantial changes proposed in the draft as currently written (viz. UTF-8 not encoded per RFCs 2047/2231 and errata) would preclude advancement to Draft status if they remain, but Draft status might be feasible w/o those incompatible changes (of course draft status would require a separate enumeration of at least two interoperable and independent implementations which fully conform with all provisions of the specification).

Some issues reported regarding RFC 2368 remain unaddressed by the draft under discussion:

The syntax permits some constructs corresponding to peculiar messages, e.g. a completely empty specification (save for "mailto:"), message body without any header fields. While it may be difficult or impractical to prevent some of that via ABNF, the normative text should probably warn against naive implementations that might generate invalid messages.

Within mailto URLs, the characters "?", "=", "&" are reserved. As with URL reserved characters, there does not appear to be any technical requirement to reserve all three of those characters in all parts of a mailto URL.
For example, neither "=" nor "&" should cause trouble in the "to" part of a mailto URL. Likewise "?" should be safe in "header".
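
The point about position-dependent delimiters can be illustrated with a small Python sketch of a lenient splitter. The function name `split_mailto` and the splitting strategy are assumptions made purely for illustration; this is not the parsing algorithm of RFC 2368 or of the draft. It splits on the first "?" only, so "=" and "&" pass through the "to" part untouched, and any later "?" remains part of a header value.

```python
from urllib.parse import unquote


def split_mailto(uri):
    """Split a mailto URI into (to_part, [(hname, hvalue), ...]).

    Hypothetical, lenient splitter: only the first '?' separates the
    "to" part from the headers; '&' and '=' are treated as delimiters
    only after that point.
    """
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")
    to_part, _, query = rest.partition("?")
    fields = []
    if query:
        for pair in query.split("&"):
            # Split on the first '=' only; later '=' stay in the value.
            name, _, value = pair.partition("=")
            fields.append((unquote(name), unquote(value)))
    return unquote(to_part), fields


if __name__ == "__main__":
    # '=' and '&' in the "to" part do not confuse this splitter.
    print(split_mailto("mailto:a=b&c@example.org?subject=hello"))
    # A second '?' ends up inside the header value.
    print(split_mailto("mailto:x@example.org?subject=why?"))
```

A parser that instead split on every "&" or "=" before locating the "?" would mangle the first address, which is presumably why RFC 2368 reserved all three characters everywhere.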
Received on Friday, 8 April 2005 18:55:29 UTC