- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 14 Feb 2005 19:21:18 +0900
- To: uri@w3.org
Dear URI experts,

I have just submitted the draft appended below to the Internet Drafts Editor. Here's the abstract for those that don't want to scroll:

   This document defines the format of Uniform Resource Identifiers (URI) for designating electronic mail addresses. The syntax of 'mailto' URIs from [RFC2368] is extended to be compatible with IRIs ([RFC3987]) for better internationalization.

Comments welcome!

Regards, Martin.

P.S.: Just in case, this already works in at least one browser (Opera)

------------------------------------------------------------------------

Network Working Group                                          M. Duerst
Internet-Draft                                       W3C/Keio University
Obsoletes: 2368 (if approved)                                L. Masinter
Expires: August 18, 2005                      Adobe Systems Incorporated
                                                       February 14, 2005

                         The mailto URI scheme
                       draft-duerst-mailto-bis-00

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she become aware will be disclosed, in accordance with RFC 3668.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 18, 2005.

Copyright Notice

Copyright (C) The Internet Society (2005).
Abstract

This document defines the format of Uniform Resource Identifiers (URI) for designating electronic mail addresses. The syntax of 'mailto' URIs from [RFC2368] is extended to be compatible with IRIs ([RFC3987]) for better internationalization.

Duerst & Masinter        Expires August 18, 2005                [Page 1]
Internet-Draft            The mailto URI scheme            February 2005

Table of Contents

   1.  Introduction
   2.  Syntax of a mailto URL
   3.  Semantics and Operations
   4.  Unsafe Headers
   5.  Encoding
   6.  Deployment of UTF-8-Based Percent-Encoding
   7.  Examples
       7.1  Examples Conforming to RFC2368
       7.2  Examples Using UTF-8-Based Percent-Encoding
   8.  Security Considerations
   9.  IANA Considerations
   10. Changes from RFC 2368
   11. Acknowledgments
   12. References
       12.1  Normative References
       12.2  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

The mailto URI scheme is used to designate the Internet mailing address of an individual or service. In its simplest form, a mailto URI contains an Internet mail address.
For interaction with resources that require message headers or message bodies to be specified, the mailto URI scheme also allows setting mail header fields and the message body.

A previous version of the mailto URI scheme had severe limitations for non-ASCII characters. This document extends the scheme to also allow character data to be percent-encoded based on UTF-8, as already seen in some implementations, for more straightforward and consistent internationalization.

Please send comments on this document to the mailing list uri@w3.org.

2. Syntax of a mailto URL

Following the syntax conventions of [STD66], and using the ABNF syntax defined in [RFC2234], a "mailto" URI has the form:

     mailtoURI = "mailto:" [ to ] [ headers ]
     to        = [ mailbox *("%2C" mailbox ) ]
     headers   = "?" header *( "&" header )
     header    = hname "=" hvalue
     hname     = *urlc
     hvalue    = *urlc

"mailbox" is as specified in [RFC2822], i.e. it is a mail address, possibly including "phrase" and "comment" components. However, the following changes apply:

1. All characters that can appear in "mailbox" but are reserved or not allowed in URIs have to be percent-encoded. Examples are parentheses, commas, and the percent sign ("%"), which commonly occur in the "mailbox" syntax.

2. Percent-encoding can be used to denote non-ASCII characters in the part of a "mailbox" that denotes a domain name, in order to denote an internationalized domain name. The considerations for reg-name in [STD66] apply. In particular, non-ASCII characters must first be encoded according to UTF-8 [STD63], and then each octet of the corresponding UTF-8 sequence must be percent-encoded to be represented as URI characters. URI producing applications must not use percent-encoding in domain names unless it is used to represent a UTF-8 character sequence.
When the internationalized domain name is used to compose a message, the name must be transformed to the IDNA encoding [RFC3490]. URI producers should provide these domain names in the IDNA encoding, rather than percent-encoded, if they wish to maximize interoperability with legacy mailto: URI interpreters.

3. Percent-encoding in the LHS of an email address is reserved for potential future internationalization. Non-ASCII characters must first be encoded according to UTF-8 [STD63], and then each octet of the corresponding UTF-8 sequence must be percent-encoded to be represented as URI characters. Any other percent-encoding of non-ASCII characters is prohibited. When a LHS containing non-ASCII characters will be used to compose a message, the LHS must be transformed to conform to whatever encoding may be defined in a future specification for the internationalization of email addresses.

"hname" and "hvalue" are encodings of an [RFC2822] header name and value, respectively. As with "to", all URI reserved characters must be encoded.

The special hname "body" indicates that the associated hvalue is the body of the message. The "body" hname should contain the content for the first text/plain body part of the message. The "body" hname is primarily intended for generation of short text messages for automatic processing (such as "subscribe" messages for mailing lists), not general MIME bodies.

Within mailto URIs, the characters "?", "=", "&" are reserved.

Because the "&" (ampersand) character is reserved in HTML and XML, any mailto URI which contains an ampersand must be spelled differently in HTML and XML than in other contexts. A mailto URI which appears in an HTML or XML document must escape the "&", e.g. as "&amp;".

Non-ASCII characters can be encoded in hvalue as follows:

1.
MIME encoded words (as defined in [RFC2047]) are permitted in header values, but not in an hvalue of a "body" hname.

2. Non-ASCII characters can be encoded according to UTF-8 [STD63], and then each octet of the corresponding UTF-8 sequence is percent-encoded to be represented as URI characters. When hvalues encoded in this way are used to compose a message, the hvalue must be transformed into MIME encoded words, except for an hvalue of a "body" hname, which has to be encoded according to [RFC2045]. Please note that for MIME encoded words and for bodies in composed email messages, encodings other than UTF-8 MAY be used as long as the characters are properly transcoded.

MIME encoded words and UTF-8-based percent-encoding SHOULD NOT both be used in the same hvalue.

Also note that it is legal to specify both "to" and an "hname" whose value is "to". That is,

     mailto:addr1%2C%20addr2

is equivalent to

     mailto:?to=addr1%2C%20addr2

is equivalent to

     mailto:addr1?to=addr2

3. Semantics and Operations

A mailto URI designates an "internet resource", which is the mailbox specified in the address. When additional headers are supplied, the resource designated is the same address, but with an additional profile for accessing the resource. While there are Internet resources that can only be accessed via electronic mail, the mailto URI is not intended as a way of retrieving such objects automatically.

In current practice, resolving URIs such as those in the "http" scheme causes an immediate interaction between client software and a host running an interactive server. The "mailto" URI has unusual semantics because resolving such a URI does not cause an immediate interaction. Instead, the client creates a message to the designated address with the various header fields set as default. The user can edit the message, send this message unedited, or choose not to send the message.
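As an illustration of how a client might take a mailto URI apart before creating the message, here is a minimal Python sketch. The function name parse_mailto is hypothetical and not part of this draft; the sketch assumes that all percent-encoded octets are UTF-8, that a repeated hname simply overwrites the earlier value, and that an hname of "to" is merged with the "to" part as described in Section 2.

```python
from urllib.parse import unquote


def parse_mailto(uri):
    """Split a mailto URI into (recipients, headers).

    Hypothetical helper: 'recipients' is a comma-separated string,
    'headers' maps lower-cased header names to decoded values.
    """
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")

    # Everything before the first "?" is the "to" part.
    to_part, _, query = rest.partition("?")
    recipients = [unquote(to_part)] if to_part else []
    headers = {}

    # Split on the reserved "&" and "=" BEFORE percent-decoding, so that
    # encoded "&", "=" and "?" inside names and values are kept as data.
    for field in filter(None, query.split("&")):
        name, _, value = field.partition("=")
        name = unquote(name).lower()
        value = unquote(value)  # octets are interpreted as UTF-8
        if name == "to":
            recipients.append(value)  # "to" as an hname adds recipients
        else:
            headers[name] = value

    return ", ".join(recipients), headers
```

With this sketch, the three equivalent forms given in Section 2 (mailto:addr1%2C%20addr2, mailto:?to=addr1%2C%20addr2, and mailto:addr1?to=addr2) all yield the recipient string "addr1, addr2".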
The operation of how any URI scheme is resolved is not mandated by the URI specifications.

4. Unsafe Headers

The user agent interpreting a mailto URI SHOULD choose not to create a message if any of the headers are considered dangerous; it may also choose to create a message with only a subset of the headers given in the URI. Only the Subject, Keywords, and Body headers are believed to be both safe and useful.

The creator of a mailto URI cannot expect the resolver of a URI to understand more than the "subject" and "body" headers. Clients that resolve mailto URIs into mail messages should be able to correctly create [RFC2822]-compliant mail messages using the "subject" and "body" headers.

5. Encoding

[STD66] requires that many characters in URIs be encoded. This affects the mailto scheme for some common characters that might appear in addresses, headers or message contents. One such character is space (" ", ASCII hex 20). Note the examples below that use "%20" for space in the message body. Also note that line breaks in the body of a message MUST be encoded with "%0D%0A".

People creating mailto URIs must be careful to encode any reserved characters that are used in the URIs so that properly-written URI interpreters can read them. Also, client software that reads URIs must be careful to decode strings before creating the mail message so that the mail messages appear in a form that the recipient will understand. These strings should be decoded before showing the message to the user.

The mailto URI scheme is limited in that it does not provide for substitution of variables. Thus, a message body that must include a user's email address cannot be encoded using the mailto URI. This limitation also prevents mailto URIs that are signed with public keys and other such variable information.

6.
Deployment of UTF-8-Based Percent-Encoding

UTF-8-based percent-encoding should only be used in actual mailto URIs once it is well deployed in software that interprets mailto URIs (such as mail user agents).

7. Examples

7.1 Examples Conforming to RFC2368

A URI for an ordinary individual mailing address:

     <mailto:chris@example.com>

A URI for a mail response system that requires the name of the file in the subject:

     <mailto:infobot@example.com?subject=current-issue>

A mail response system that requires a "send" request in the body:

     <mailto:infobot@example.com?body=send%20current-issue>

A similar URI could have two lines with different "send" requests (in this case, "send current-issue" and, on the next line, "send index"):

     <mailto:infobot@example.com?body=send%20current-issue%0D%0Asend%20index>

An interesting use of mailto URIs is when browsing archives of messages. Each browsed message might contain a mailto URI like:

     <mailto:foobar@example.com?In-Reply-To=%3C3469A91.D10AF4C@example.com%3E>

A request to subscribe to a mailing list:

     <mailto:majordomo@example.com?body=subscribe%20bamboo-l>

A URI for a single user which includes a CC of another user:

     <mailto:joe@example.com?cc=bob@example.com&body=hello>

Another way of expressing the same thing:

     <mailto:?to=joe@example.com&cc=bob@example.com&body=hello>

Note the use of the "&" reserved character, above. The following example, by using "?" twice, is incorrect:

     <mailto:joe@example.com?cc=bob@example.com?body=hello>   ; WRONG!

According to [RFC2822], the characters "?", "&", and even "%" may occur in addr-specs. The fact that they are reserved characters in this URI scheme is not a problem: those characters may appear in mailto URIs, they just may not appear in unencoded form. The standard URI encoding mechanisms ("%" followed by a two-digit hex number) must be used in these cases.
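The encoding rule just stated can be sketched in Python with the standard urllib.parse.quote function. The helper name build_mailto is hypothetical, and the choice to leave only "@" unencoded in the address and to encode everything outside the unreserved set in header names and values is an assumption of this sketch, stricter than the draft requires.

```python
from urllib.parse import quote


def build_mailto(address, headers=None):
    """Build a mailto URI from a plain address and optional header fields.

    Hypothetical helper: 'headers' is a dict of header name -> value.
    """
    # "@" is kept literal in the address; "?", "&", "%" and other
    # characters outside the unreserved set become "%" plus two hex digits.
    uri = "mailto:" + quote(address, safe="@")
    if headers:
        fields = [
            quote(name, safe="") + "=" + quote(value, safe="")
            for name, value in headers.items()
        ]
        uri += "?" + "&".join(fields)
    return uri


# A two-line body: space becomes %20, the line break becomes %0D%0A.
print(build_mailto("infobot@example.com",
                   {"body": "send current-issue\r\nsend index"}))
```

Because this sketch also encodes "@" inside header values, a cc value such as bob@example.com comes out as bob%40example.com; that is an equivalent, more conservative spelling than the one used in the examples above.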
To indicate the address "gorby%kremvax@example.com" one would do:

     <mailto:gorby%25kremvax@example.com>

To indicate the address "unlikely?address@example.com", and include another header, one would do:

     <mailto:unlikely%3Faddress@example.com?blat=foop>

As described above, the "&" (ampersand) character is reserved in HTML and must be replaced e.g. with "&amp;". Thus, a complex URI that has internal ampersands might look like:

     Click
     <a href="mailto:?to=joe@xyz.com&amp;cc=bob@xyz.com&amp;body=hello">
     mailto:?to=joe@xyz.com&amp;cc=bob@xyz.com&amp;body=hello</a> to
     send a greeting message to Joe and Bob.

7.2 Examples Using UTF-8-Based Percent-Encoding

Sending a mail with the subject "coffee" in French, i.e. "cafe" where the final e is an e-acute, using UTF-8 and percent-encoding:

     mailto:user@example.org?subject=caf%C3%A9

The same subject, this time using an encoded-word (escaping the "=" and "?" characters used in the encoded-word syntax, because they are reserved):

     mailto:user@example.org?subject=%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D

The same subject, this time encoded as iso-8859-1:

     mailto:user@example.org?subject=%3D%3Fiso-8859-1%3FQ%3Fcaf%3DE9%3F%3D

Going back to straight UTF-8 and adding a body with the same value:

     mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9

This mailto URI may result in a message looking like this:

     From: sender@example.net
     To: user@example.org
     Subject: =?utf-8?Q?caf=C3=A9?=
     Content-Type: text/plain;charset=utf-8
     Content-Transfer-Encoding: quoted-printable

     caf=C3=A9

The software sending the email is not restricted to UTF-8, but can use other encodings.
The following shows the same email using iso-8859-1 two times:

     From: sender@example.net
     To: user@example.org
     Subject: =?iso-8859-1?Q?caf=E9?=
     Content-Type: text/plain;charset=iso-8859-1
     Content-Transfer-Encoding: quoted-printable

     caf=E9

Different content transfer encodings (e.g. "8bit" or "base64" instead of "quoted-printable") and different encodings in encoded words (e.g. "B" instead of "Q") can also be used. For more examples of encoding the word coffee in different languages, see [RFC2324].

The following example uses the Japanese word "natto" (U+7D0D U+8C46) as a domain name label, sending a mail to a user at "natto".example.org:

     mailto:user@%E7%B4%8D%E8%B1%86.example.org?subject=Test&body=NATTO

When constructing the email, the domain name label is converted to punycode. The resulting message may look as follows:

     From: sender@example.net
     To: user@xn--99zt52a.example.org
     Subject: Test
     Content-Type: text/plain
     Content-Transfer-Encoding: 7bit

     NATTO

8. Security Considerations

The mailto scheme can be used to send a message from one user to another, and thus can introduce many security concerns. Mail messages can be logged at the originating site, the recipient site, and intermediary sites along the delivery path. If the messages are not encrypted, they can also be read at any of those sites.

A mailto URI gives a template for a message that can be sent by mail client software. The contents of that template may be opaque or difficult to read by the user at the time of specifying the URI. Thus, a mail client should never send a message based on a mailto URI without first showing the user the full message that will be sent (including all headers that were specified by the mailto URI), fully decoded, and asking the user for approval to send the message as electronic mail.
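One way a client could follow this advice is sketched below in Python, using the standard email.message.EmailMessage class. The function name draft_from_mailto and the SAFE_HEADERS set are hypothetical; the sketch assumes a deliberately conservative policy that keeps only "to", "subject", "keywords", and "body", and it reports every other header so that the user interface can show what was ignored. It only builds a draft for display and never sends anything.

```python
from email.message import EmailMessage
from urllib.parse import unquote

# Assumption of this sketch: only these header fields (plus "to" and
# "body", which are handled separately) are taken from the URI.
SAFE_HEADERS = {"subject", "keywords"}


def draft_from_mailto(uri):
    """Return (draft, dropped): a message for user review and the list
    of header names from the URI that were NOT applied."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")
    to_part, _, query = rest.partition("?")

    draft = EmailMessage()
    recipients = [unquote(to_part)] if to_part else []
    dropped = []
    body = ""

    for field in filter(None, query.split("&")):
        name, _, value = field.partition("=")
        name, value = unquote(name).lower(), unquote(value)
        if name == "to":
            recipients.append(value)
        elif name == "body":
            body = value
        elif name in SAFE_HEADERS:
            draft[name.capitalize()] = value
        else:
            dropped.append(name)  # e.g. "from", "bcc": never interpreted

    draft["To"] = ", ".join(recipients)
    draft.set_content(body)
    return draft, dropped


draft, dropped = draft_from_mailto(
    "mailto:joe@example.com?bcc=eve@example.com&subject=caf%C3%A9&body=hello")
# The decoded draft and the ignored headers are what the user should see
# before being asked for approval.
print(draft["To"], draft["Subject"], dropped)
```

A real mail client may reasonably accept more headers (for example "cc") than this sketch does; the point is only that the decision is made per header and that the user sees the decoded result before anything is sent.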
The mail client should also make it clear that the user is about to send an electronic mail message, since the user may not be aware that this is the result of a mailto URI.

A mail client should never send anything without complete disclosure to the user of what will be sent; it should disclose not only the message destination, but also any headers. Unrecognized headers, or headers with values inconsistent with those the mail client would normally send, should be especially suspect. MIME headers (MIME-Version, Content-*) are most likely inappropriate, as are those relating to routing (From, Bcc, Apparently-To, etc.).

Note that some headers are inherently unsafe to include in a message generated from a URI. For example, headers such as "From:", "Bcc:", and so on, should never be interpreted from a URI. In general, the fewer headers interpreted from the URI, the less likely it is that a sending agent will create an unsafe message.

Examples of problems with sending unapproved mail include:

   o  mail that breaks laws upon delivery, such as making illegal threats;
   o  mail that identifies the sender as someone interested in breaking laws;
   o  mail that identifies the sender to an unwanted third party;
   o  mail that causes a financial charge to be incurred on the sender;
   o  mail that causes an action on the recipient machine that causes damage that might be attributed to the sender.

Programs that interpret mailto URIs should ensure that the SMTP "From" address is set and correct.

The security considerations of [STD66], [RFC3490], and [RFC3987] also apply.

9. IANA Considerations

This document changes the definition of the mailto: URI scheme; the registry of URI schemes should refer to this document rather than its predecessor, [RFC2368].

10.
Changes from RFC 2368

   o  For interoperability with IRIs ([RFC3987]), allowed percent-encoding, fixed to UTF-8, in the domain name part of an email address, in the LHS part of an address (currently reserved because not operationally usable), and in hvalue parts.
   o  Changed from 'URL' to 'URI'.
   o  Updated references: ABNF to [RFC2234]; message syntax to [RFC2822]; URI Generic Syntax to [STD66].
   o  Expanded "#mailbox", because the "#" shortcut is no longer available; needs checking.

11. Acknowledgments

This document was derived from [RFC2368]; the acknowledgments from that specification still apply. In addition, we thank Paul Hoffman and Jamie Zawinski for their work on [RFC2368].

Valuable input on this document was received from: Paul Hoffman.

12. References

12.1 Normative References

[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[RFC2047] Moore, K., "MIME Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.

[RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001.

[RFC3490] Faltstrom, P., Hoffman, P. and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.

[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003.

[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource Identifiers (IRIs)", RFC 3987, January 2005.

[STD63] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003.

[STD66] Berners-Lee, T., Fielding, R. and L.
Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.

12.2 Informative References

[RFC2324] Masinter, L., "Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0)", RFC 2324, April 1998.

[RFC2368] Hoffman, P., Masinter, L. and J. Zawinski, "The mailto URL scheme", RFC 2368, July 1998.

Authors' Addresses

   Martin Duerst (Note: Please write "Duerst" with u-umlaut wherever possible, for example as "D&#252;rst" in XML and HTML.)
   World Wide Web Consortium/Keio University
   5322 Endo
   Fujisawa, Kanagawa 252-8520
   Japan

   Phone: +81 466 49 1170
   Fax:   +81 466 49 1171
   Email: mailto:duerst@w3.org
   URI:   http://www.w3.org/People/D%C3%BCrst/

   Larry Masinter
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   Phone: +1-408-536-3024
   Email: LMM@acm.org
   URI:   http://larry.masinter.net/

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2005). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.
Received on Monday, 14 February 2005 11:59:00 UTC