- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 14 Feb 2005 19:21:18 +0900
- To: uri@w3.org
Dear URI experts,
I have just submitted the draft appended below to the Internet
Drafts Editor. Here's the abstract for those that don't want
to scroll:
This document defines the format of Uniform Resource Identifiers
(URI) for designating electronic mail addresses. The syntax of
'mailto' URIs from [RFC2368] is extended to be compatible with IRIs
([RFC3987]) for better internationalization.
Comments welcome!
Regards, Martin.
P.S.: Just in case, this already works in at least one browser (Opera)
------------------------------------------------------------------------
Network Working Group M. Duerst
Internet-Draft W3C/Keio University
Obsoletes: 2368 (if approved) L. Masinter
Expires: August 18, 2005 Adobe Systems Incorporated
February 14, 2005
The mailto URI scheme
draft-duerst-mailto-bis-00
Status of this Memo
This document is an Internet-Draft and is subject to all provisions
of Section 3 of RFC 3667. By submitting this Internet-Draft, each
author represents that any applicable patent or other IPR claims of
which he or she is aware have been or will be disclosed, and any of
which he or she become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 18, 2005.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract
This document defines the format of Uniform Resource Identifiers
(URI) for designating electronic mail addresses. The syntax of
'mailto' URIs from [RFC2368] is extended to be compatible with IRIs
([RFC3987]) for better internationalization.
Duerst & Masinter Expires August 18, 2005 [Page 1]
Internet-Draft The mailto URI scheme February 2005
Table of Contents
1. Introduction
2. Syntax of a mailto URI
3. Semantics and Operations
4. Unsafe Headers
5. Encoding
6. Deployment of UTF-8-Based Percent-Encoding
7. Examples
7.1 Examples Conforming to RFC2368
7.2 Examples Using UTF-8-Based Percent-Encoding
8. Security Considerations
9. IANA Considerations
10. Changes from RFC 2368
11. Acknowledgments
12. References
12.1 Normative References
12.2 Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
The mailto URI scheme is used to designate the Internet mailing
address of an individual or service. In its simplest form, a mailto
URI contains an Internet mail address. For interaction with
resources that requires message headers or message bodies to be
specified, the mailto URI scheme also allows setting mail header
fields and the message body.
The previous version of the mailto URI scheme, [RFC2368], had severe
limitations for non-ASCII characters. This document extends the
scheme to also allow character data to be percent-encoded based on
UTF-8, as already seen in some implementations, for more
straightforward and consistent internationalization.
Please send comments on this document to the mailing list uri@w3.org.
2. Syntax of a mailto URI
Following the syntax conventions of [STD66], and using the ABNF
syntax defined in [RFC2234], a "mailto" URI has the form:
mailtoURI = "mailto:" [ to ] [ headers ]
to = [ mailbox *("%2C" mailbox ) ]
headers = "?" header *( "&" header )
header = hname "=" hvalue
hname = *urlc
hvalue = *urlc
"mailbox" is as specified in [RFC2822], i.e. it is a mail address,
possibly including "phrase" and "comment" components. However, the
following changes apply:
1. All characters that can appear in "mailbox" but are reserved or
not allowed in URIs have to be percent-encoded. Examples are
parentheses, commas, and the percent sign ("%"), which commonly
occur in the "mailbox" syntax.
2. Percent-encoding can be used to denote non-ASCII characters in
the part of a "mailbox" that denotes a domain name, in order to
denote an internationalized domain name. The considerations for
reg-name in [STD66] apply. In particular, non-ASCII characters
must first be encoded according to UTF-8 [STD63], and then each
octet of the corresponding UTF-8 sequence must be percent-encoded
to be represented as URI characters. URI producing applications
must not use percent-encoding in domain names unless it is used
to represent a UTF-8 character sequence. When the
internationalized domain name is used to compose a message, the
name must be transformed to the IDNA encoding [RFC3490]. URI
producers should provide these domain names in the IDNA encoding,
rather than percent-encoded, if they wish to maximize
interoperability with legacy mailto: URI interpreters.
3. Percent-encoding in the LHS (the part before the "@") of an email
address is reserved for potential future internationalization.
Non-ASCII characters must first be encoded according to UTF-8
[STD63], and then each octet of the corresponding UTF-8 sequence
must be percent-encoded to be represented as URI characters. Any
other percent-encoding of non-ASCII characters is prohibited.
When an LHS containing non-ASCII characters is used to compose a
message, the LHS must be transformed to conform to whatever
encoding may be defined in a future specification for the
internationalization of email addresses.
"hname" and "hvalue" are encodings of an [RFC2822] header name and
value, respectively. As with "to", all URI reserved characters must
be encoded.
The special hname "body" indicates that the associated hvalue is the
body of the message. This hvalue should contain the content for the
first text/plain body part of the message. The "body" hname is
primarily intended for generation of short text messages for
automatic processing (such as "subscribe" messages for mailing
lists), not general MIME bodies.
Within mailto URIs, the characters "?", "=", and "&" are reserved.
Because the "&" (ampersand) character is reserved in HTML and XML,
any mailto URI which contains an ampersand must be spelled
differently in HTML and XML than in other contexts. A mailto URI
which appears in an HTML or XML document must escape the "&", e.g.
as "&amp;".
Non-ASCII characters can be encoded in hvalue as follows:
1. MIME encoded words (as defined in [RFC2047]) are permitted in
header values, but not in an hvalue of a "body" hname.
2. Non-ASCII characters can be encoded according to UTF-8 [STD63],
and then each octet of the corresponding UTF-8 sequence is
percent-encoded to be represented as URI characters. When
hvalues encoded in this way are used to compose a message, the
hvalue must be transformed into MIME encoded words, except for an
hvalue of a "body" hname, which has to be encoded according to
[RFC2045]. Please note that for MIME encoded words and for
bodies in composed email messages, encodings other than UTF-8 MAY
be used as long as the characters are properly transcoded.
MIME encoded words and UTF-8-based percent-encoding SHOULD NOT both
be used in the same hvalue.
Also note that it is legal to specify both "to" and an "hname" whose
value is "to". That is,
mailto:addr1%2C%20addr2
is equivalent to
mailto:?to=addr1%2C%20addr2
is equivalent to
mailto:addr1?to=addr2
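As an illustration only, the equivalence of these three forms can be
checked with a small parser. The following Python sketch is not part
of this specification: the function name parse_mailto is
hypothetical, and splitting the decoded address list on commas is a
simplifying assumption that ignores commas inside quoted phrases or
comments.

```python
from urllib.parse import unquote


def parse_mailto(uri):
    """Split a mailto URI into (recipients, other header pairs)."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")
    to_part, _, query = rest.partition("?")
    recipients = []
    headers = []

    def add_recipients(encoded):
        # Decode first: mailboxes are separated by an encoded comma (%2C).
        # Assumption: no comma occurs inside a phrase or comment.
        for addr in unquote(encoded).split(","):
            if addr.strip():
                recipients.append(addr.strip())

    add_recipients(to_part)
    fields = query.split("&") if query else []
    for field in fields:
        hname, _, hvalue = field.partition("=")
        if unquote(hname).lower() == "to":
            # A "to" hname adds to the recipients given before the "?".
            add_recipients(hvalue)
        else:
            headers.append((unquote(hname).lower(), unquote(hvalue)))
    return recipients, headers
```

Under these assumptions, all three URIs above yield the same
recipient list, "addr1" followed by "addr2".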
3. Semantics and Operations
A mailto URI designates an "internet resource", which is the mailbox
specified in the address. When additional headers are supplied, the
resource designated is the same address, but with an additional
profile for accessing the resource. While there are Internet
resources that can only be accessed via electronic mail, the mailto
URI is not intended as a way of retrieving such objects
automatically.
In current practice, resolving URIs such as those in the "http"
scheme causes an immediate interaction between client software and a
host running an interactive server. The "mailto" URI has unusual
semantics because resolving such a URI does not cause an immediate
interaction. Instead, the client creates a message to the designated
address with the various header fields set as default. The user can
edit the message, send this message unedited, or choose not to send
the message. The operation of how any URI scheme is resolved is not
mandated by the URI specifications.
4. Unsafe Headers
The user agent interpreting a mailto URI SHOULD choose not to create
a message if any of the headers are considered dangerous; it may also
choose to create a message with only a subset of the headers given in
the URI. Only the Subject, Keywords, and Body headers are believed
to be both safe and useful.
The creator of a mailto URI cannot expect the resolver of a URI to
understand more than the "subject" and "body" headers. Clients that
resolve mailto URIs into mail messages should be able to correctly
create [RFC2822]-compliant mail messages using the "subject" and
"body" headers.
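One possible way for a user agent to apply this policy is an
allowlist. The Python sketch below is an assumption-laden
illustration, not a required behavior: the names SAFE_HNAMES and
filter_headers are hypothetical, and the allowlist simply mirrors the
three headers named above.

```python
# Headers this section describes as believed safe and useful.
# Treating exactly this set as the allowlist is an assumption.
SAFE_HNAMES = {"subject", "keywords", "body"}


def filter_headers(headers):
    """Split (hname, hvalue) pairs into (kept, dropped) lists."""
    kept, dropped = [], []
    for hname, hvalue in headers:
        # Header names are compared case-insensitively.
        if hname.lower() in SAFE_HNAMES:
            kept.append((hname, hvalue))
        else:
            dropped.append((hname, hvalue))
    return kept, dropped
```

A user agent could then either refuse to create the message when the
dropped list is non-empty, or create it from the kept headers only;
both choices are permitted by the text above.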
5. Encoding
[STD66] requires that many characters in URIs be encoded. This
affects the mailto scheme for some common characters that might
appear in addresses, headers or message contents. One such character
is space (" ", ASCII hex 20). Note the examples below that use "%20"
for space in the message body. Also note that line breaks in the
body of a message MUST be encoded with "%0D%0A".
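The following Python sketch shows one way a producer might encode a
message body under these rules. It is only an illustration: the
function name encode_body is hypothetical, and the sketch encodes
more characters than strictly required, which is harmless.

```python
from urllib.parse import quote


def encode_body(text):
    """Percent-encode a message body for use as the hvalue of "body"."""
    # Normalize every line break to CRLF so that it is emitted as %0D%0A.
    # Note: splitlines() drops a single trailing line break.
    normalized = "\r\n".join(text.splitlines())
    # safe="" percent-encodes everything except unreserved characters,
    # including space (as %20), "&", "=", "?" and "/".
    return quote(normalized, safe="")
```

For the two-line body "send current-issue" and "send index", this
yields send%20current-issue%0D%0Asend%20index.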
People creating mailto URIs must be careful to encode any reserved
characters that are used in the URIs so that properly-written URI
interpreters can read them. Also, client software that reads URIs
must be careful to decode strings before creating the mail message so
that the mail messages appear in a form that the recipient will
understand. These strings should be decoded before showing the
message to the user.
The mailto URI scheme is limited in that it does not provide for
substitution of variables. Thus, a message body that must include a
user's email address cannot be encoded using the mailto URI. This
limitation also rules out mailto URIs that carry public-key
signatures or other such variable information.
6. Deployment of UTF-8-Based Percent-Encoding
UTF-8-based percent-encoding should only be used in actual mailto
URIs once it is well deployed in software that interprets mailto URIs
(such as mail user agents).
7. Examples
7.1 Examples Conforming to RFC2368
A URI for an ordinary individual mailing address:
<mailto:chris@example.com>
A URI for a mail response system that requires the name of the file
in the subject:
<mailto:infobot@example.com?subject=current-issue>
A mail response system that requires a "send" request in the body:
<mailto:infobot@example.com?body=send%20current-issue>
A similar URI could have two lines with different "send" requests (in
this case, "send current-issue" and, on the next line, "send index"):
<mailto:infobot@example.com?body=send%20current-issue%0D%0Asend%20index>
An interesting use of mailto URIs is when browsing archives of
messages. Each browsed message might contain a mailto URI like:
<mailto:foobar@example.com?In-Reply-To=%3C3469A91.D10AF4C@example.com%3E>
A request to subscribe to a mailing list:
<mailto:majordomo@example.com?body=subscribe%20bamboo-l>
A URI for a single user which includes a CC of another user:
<mailto:joe@example.com?cc=bob@example.com&body=hello>
Another way of expressing the same thing:
<mailto:?to=joe@example.com&cc=bob@example.com&body=hello>
Note the use of the "&" reserved character, above. The following
example, by using "?" twice, is incorrect:
<mailto:joe@example.com?cc=bob@example.com?body=hello> ; WRONG!
According to [RFC2822], the characters "?", "&", and even "%" may
occur in addr-specs. The fact that they are reserved characters in
this URI scheme is not a problem: those characters may appear in
mailto URIs, they just may not appear in unencoded form. The
standard URI encoding mechanisms ("%" followed by a two-digit hex
number) must be used in these cases.
To indicate the address "gorby%kremvax@example.com" one would do:
<mailto:gorby%25kremvax@example.com>
To indicate the address "unlikely?address@example.com", and include
another header, one would do:
<mailto:unlikely%3Faddress@example.com?blat=foop>
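The two addresses above can be encoded mechanically. The Python
sketch below is illustrative only; the function name encode_addr_spec
is hypothetical, and leaving only "@" unencoded is an assumption that
percent-encodes more characters than strictly necessary.

```python
from urllib.parse import quote


def encode_addr_spec(addr):
    """Percent-encode an addr-spec for the "to" part of a mailto URI."""
    # "@" stays literal; "%", "?" and "&" become %25, %3F and %26.
    return quote(addr, safe="@")
```

For example, encode_addr_spec("gorby%kremvax@example.com") gives
gorby%25kremvax@example.com.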
As described above, the "&" (ampersand) character is reserved in HTML
and must be replaced e.g. with "&amp;". Thus, a complex URI that
has internal ampersands might look like:
Click <a
href="mailto:?to=joe@xyz.com&amp;cc=bob@xyz.com&amp;body=hello">
mailto:?to=joe@xyz.com&amp;cc=bob@xyz.com&amp;body=hello</a> to send
a greeting message to Joe and Bob.
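When such markup is generated by software, the escaping can be left
to a library. The following Python sketch assumes the standard
html.escape function; the helper name mailto_link is hypothetical.

```python
from html import escape


def mailto_link(uri, label):
    """Return an HTML anchor for a mailto URI, with "&" escaped."""
    # quote=True also escapes double quotes, so the result is safe
    # inside a double-quoted attribute value.
    return '<a href="{}">{}</a>'.format(escape(uri, quote=True), escape(label))
```

Only the HTML form of the URI changes; after the HTML parser has
unescaped the attribute, the URI seen by the mail client is the
original one with plain "&" separators.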
7.2 Examples Using UTF-8-Based Percent-Encoding
Sending a mail with the subject "coffee" in French, i.e. "cafe"
where the final e is an e-acute, using UTF-8 and percent-encoding:
mailto:user@example.org?subject=caf%C3%A9
The same subject, this time using an encoded-word (escaping the "="
and "?" characters used in the encoded-word syntax, because they are
reserved):
mailto:user@example.org?subject=%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D
The same subject, this time encoded as iso-8859-1:
mailto:user@example.org?subject=%3D%3Fiso-8859-1%3FQ%3Fcaf%3DE9%3F%3D
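The three hvalues above can be reproduced with a generic
percent-encoder. This Python sketch is illustrative; the variable
names are hypothetical, and the encoded-words are written out by hand
rather than generated.

```python
from urllib.parse import quote

subject = "caf\u00e9"  # "cafe" with a final e-acute

# UTF-8-based percent-encoding: each UTF-8 octet is percent-encoded.
utf8_form = quote(subject, safe="")

# Encoded-words: the "=" and "?" of the encoded-word syntax are
# reserved in mailto URIs, so they are percent-encoded in turn.
encoded_word_form = quote("=?utf-8?Q?caf=C3=A9?=", safe="")
latin1_form = quote("=?iso-8859-1?Q?caf=E9?=", safe="")

print(utf8_form)          # caf%C3%A9
print(encoded_word_form)  # %3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D
print(latin1_form)        # %3D%3Fiso-8859-1%3FQ%3Fcaf%3DE9%3F%3D
```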
Going back to straight UTF-8 and adding a body with the same value:
mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9
This mailto URI may result in a message looking like this:
From: sender@example.net
To: user@example.org
Subject: =?utf-8?Q?caf=C3=A9?=
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable
caf=C3=A9
The software sending the email is not restricted to UTF-8, but can
use other encodings. The following shows the same email using
iso-8859-1 for both the subject and the body:
From: sender@example.net
To: user@example.org
Subject: =?iso-8859-1?Q?caf=E9?=
Content-Type: text/plain;charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
caf=E9
Different content transfer encodings (e.g. "8bit" or "base64"
instead of "quoted-printable") and different encodings in encoded
words (e.g. "B" instead of "Q") can also be used.
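A mail client might derive such a Subject header from the hvalue as
sketched below in Python. This is a minimal sketch under stated
assumptions: the function name subject_header is hypothetical, and
the standard email.header.Header class chooses between "Q" and "B"
by itself, so the exact encoded-word it emits may differ from the
examples above while decoding to the same text.

```python
from email.header import Header, decode_header, make_header
from urllib.parse import unquote


def subject_header(hvalue, charset="utf-8"):
    """Turn a percent-encoded hvalue into an encoded-word header value."""
    # Percent-decoding interprets the octets as UTF-8.
    text = unquote(hvalue)
    # Transcode to the requested charset and wrap as an encoded-word.
    return Header(text, charset).encode()


def decode_subject(value):
    """Decode an encoded-word header value back to text (for checking)."""
    return str(make_header(decode_header(value)))
```

For instance, decode_subject(subject_header("caf%C3%A9", "iso-8859-1"))
returns the original word again, which shows that the choice of
charset in the composed message does not alter the text.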
For more examples of encoding the word coffee in different languages,
see [RFC2324].
The following example uses the Japanese word "natto" (U+7D0D U+8C46)
as a domain name label, sending a mail to a user at
"natto".example.org:
mailto:user@%E7%B4%8D%E8%B1%86.example.org?subject=Test&body=NATTO
When constructing the email, the domain name label is converted to
punycode. The resulting message may look as follows:
From: sender@example.net
To: user@xn--99zt52a.example.org
Subject: Test
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
NATTO
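The conversion of the domain name in this example can be sketched in
Python as follows. The sketch assumes Python's built-in "idna" codec,
which implements [RFC3490]; the function name to_idna_address is
hypothetical, and the local part is passed through unchanged.

```python
from urllib.parse import unquote


def to_idna_address(encoded_addr):
    """Convert a percent-encoded address to one with an IDNA domain."""
    # Percent-decode: UTF-8 octets become characters.
    addr = unquote(encoded_addr)
    # Split at the last "@" so the domain is everything after it.
    local, _, domain = addr.rpartition("@")
    # The "idna" codec converts each non-ASCII label to its "xn--" form.
    ace_domain = domain.encode("idna").decode("ascii")
    return local + "@" + ace_domain
```

Applied to user@%E7%B4%8D%E8%B1%86.example.org, this returns
user@xn--99zt52a.example.org, matching the To header above.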
8. Security Considerations
The mailto scheme can be used to send a message from one user to
another, and thus can introduce many security concerns. Mail
messages can be logged at the originating site, the recipient site,
and intermediary sites along the delivery path. If the messages are
not encrypted, they can also be read at any of those sites.
A mailto URI gives a template for a message that can be sent by mail
client software. The contents of that template may be opaque or
difficult to read by the user at the time of specifying the URI.
Thus, a mail client should never send a message based on a mailto URI
without first showing the user the full message that will be sent
(including all headers that were specified by the mailto URI), fully
decoded, and asking the user for approval to send the message as
electronic mail. The mail client should also make it clear that the
user is about to send an electronic mail message, since the user may
not be aware that this is the result of a mailto URI.
A mail client should never send anything without complete disclosure
to the user of what will be sent; it should disclose not only the
message destination, but also any headers. Unrecognized headers, or
headers with values inconsistent with those the mail client would
normally send should be especially suspect. MIME headers (MIME-
Version, Content-*) are most likely inappropriate, as are those
relating to routing (From, Bcc, Apparently-To, etc.)
Note that some headers are inherently unsafe to include in a message
generated from a URI. For example, headers such as "From:", "Bcc:",
and so on, should never be interpreted from a URI. In general, the
fewer headers interpreted from the URI, the less likely it is that a
sending agent will create an unsafe message.
Examples of problems with sending unapproved mail include:
mail that breaks laws upon delivery, such as making illegal
threats;
mail that identifies the sender as someone interested in breaking
laws;
mail that identifies the sender to an unwanted third party;
mail that causes a financial charge to be incurred on the sender;
mail that causes an action on the recipient machine that causes
damage that might be attributed to the sender.
Programs that interpret mailto URIs should ensure that the SMTP
"From" address is set and correct.
The security considerations of [STD66], [RFC3490], and [RFC3987] also
apply.
9. IANA Considerations
This document changes the definition of the mailto: URI scheme; the
registry of URI schemes should refer to this document rather than its
predecessor, [RFC2368].
10. Changes from RFC 2368
For interoperability with IRIs ([RFC3987]), allowed
percent-encoding, fixed to UTF-8, in the domain name part of an
email address, in the LHS part of an address (currently reserved
because not operationally usable), and in hvalue parts.
Changed from 'URL' to 'URI'
Updated references: ABNF to [RFC2234]; message syntax to
[RFC2822], URI Generic Syntax to [STD66]
Expanded "#mailbox", because the "#" shortcut is no longer
available; needs checking
11. Acknowledgments
This document was derived from [RFC2368]; the acknowledgments from
that specification still apply. In addition, we thank Paul Hoffman
and Jamie Zawinski for their work on [RFC2368].
Valuable input on this document was received from: Paul Hoffman.
12. References
12.1 Normative References
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[RFC2047] Moore, K., "MIME Part Three: Message Header Extensions for
Non-ASCII Text", RFC 2047, November 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997.
[RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April
2001.
[RFC3490] Faltstrom, P., Hoffman, P. and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003.
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)",
RFC 3491, March 2003.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource
Identifiers (IRIs)", RFC 3987, January 2005.
[STD63] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, November 2003.
[STD66] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, April 2004.
12.2 Informative References
[RFC2324] Masinter, L., "Hyper Text Coffee Pot Control Protocol
(HTCPCP/1.0)", RFC 2324, April 1998.
[RFC2368] Hoffman, P., Masinter, L. and J. Zawinski, "The mailto URL
scheme", RFC 2368, July 1998.
Authors' Addresses
Martin Duerst (Note: Please write "Duerst" with u-umlaut wherever
possible, for example as "Dürst" in XML and HTML.)
World Wide Web Consortium/Keio University
5322 Endo
Fujisawa, Kanagawa 252-8520
Japan
Phone: +81 466 49 1170
Fax: +81 466 49 1171
Email: mailto:duerst@w3.org
URI: http://www.w3.org/People/D%C3%BCrst/
Larry Masinter
Adobe Systems Incorporated
345 Park Ave
San Jose, CA 95110
USA
Phone: +1-408-536-3024
Email: LMM@acm.org
URI: http://larry.masinter.net/
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.
Received on Monday, 14 February 2005 11:59:00 UTC