- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Mon, 14 Feb 2005 13:57:29 +0100
- To: uri@w3.org
Martin Duerst wrote:
> The syntax of 'mailto' URIs from [RFC2368] is extended to
> be compatible with IRIs ([RFC3987]) for better
> internationalization.
Fascinating. I expected that it would be nearly impossible
to fix 2368 for some decades. ;-) And I didn't know that the
IANA URI registry is incomplete: they don't have RfC 2324:
| coffee-url = coffee-scheme ":" [ "//" host ]
| ["/" pot-designator ] ["?" additions-list ]
> the mailto URI scheme also allows setting mail header fields
> and the message body.
I'd like to add some minor points about this in 2368bis, the
old 2368-remark...
| Only the Subject, Keywords, and Body headers are believed to
| be both safe and useful.
...does not exactly reflect common practice. Cc: is clearer
than mailto:what@ever.example?to=an@other.example constructs.
Maybe add it below the "to" example for "hname":
> mailto:addr1%2C%20addr2
> is equivalent to
> mailto:?to=addr1%2C%20addr2
> is equivalent to
> mailto:addr1?to=addr2
The latter form is NOT RECOMMENDED. If the desired effect
is to specify a secondary recipient, mailto:addr1?cc=addr2
can be used.
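
To illustrate the equivalence, here is a rough Python sketch (not part of the draft). The helper names `split_addresses` and `recipients` are hypothetical, and splitting on commas after percent-decoding is a simplification that ignores quoted local parts containing commas.

```python
from urllib.parse import urlsplit, unquote

def split_addresses(value):
    # Simplification: split on commas, drop surrounding whitespace.
    return [a.strip() for a in value.split(",") if a.strip()]

def recipients(uri):
    """Return (to, cc) address lists for a mailto URI."""
    parts = urlsplit(uri)
    if parts.scheme != "mailto":
        raise ValueError("not a mailto URI: " + uri)
    to = split_addresses(unquote(parts.path))
    cc = []
    # Split the query by hand: "+" must not be turned into a space here.
    for field in filter(None, parts.query.split("&")):
        name, _, value = field.partition("=")
        name = unquote(name).lower()  # header names are case-insensitive
        if name == "to":
            to += split_addresses(unquote(value))
        elif name == "cc":
            cc += split_addresses(unquote(value))
    return to, cc

for uri in ("mailto:addr1%2C%20addr2",
            "mailto:?to=addr1%2C%20addr2",
            "mailto:addr1?to=addr2",
            "mailto:addr1?cc=addr2"):
    print(uri, recipients(uri))
```

The first three forms yield the same To list; only the cc form records addr2 as a secondary recipient.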
Back to your corrections:
> A previous version of the mailto URI scheme had severe
> limitations for non-ASCII characters.
That's dubious. All you really do is to add unencoded IDNs,
and IDNs didn't exist when 2368 was written. It's no "severe
limitation", if <a href="mailto:martin@d%C3%BCrst.example">
might work with future browsers, when the equivalent form
<a href="mailto:martin@xn--drst-0ra.example"> works today.
Sure, the IRI version needs two bytes less than the punycode
version in this example. OTOH it doesn't work with old user
agents.
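
For comparison, both forms of the host name can be produced with the Python standard library. This is only a sketch: Python's built-in "idna" codec implements IDNA 2003, which is assumed to be adequate for this example.

```python
from urllib.parse import quote

host = "dürst.example"

# IDNA (punycode) form: plain ASCII, so legacy user agents can handle it.
ace = host.encode("idna").decode("ascii")

# Percent-encoded UTF-8 form of the same host name.
pct = quote(host)

print(ace)                  # xn--drst-0ra.example
print(pct)                  # d%C3%BCrst.example
print(len(ace) - len(pct))  # 2
```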
[Found later in your draft: Okay, the body= UTF-8 stuff is
really new, and that could be seen as a "severe limitation"
today. I certainly don't miss it, my MUA does not support
the body= feature at all.]
> more straightforward and consistent internationalization.
Yes, in theory. But not yet in practice: if I publish a
mailto: URL somewhere, I want it to work for almost all
users today. You explain this later in chapter 6.
> hname = *urlc
> hvalue = *urlc
RfC 2368 and your draft use "urlc" without proper syntax or
explanation, please add something like this:
urlc = %d33-36 / %d38-60 / %d62 / %d64-126
RfC 3986 apparently says nothing about "<" and ">". Is this what
you want? Otherwise you get %d33-36 / %d38-59 / %d64-126.
Plain text examples:
mailto:no@body.example?subject=is%20<this>%20okay%3F
<mailto:no@body.example?subject=%3Cthat%3E%20is%20clear!>
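
The two candidate character sets can be compared with a small Python sketch. The names `char_set`, `URLC_WITH_ANGLES`, `URLC_NO_ANGLES` and `is_hvalue` are hypothetical, and the sketch assumes that percent-encoded octets are permitted alongside urlc characters, which the grammar quoted above does not spell out.

```python
import re

def char_set(*ranges):
    """Build a set of characters from inclusive (low, high) code points."""
    return {chr(c) for low, high in ranges for c in range(low, high + 1)}

# First proposal: excludes "%" (37), "=" (61), "?" (63); keeps "<" and ">".
URLC_WITH_ANGLES = char_set((33, 36), (38, 60), (62, 62), (64, 126))
# Alternative: additionally excludes "<" (60) and ">" (62).
URLC_NO_ANGLES = char_set((33, 36), (38, 59), (64, 126))

def is_hvalue(text, allowed):
    # Assumption: %XX escapes are always acceptable, so strip them first.
    rest = re.sub(r"%[0-9A-Fa-f]{2}", "", text)
    return all(ch in allowed for ch in rest)

for value in ("is%20<this>%20okay%3F", "%3Cthat%3E%20is%20clear!"):
    print(value,
          is_hvalue(value, URLC_WITH_ANGLES),
          is_hvalue(value, URLC_NO_ANGLES))
```

The only difference between the two sets is "<" and ">", so the first subject value passes under one definition and fails under the other, while the second passes under both.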
BTW, please add a note about mailto-IRIs in documents, where
the document charset is not UTF-8. If I got your draft right,
the idea is to use percent-encoded UTF-8 even if the document
charset is something else like Latin-1. Example, in this
article I use Latin-1, but a <mailto:martin@dürst.example> is
an invalid URL, and it's also no IRI.
A <mailto:body@check.example?body=dürst> is also invalid here,
or isn't it ? I'm not sure about these examples, there's no
obvious technical problem with this body=dürst parameter of a
2368-mailto-URL in a Latin-1 article.
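
The charset issue can be made concrete with a short Python sketch (an illustration only, not taken from the draft): the same text gives different percent-encodings depending on the charset, and a receiver assuming UTF-8 cannot decode the Latin-1 variant.

```python
from urllib.parse import quote, unquote

name = "dürst"
utf8_form = quote(name, encoding="utf-8")      # d%C3%BCrst
latin1_form = quote(name, encoding="latin-1")  # d%FCrst

# A receiver that assumes UTF-8 cannot decode the lone Latin-1 octet.
try:
    unquote(latin1_form, encoding="utf-8", errors="strict")
    latin1_ok = True
except UnicodeDecodeError:
    latin1_ok = False

print(utf8_form, latin1_form, latin1_ok)
```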
> URI producers should provide these domain names in the IDNA
> encoding, rather than percent-encoded, if they wish to
> maximize interoperability with legacy mailto: URI
> interpreters
Indeed, unfortunately you can't say SHOULD here.
> Percent-encoding in the LHS of an email address is reserved
> for potential future internationalization. Non-ASCII
> characters must first be encoded according to UTF-8 [STD63]
The first statement is only correct for non-ASCII; there's no
general problem with percent-encoding in the LHS of addresses
in mailto URLs. The "quoted string" case of a LHS can be very
weird.
> Within mailto URIs, the characters "?", "=", "&" are reserved.
Maybe add a forward reference to chapter 5 here about NO-WS-CTL
and WSP. I don't find a general rule about this issue in 3986,
probably I'm missing something obvious (?).
> 1. MIME encoded words (as defined in [RFC2047]) are permitted
> in header values, but not in an hvalue of a "body" hname.
That's clear. You aren't planning to invent a mailto-IRI-body,
or are you ? Oops, I found body=caf%C3%A9 later, now that's a
PITA, by using mailto-IRI-bodies the MUA is more or less forced
to generate a Content-Type: text/plain;charset=utf-8 with QP or
Base64. If you really want this, please say so not only in an
example. This has side effects on systems, where the default
local charset is _not_ Unicode (any of the UTFs).
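
A minimal Python sketch of what an MUA would have to do with such a body, using the standard `email` package. The choice of quoted-printable and the recipient address are assumptions made for the illustration.

```python
from email.message import EmailMessage
from urllib.parse import unquote

# Percent-encoded UTF-8 body value taken from a mailto URI.
body = unquote("caf%C3%A9")

msg = EmailMessage()
msg["To"] = "an@example"
# Non-ASCII text: declare UTF-8 and use a 7-bit-safe transfer encoding.
msg.set_content(body, charset="utf-8", cte="quoted-printable")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
print(msg.as_string())
```

On a system whose local charset is Latin-1, the MUA would either send UTF-8 as above or transcode the body, which is the side effect in question.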
> MIME encoded words and UTF-8-based percent-encoding SHOULD not
> both be used in the same hvalue.
Maybe you need a MUST NOT here, and definitely a NOT. Examples:
mailto:an@example?subject=%3D%3Fus-ascii%3FQ%3F1%3F%3D_2%3F%3D
mailto:an@example?subject=%3D%3Fus-ascii%3FQ%3FD%C3%BCrst%3F%3D
Whatever that is, it's no Subject: 1?= 2 or Subject: dürst. (?)
> The creator of a mailto URI cannot expect the resolver of a URI
> to understand more than the "subject" and "body" headers.
[...]
Here's a place where you could explain, why clients should try
to support in-reply-to, and how "URI producers" should use it.
[in the examples:]
> ?In-Reply-To=%3C3469A91.D10AF4C@example.com%3E>
Here's the place, where you could say that this should be the
Message-ID of the mail in question. One popular software gets
this wrong and apparently uses the last Message-ID found in the
References or in In-Reply-To to construct its mailto-URL. That
confuses the threading of mail replies based on the mailto-URL.
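
A sketch in Python of the intended behaviour, with a made-up message: the References entries and the helper name `reply_uri` are hypothetical, and only the Message-ID matches the draft's example.

```python
from email import message_from_string
from urllib.parse import quote

raw = (
    "From: joe@example.com\n"
    "Message-ID: <3469A91.D10AF4C@example.com>\n"
    "References: <older.1@example.com> <older.2@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

def reply_uri(msg, to):
    # Use the Message-ID of this mail, not the last entry of References.
    msg_id = msg["Message-ID"].strip()
    return ("mailto:" + quote(to, safe="@")
            + "?In-Reply-To=" + quote(msg_id, safe="@"))

print(reply_uri(msg, "joe@example.com"))
```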
> Another way of expressing the same thing:
> <mailto:?to=joe@example.com&cc=bob@example.com&body=hello>
Please delete this example, it's ugly. You already have this
variant in the paragraph about "to" as "hname".
> Click <a
> href="mailto:?to=joe@xyz.com&cc=bob@xyz.com&body=hello">
> mailto:?to=joe@xyz.com&cc=bob@xyz.com&body=hello</a> to
> send a greeting message to Joe and Bob.
I'd use an.example instead of xyz.com here, and replace the "to":
Click <a
href="mailto:joe@an.example?cc=bob@an.example&body=hello">
mailto:joe@an.example?cc=bob@an.example&body=hello</a> to
send a greeting message to Joe with a copy to Bob.
> mailto:user@example.org?subject=%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D
Maybe replace user@example.org by an@example if your examples
are otherwise too long for RfC lines.
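
The quoted example carries two layers of encoding. A short Python sketch with the standard library shows how a client might undo them, assuming the subject is the only field in the query.

```python
from email.header import decode_header
from urllib.parse import urlsplit, unquote

uri = ("mailto:user@example.org"
       "?subject=%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D")

query = urlsplit(uri).query
name, _, value = query.partition("=")

# Layer 1: percent-decoding yields the RFC 2047 encoded word.
encoded_word = unquote(value)

# Layer 2: decode the encoded word itself.
raw, charset = decode_header(encoded_word)[0]
subject = raw.decode(charset)

print(encoded_word)  # =?utf-8?Q?caf=C3=A9?=
print(subject)       # café
```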
> The software sending the email is not restricted to UTF-8, but
> can use other encodings.
It's more or less forced to stick to UTF-8 or maybe another UTF.
Otherwise it would have to analyze the mailto-IRI-body assuming
UTF-8 input. That's a major difference from traditional mailto-
URLs.
> The security considerations of [STD66], [RFC3490], and also
> apply. [RFC3987]
s/apply. [RFC3987]/[RFC3987] apply./
IMHO "also apply" is not good enough. Either add some of the
worst examples like say illegal UTF-8 encodings and phishing, or
urge the readers to really check out these "external" sources.
Please add a note, that a plain text <URL:mailto:an@example>
MUST NOT use any percent encoded UTF-8, and is by definition
a "visible with any browser" URL, not an IRI.
Bye, Frank
Received on Monday, 14 February 2005 17:33:16 UTC