- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Mon, 14 Feb 2005 13:57:29 +0100
- To: uri@w3.org
Martin Duerst wrote:

> The syntax of 'mailto' URIs from [RFC2368] is extended to
> be compatible with IRIs ([RFC3987]) for better
> internationalization.

Fascinating.  I expected that it would be near to impossible
to fix 2368 for some decades. ;-)

And I didn't know that the IANA URI-registry is incomplete,
they don't have RfC 2324:

| coffee-url = coffee-scheme ":" [ "//" host ]
|              ["/" pot-designator ] ["?" additions-list ]

> the mailto URI scheme also allows setting mail header fields
> and the message body.

I'd like to add some minor points about this in 2368bis, the
old 2368-remark...

| Only the Subject, Keywords, and Body headers are believed to
| be both safe and useful.

...does not exactly reflect common practice.  Cc: is clearer
than mailto:what@ever.example?to=an@other.example constructs.

Maybe add it below the "to" example for "hname":

> mailto:addr1%2C%20addr2
> is equivalent to
> mailto:?to=addr1%2C%20addr2
> is equivalent to
> mailto:addr1?to=addr2

  The latter form is NOT RECOMMENDED.  If the desired effect
  is to specify a secondary recipient
  mailto:addr1?cc=addr2
  can be used.

Back to your corrections:

> A previous version of the mailto URI scheme had severe
> limitations for non-ASCII characters.

That's dubious.  All you really do is to add unencoded IDNs,
and IDNs didn't exist when 2368 was written.  It's no "severe
limitation", if <a href="mailto:martin@d%C3%BCrst.example">
might work with future browsers, when the equivalent form
<a href="mailto:martin@xn--drst-0ra.example"> works today.

Sure, the IRI version needs two bytes less than the punycode
version in this example.  OTOH it doesn't work with old user
agents.

[Found later in your draft: Okay, the body= UTF-8 stuff is
 really new, and that could be seen as a "severe limitation"
 today.  I certainly don't miss it, my MUA does not support
 the body= feature at all.]

> more straightforward and consistent internationalization.

Yes, in theory.
But not yet in practice, if I publish a mailto:-URL somewhere,
then I want it to work for almost all users today.  You explain
this later in chapter 6.

> hname = *urlc
> hvalue = *urlc

RfC 2368 and your draft use "urlc" without proper syntax or
explanation, please add something like this:

urlc = %d33-36 / %d38-60 / %d62 / %d64-126

RfC 3986 apparently says nothing about "<" and ">", is this as
you want it ?  Otherwise you get %d33-36 / %d38-59 / %d64-126.

Plain text examples:

mailto:no@body.example?subject=is%20<this>%20okay%3F
<mailto:no@body.example?subject=%3Cthat%3E%20is%20clear!>

BTW, please add a note about mailto-IRIs in documents, where
the document charset is not UTF-8.  If I got your draft right,
the idea is to use percent-encoded UTF-8 even if the document
charset is something else like Latin-1.

Example, in this article I use Latin-1, but a
<mailto:martin@dürst.example> is an invalid URL, and it's also
no IRI.  A <mailto:body@check.example?body=dürst> is also
invalid here, or isn't it ?

I'm not sure about these examples, there's no obvious technical
problem with this body=dürst parameter of a 2368-mailto-URL in
a Latin-1 article.

> URI producers should provide these domain names in the IDNA
> encoding, rather than percent-encoded, if they wish to
> maximize interoperability with legacy mailto: URI
> interpreters

Indeed, unfortunately you can't say SHOULD here.

> Percent-encoding in the LHS of an email address is reserved
> for potential future internationalization. Non-ASCII
> characters must first be encoded according to UTF-8 [STD63]

The first statement is only correct for Non-ASCII, there's no
general problem with percent-encoding in the LHS of addresses
in mailto URLs.  The "quoted string" case of a LHS can be very
weird.

> Within mailto URIs, the characters "?", "=", "&" are reserved.

Maybe add a forward reference to chapter 5 here about NO-WS-CTL
and WSP.  I don't find a general rule about this issue in 3986,
probably I'm missing something obvious (?).
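
To make the interoperability points above concrete, here is a minimal
Python sketch of how a producer could build a mailto URI that old user
agents can still read: the domain goes out in its IDNA form, and header
values are percent-encoded as UTF-8 no matter what charset the
surrounding document uses.  The helper name build_mailto and its
parameters are hypothetical, nothing like it appears in the draft or in
RfC 2368, and it encodes more characters than "urlc" strictly requires.

```python
from urllib.parse import quote


def build_mailto(local_part, domain, headers=None):
    """Hypothetical helper: build a mailto URI for legacy user agents.

    Assumption: `headers` is a list of (hname, hvalue) string pairs.
    """
    # IDNA (punycode) form of the domain, not percent-encoded UTF-8.
    # Python's built-in "idna" codec implements the IDNA 2003 rules.
    ascii_domain = domain.encode("idna").decode("ascii")

    # safe="" percent-encodes everything except letters, digits and
    # "_.-~", so "?", "=", "&", "<", ">" and "%" never appear raw.
    uri = "mailto:" + quote(local_part, safe="") + "@" + ascii_domain

    if headers:
        # quote() encodes non-ASCII characters as UTF-8 before
        # percent-encoding them (its default encoding).
        pairs = [
            quote(name, safe="") + "=" + quote(value, safe="")
            for name, value in headers
        ]
        uri += "?" + "&".join(pairs)
    return uri


if __name__ == "__main__":
    print(build_mailto("martin", "d\u00fcrst.example"))
    print(build_mailto("no", "body.example", [("body", "d\u00fcrst")]))
```

The first call yields the xn-- form of the address, the second one a
body=d%C3%BCrst parameter, so neither depends on the charset of the
document that carries the URI.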
> 1. MIME encoded words (as defined in [RFC2047]) are permitted
>    in header values, but not in an hvalue of a "body" hname.

That's clear.  You aren't planning to invent a mailto-IRI-body,
or are you ?  Oops, I found body=caf%C3%A9 later, now that's a
PITA, by using mailto-IRI-bodies the MUA is more or less forced
to generate a Content-Type: text/plain;charset=utf-8 with QP or
Base64.

If you really want this, please say so not only in an example.
This has side effects on systems, where the default local
charset is _not_ Unicode (any of the UTFs).

> MIME encoded words and UTF-8-based percent-encoding SHOULD not
> both be used in the same hvalue.

Maybe you need a MUST NOT here, and definitely a NOT.  Examples:

mailto:an@example?subject=%3D%3Fus-ascii%3FQ%3F1%3F%3D_2%3F%3D
mailto:an@example?subject=%3D%3Fus-ascii%3FQ%3FD%C3%BCrst3F%3D

Whatever that is, it's no Subject: 1?= 2 or Subject: dürst. (?)

> The creator of a mailto URI cannot expect the resolver of a URI
> to understand more than the "subject" and "body" headers.
[...]

Here's a place where you could explain, why clients should try
to support in-reply-to, and how "URI producers" should use it.

[in the examples:]
> ?In-Reply-To=%3C3469A91.D10AF4C@example.com%3E>

Here's the place, where you could say that this should be the
Message-ID of the mail in question.  One popular software gets
this wrong and apparently uses the last Message-ID found in the
References or in In-Reply-To to construct its mailto-URL.  That
confuses the threading of mail replies based on the mailto-URL.

> Another way of expressing the same thing:
> <mailto:?to=joe@example.com&cc=bob@example.com&body=hello>

Please delete this example, it's ugly.  You already have this
variant in the paragraph about "to" as "hname".

> Click <a
> href="mailto:?to=joe@xyz.com&cc=bob@xyz.com&body=hello">
> mailto:?to=joe@xyz.com&cc=bob@xyz.com&body=hello</a> to
> send a greeting message to Joe and Bob.

I'd use an.example instead of xyz.com here, and replace the "to":

  Click <a href="mailto:joe@an.example?cc=bob@an.example&body=hello">
  mailto:joe@an.example?cc=bob@an.example&body=hello</a> to
  send a greeting message to Joe with a copy to Bob.

> mailto:user@example.org?subject=%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D

Maybe replace user@example.org by an@example if your examples
are otherwise too long for RfC lines.

> The software sending the email is not restricted to UTF-8, but
> can use other encodings.

It's more or less forced to stick to UTF-8 or maybe another
UTF.  Otherwise it would have to analyze the mailto-IRI-body
assuming UTF-8 input.  That's a major difference from
traditional mailto-URLs.

> The security considerations of [STD66], [RFC3490], and also
> apply. [RFC3987]

s/apply. [RFC3987]/[RFC3987] apply./

IMHO "also apply" is not good enough.  Either add some of the
worst examples like say illegal UTF-8 encodings and phishing,
or urge the readers to really check out these "external"
sources.

Please add a note, that a plain text <URL:mailto:an@example>
MUST NOT use any percent encoded UTF-8, and is by definition a
"visible with any browser" URL, not an IRI.

                       Bye, Frank
Received on Monday, 14 February 2005 17:33:16 UTC