Re: [EAI] [Fwd: AD review of draft-duerst-mailto-bis-06.txt] from Martin J. Dürst on 2009-10-27 (public-iri@w3.org from October 2009)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Tue, 27 Oct 2009 12:56:33 +0900
To: Ted Hardie <ted.ietf@gmail.com>
CC: Shawn Steele <Shawn.Steele@microsoft.com>, "ima@ietf.org" <ima@ietf.org>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <4AE66F71.1060106@it.aoyama.ac.jp>
[copying public-iri@w3.org]

Hello Ted, others,

[everybody: There is a short summary at the end of this mail.]

On 2009/10/27 5:03, Ted Hardie wrote:
> On Mon, Oct 26, 2009 at 12:26 PM, Shawn Steele
> <Shawn.Steele@microsoft.com>  wrote:
>>   On my machine if I open "run", then type mailto:shäwn, then Outlook
>> opens up with shäwn in the To: line.  Same thing happens if I stick it
>> in an href in an HTML document.  I think I even tried it with a
>> different browser (sorry, don't remember which one, don't have others
>> installed at the moment).  Of course I couldn't actually send the
>> mail, but the "mailto" part worked.
>
> For "the 'mailto part" to be considered to have worked, the mail has
> to get end to end, at least in my opinion.

At first sight, this seems reasonable. But it's basically the same 
difference as that between an http: URI/IRI that is 
*syntactically* correct and one that *actually resolves* (i.e. does not 
produce a 404 or something similar).

> The problem here is fundamentally that some part of the system has to
> take what the user thinks is correct and turn into something that the
> mail system can deliver.  The more pieces we allow to contain "what
> the user thinks is correct" rather than "what the mail system can
> deliver", the further down into the system any translation between the
> two must occur.

There are two problems with mailto:shäwn:

1) There's no at-sign
2) In particular for LHS, shäwn only works under EAI

For your everyday email user agent, both of these mean that the mail 
won't be sent.


> Unless "shäwn" is a valid email address, showing that in a protocol
> slot (which mailto is) seems like the wrong trade-off to me.

First, the current draft clearly says that an at-sign is needed, by 
using <addr-spec> (http://tools.ietf.org/html/rfc5322#section-3.4.1).
So mailto:shäwn is not a valid IRI even according to an updated mailto: 
spec. So to conform to the spec, we may have to try with something like
mailto:shawn@shäwn.com or so.

But the question is where this should be checked. An important principle 
that many people often forget is that IRIs/URIs are just "carriers". 
Because there are many schemes, it is completely unrealistic to expect a 
generic IRI/URI handler to check these. So checking is up to the 
application that "resolves" the mailto: IRI/URI, which is the mail user 
agent.
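To illustrate the split of responsibilities described above, here is a minimal sketch in Python. It assumes that the generic layer is something like the standard library's urlsplit, and that the mail user agent adds its own scheme-specific check; mailto_to_is_plausible is a hypothetical helper, not anything from the draft, and it only looks for the at-sign rather than parsing a full <addr-spec>.

```python
from urllib.parse import urlsplit, unquote


def generic_parse(iri: str):
    """What a generic IRI/URI handler sees: a scheme and an opaque rest."""
    parts = urlsplit(iri)
    return parts.scheme, parts.path


def mailto_to_is_plausible(iri: str) -> bool:
    """Scheme-specific check a mail user agent might run (hypothetical).

    Only tests for the at-sign with something on both sides of it.
    This is not a full RFC 5322 <addr-spec> parser, and it ignores
    comma-separated address lists.
    """
    scheme, to_part = generic_parse(iri)
    if scheme != "mailto":
        return False
    local, at, domain = unquote(to_part).rpartition("@")
    return bool(local and at and domain)


# The generic layer carries both IRIs without complaint ...
print(generic_parse("mailto:shäwn"))
print(generic_parse("mailto:shawn@shäwn.com"))

# ... and only the mailto-aware layer notices the missing at-sign.
print(mailto_to_is_plausible("mailto:shäwn"))
print(mailto_to_is_plausible("mailto:shawn@shäwn.com"))
```

Note that the check says nothing about whether the address can actually be delivered to; it only looks at syntax.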

Now what does the average mail user agent do if you put "shäwn" into a 
To: field? I can only report about Thunderbird (Eudora version, 
3.0b1pre). It absolutely has no problem putting "shäwn" into a To: field 
(I have to admit that I actually tested with a Cc: field, but I don't 
think there would be any difference for a To: field). Thinking about it, 
that's quite understandable: it also shouldn't produce an error if I put 
"Dürst" into such a field, in particular if I continue input with " 
<duerst@it.aoyama.ac.jp>". The "Dürst" will result in an encoded word, 
but it would be weird to ask the user to input that encoded word, or to 
show that encoded word to the user.
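For readers who have not seen an encoded word, the following sketch shows what a mail user agent might put on the wire for the "Dürst" display name. It uses Python's standard email package purely as an illustration; it is an assumption that any particular mail user agent (Thunderbird included) produces the same encoding or charset.

```python
from email.header import decode_header
from email.utils import formataddr

# What the user types and sees.
display_name = "Dürst"
address = "duerst@it.aoyama.ac.jp"

# What ends up in the header: the non-ASCII name becomes an
# RFC 2047 encoded word, the address itself is left alone.
wire_form = formataddr((display_name, address), charset="utf-8")
print(wire_form)

# Decoding the encoded word gives back the name the user typed.
encoded_name = wire_form.split(" <", 1)[0]
raw_bytes, charset = decode_header(encoded_name)[0]
print(raw_bytes.decode(charset))
```

The encoded word is pure ASCII and not something a user should ever have to type or read.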

[The syntax proposed in the draft, as far as I understand, excludes 
something like "name <lhs@rhs>" anyway. This is different in RFC 2368, 
which I think allows this, but I was told that this wasn't actually 
supported well, and RFC 2368's predecessor (RFC 1738) also didn't allow 
it. (see http://tools.ietf.org/html/rfc1738#section-3.5)]

When putting "shäwn" into a Cc: field, what happened is that when I 
tried to send the mail, there was an error message that to me suggested 
that between the mail user agent and the server, the non-7bit byte 
caused a connection abort. Not a very helpful message for a general 
user, but not an issue for the mailto: spec. I have no idea whether 
other mail user agents do better here (e.g. checking and telling the 
user that the address isn't well-formed before they actually try to send 
something).

On top of all the above thoughts, if we want to claim:

 > For "the 'mailto part" to be considered to have worked, the mail has
 > to get end to end, at least in my opinion.

then things such as
mailto:nobody@nonexistent.example.com
would also not be valid mailto: IRIs/URIs. I hope you agree that it 
doesn't make sense to actually send an email just to figure out whether 
a mailto: URI/IRI is valid or not. So I don't think it makes sense to 
make the existence of a mail address a precondition for the validity 
of a mailto: URI.

Also, the draft does not contain any syntax restrictions for any of the 
other fields (body, Subject:, To:, Cc:, Bcc:,...). So according to the 
draft, there has to be an at-sign before the '?', but there is no check 
for an at-sign in the "foo" part of mailto:a@b.c?cc=foo.
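The asymmetry can be made concrete with a small sketch. The split_mailto helper below is hypothetical and deliberately simplistic (it splits on '?', '&' and '=' and undoes %-escapes); it is only meant to show where the draft's syntax requires an at-sign and where it does not.

```python
from urllib.parse import urlsplit, unquote


def split_mailto(iri: str):
    """Split a mailto: IRI into its 'to' part and its header fields.

    Hypothetical helper for illustration; header field names are
    lower-cased, values are %-decoded, nothing is validated.
    """
    parts = urlsplit(iri)
    hfields = {}
    for pair in parts.query.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        hfields.setdefault(unquote(name).lower(), []).append(unquote(value))
    return unquote(parts.path), hfields


to_part, hfields = split_mailto("mailto:a@b.c?cc=foo")

# The part before the '?' is where the draft's syntax demands an at-sign.
to_has_at = "@" in to_part

# The cc value is just "foo"; the draft's syntax does not object to it.
cc_has_at = all("@" in value for value in hfields.get("cc", []))

print(to_part, hfields, to_has_at, cc_has_at)
```

So a mail user agent that wants to reject "foo" as a Cc: address has to do so on its own initiative.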


So in summary:

- It's the responsibility of the resolver (in this case the mail user 
agent), not some generic IRI/URI software, to check for possible syntax 
problems in the IRI/URI.

- There's a whole series of different cases ranging from a fully 
workable example to a syntactically totally invalid example. Therefore, 
problems may be detected (or show up) sooner or later.

- IDNA-aware slots, or EAI, may be a different protocol from old-style 
SMTP without any extensions, but the former use (somewhat different) 
protocol elements nevertheless.

- If and where interoperability can be achieved with a protocol element 
that is closer to the user (i.e. an IDN, or an IRI,...), there is no 
reason to use a lower-level protocol element (e.g. xn--... or lots of 
%-escapes or a MIME encoded word, maybe with %-escapes on top of that).
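As an illustration of the last point, the sketch below contrasts the user-level forms with their lower-level equivalents. It relies on Python's built-in "idna" codec (which implements IDNA 2003) and on urllib.parse.quote; the domain shäwn.com is just the example used earlier in this mail, and nothing here is prescribed by the draft.

```python
from urllib.parse import quote, unquote

# User-level forms: what a person would type or read.
domain = "shäwn.com"
local_part = "shäwn"

# Lower-level forms: an ACE ("xn--...") domain and %-escaped UTF-8.
ace_domain = domain.encode("idna")
escaped_local = quote(local_part)

print(ace_domain)
print(escaped_local)

# Both conversions are reversible, so a slot that can carry the
# user-level form loses nothing by doing so.
print(ace_domain.decode("idna"))
print(unquote(escaped_local))
```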


> YMMV; offer not good in jurisdictions legislating the value of pi.

IMHO, offer not good even in jurisdictions that leave the value of pi to 
mathematicians :-)


Regards,   Martin.


-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Tuesday, 27 October 2009 03:57:23 UTC