Re: "International" email addresses [I18N-ACTION-374]

On 2014/11/21 04:36, Shawn Steele wrote:

[Anne van Kesteren wrote]
>> Just to be clear, note that HTML does support them as *input* (the UI done by the UA), it's just that it expects that to be translated to ASCII. This is not so different from how we deal with this situation when it comes to URLs.
>
> ASCII isn't supported for EAI, and it's typically preferred to keep Domain Names in Unicode except for that pesky resolving step, where they must be punycoded.

"ASCII isn't supported for EAI" is extremely terse. What it means is 
that there is no ACE (ASCII-compatible encoding) for the left-hand side 
(LHS; the part before the @) of an EAI email address.

So for a purely hypothetical address like café@café.example.com, it's 
possible to change the domain name part to xn--caf-dma.example.com 
(using punycode), but there is no equivalent ASCII form for the 
left-hand side.
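
To make the asymmetry concrete, here is a rough sketch in Python. It 
assumes Python's built-in "idna" codec (which implements the older 
IDNA 2003 rules) is good enough for illustration; the variable names 
are mine and purely hypothetical.

```python
# Illustration only: the built-in "idna" codec follows IDNA 2003.
# "\u00e9" is the precomposed e-acute, written as an escape so the
# example does not depend on how this message gets normalized.
domain = "caf\u00e9.example.com"
local_part = "caf\u00e9"

# The domain has a defined ASCII-compatible encoding (ACE).
ace_domain = domain.encode("idna").decode("ascii")
print(ace_domain)  # xn--caf-dma.example.com

# The conversion is reversible, so nothing is lost.
round_tripped = ace_domain.encode("ascii").decode("idna")

# For the left-hand side there is no such encoding to call. All we
# can do is observe that it is not ASCII.
local_is_ascii = local_part.isascii()
print(local_is_ascii)  # False
```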

In a mailto: URI, the above becomes
mailto:caf%C3%A9@caf%C3%A9.example.com,
but it's impossible to use
caf%C3%A9@caf%C3%A9.example.com as the plain email address, because '%' 
is itself a legal character in the left-hand side, so that string could 
just as well be a different, all-ASCII email address.
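
As a small sketch of that difference (Python again; using 
urllib.parse.quote with '@' kept safe is my own shortcut for building 
the mailto: URI, not something any spec prescribes):

```python
from urllib.parse import quote, unquote

address = "caf\u00e9@caf\u00e9.example.com"

# Inside a URI, percent-encoding the UTF-8 bytes is well defined.
mailto = "mailto:" + quote(address, safe="@")
print(mailto)  # mailto:caf%C3%A9@caf%C3%A9.example.com

# A URI consumer knows to percent-decode, and gets the address back.
decoded = unquote(mailto[len("mailto:"):])

# Outside a URI nobody decodes: this is just a string whose
# left-hand side happens to contain '%', 'C', '3', 'A' and '9'.
plain = "caf%C3%A9@caf%C3%A9.example.com"
plain_local = plain.rsplit("@", 1)[0]
same_address = (plain == address)
print(plain_local, same_address)  # caf%C3%A9 False
```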

Also, while HTTP is, as per spec, limited to ASCII-only URIs, it's 
impossible to send anything to café@café.example.com without the EAI 
extension to SMTP, and with that extension, everything including the 
addressee's address is in raw UTF-8.
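
To illustrate what "raw UTF-8" means here, a minimal sketch in Python. 
The command line below is only assembled as bytes for inspection, 
nothing is sent, and the variable names are hypothetical.

```python
address = "caf\u00e9@caf\u00e9.example.com"

# With the EAI extension, the address travels as UTF-8 bytes.
wire = ("RCPT TO:<" + address + ">\r\n").encode("utf-8")
print(wire)

# Without it, the envelope is ASCII-only, and this address simply
# has no ASCII representation.
try:
    address.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```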

Given that the Web is mostly UTF-8 these days and is asymptotically 
approaching a UTF-8-only target state, and that EAI addresses are 
handled as plain UTF-8 throughout the EAI email infrastructure, trying 
to interpose a nonexistent ASCII-only form between these two systems is 
a non-starter.

For cafe@café.example.com (LHS ASCII only), it may make sense to 
downgrade to cafe@xn--caf-dma.example.com, because then neither the 
logic and database behind the Web form nor the email infrastructure that 
the backend uses to send mail to that address needs any changes.
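
A sketch of that downgrade rule in Python, assuming the built-in 
"idna" codec (IDNA 2003) for the domain part. The function name 
downgrade_if_possible is hypothetical, and the splitting at the last 
'@' is a simplification that does no real address validation.

```python
def downgrade_if_possible(address):
    """Return an all-ASCII form of the address, or None if the
    left-hand side is not ASCII and so cannot be downgraded."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not of the form local@domain: %r" % address)
    if not local.isascii():
        # No ACE exists for the left-hand side: keep it as Unicode.
        return None
    # Only the domain is converted to its ASCII-compatible encoding.
    return local + "@" + domain.encode("idna").decode("ascii")


print(downgrade_if_possible("cafe@caf\u00e9.example.com"))
# cafe@xn--caf-dma.example.com
print(downgrade_if_possible("caf\u00e9@caf\u00e9.example.com"))
# None
```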

But for café@café.example.com, the situation is quite different. The 
backend has to make sure its mail sending infrastructure is updated to 
EAI. For that, it will also have to upgrade/update its backend logic and 
make sure the database takes and keeps the Unicode mail address (maybe 
in UTF-16 if not in UTF-8). So it is clear that it wants the address as 
UTF-8 rather than something else.

So "HTML does support them as *input*" is clearly not good enough, and 
is in fact counterproductive, for true EAI email addresses. Please fix, 
thanks!

Regards,   Martin.

Received on Tuesday, 25 November 2014 08:30:10 UTC