- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Tue, 25 Nov 2014 17:29:38 +0900
- To: Anne van Kesteren <annevk@annevk.nl>
- CC: Shawn Steele <Shawn.Steele@microsoft.com>, "Phillips, Addison" <addison@lab126.com>, Steven Pemberton <Steven.Pemberton@cwi.nl>, "www-international@w3.org" <www-international@w3.org>, Forms WG <public-forms@w3.org>
On 2014/11/21 04:36, Shawn Steele wrote:

[Anne van Kesteren wrote]
>> Just to be clear, note that HTML does support them as *input* (the UI done by the UA), it's just that it expects that to be translated to ASCII. This is not so different from how we deal with this situation when it comes to URLs.
>
> ASCII isn't supported for EAI, and it's typically preferred to keep Domain Names in Unicode except for that pesky resolving step, where they must be punycoded.

"ASCII isn't supported for EAI" is extremely short. What it means is that there is no ACE (ASCII-compatible encoding) for the left-hand side (LHS; the part before the @) of an EAI email address.

So for a purely hypothetical case of an address like café@café.example.com, it's possible to change the domain name part to xn--caf-dma.example.com (using punycode), but there is no such thing for the left-hand side of an internationalized email address. In a mailto: URI, the above becomes mailto:caf%C3%A9@caf%C3%A9.example.com, but it's impossible to use caf%C3%A9@caf%C3%A9.example.com as the plain email address, because the '%' could be part of the left-hand side of another email address.

Also, while HTTP is, as per spec, limited to ASCII-only URIs, it's impossible to send anything to café@café.example.com without the EAI extension to SMTP, and with that extension, everything including the addressee's address is in raw UTF-8.

Given that the Web is mostly UTF-8 these days and is asymptotically approximating a UTF-8-only target state, and that EAI addresses are handled as plain UTF-8 throughout the EAI email infrastructure, trying to interpose a non-existent ASCII-only form between these two systems is a non-starter.

For cafe@café.example.com (LHS ASCII only), it may make sense to downgrade to cafe@xn--caf-dma.example.com, because then neither the logic and database behind the Web form nor the email infrastructure (when the backend sends a mail to that address) needs any changes.

But for café@café.example.com, the situation is quite different.
The backend has to make sure its mail sending infrastructure is updated to EAI. For that, it will also have to upgrade/update its backend logic and make sure the database takes and keeps the Unicode mail address (maybe in UTF-16 if not in UTF-8). So it is clear that it wants the address as UTF-8 rather than something else.

So "HTML does support them as *input*" is clearly not good enough and counterproductive for true EAI email addresses. Please fix, thanks!

Regards, Martin.
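
The asymmetry between the two cases above can be illustrated with a rough Python sketch. This is only an illustration under stated assumptions, not something taken from any spec: the function names `downgrade_domain` and `mailto_uri` are made up here, and the sketch assumes that Python's built-in `idna` codec (which follows the older IDNA 2003 rules) is adequate for the example domain. A real backend would likely use a current IDNA library.

```python
from typing import Optional
from urllib.parse import quote


def downgrade_domain(address: str) -> Optional[str]:
    """Return the address with its domain in ACE (punycode) form.

    Returns None when the left-hand side is not pure ASCII, because no
    ASCII-compatible encoding exists for that part of an EAI address.
    (Hypothetical helper, for illustration only.)
    """
    # Split on the last '@' so the domain is everything after it.
    local, _, domain = address.rpartition("@")
    if not local.isascii():
        return None
    # Built-in codec; converts each non-ASCII label to its xn-- form.
    ace_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ace_domain}"


def mailto_uri(address: str) -> str:
    """Percent-encode the UTF-8 bytes of an address for a mailto: URI.

    The result is only valid inside a URI; it is not a usable plain
    email address. (Hypothetical helper, for illustration only.)
    """
    return "mailto:" + quote(address, safe="@")


print(downgrade_domain("cafe@caf\u00e9.example.com"))
# cafe@xn--caf-dma.example.com
print(downgrade_domain("caf\u00e9@caf\u00e9.example.com"))
# None
print(mailto_uri("caf\u00e9@caf\u00e9.example.com"))
# mailto:caf%C3%A9@caf%C3%A9.example.com
```

The `None` result in the second call is the whole point: once the left-hand side is non-ASCII, there is nothing to downgrade to, so the address has to travel as Unicode end to end.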
Received on Tuesday, 25 November 2014 08:30:10 UTC