- From: <bugzilla@jessica.w3.org>
- Date: Mon, 14 May 2012 23:26:59 +0000
- To: public-i18n-core@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15489

Norbert Lindenberg <w3-bugs@norbertlindenberg.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |public-i18n-core@w3.org

--- Comment #15 from Norbert Lindenberg <w3-bugs@norbertlindenberg.com> 2012-05-14 23:26:58 UTC ---
I don't agree with the statement "IDN is only a rendering-level/UI-level feature", and think that internationalized domain names should be allowed in email addresses in the value attribute of <input> elements.

IDNA (its full name, with the "A" standing for "applications") was designed to enable the use of full Unicode in domain names within applications, while providing a mapping to an ASCII form for use with older protocols that aren't IDNA-aware (e.g., DNS and SMTP).

Applications generally benefit from using the plain Unicode form of strings wherever possible. Older protocols and file formats require a variety of ASCII-based transformations of Unicode - e.g., the string "中国" might show up as "xn--fiqs8s", "%E4%B8%AD%E5%9B%BD", "\u4E2D\u56FD", "中国". Keeping these around and storing them in databases tends to cause problems - searching and sorting don't work properly because comparison functions don't know that "xn--fiqs8s" and "%E4%B8%AD%E5%9B%BD" mean the same, and duplicate or missing decoding later on can lead to mojibake. To maintain sanity, applications are better off converting text to plain Unicode when they receive it, and converting it to the appropriate ASCII-based transformations only when passing it on to a service that doesn't support Unicode (such as addresses for SMTP).

The question here then is whether the email address in the value attribute of the <input> element with type=email should be part of the Unicode-aware application world, or part of the dumb ASCII-only protocol world.
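As a rough illustration of the comparison problem described above, the following Python sketch decodes three of the quoted ASCII forms back to plain Unicode before comparing them. The helper name `to_unicode` and its prefix-sniffing heuristic are assumptions made for this example only, not part of any specification, and Python's built-in "idna" codec follows the older IDNA 2003 rules rather than RFC 5890.

```python
from urllib.parse import unquote


def to_unicode(text: str) -> str:
    """Best-effort decode of one ASCII-transformed string (hypothetical helper)."""
    if text.startswith("xn--"):
        # A-label -> U-label via the built-in codec (IDNA 2003 rules).
        return text.encode("ascii").decode("idna")
    if "%" in text:
        # Percent-encoded UTF-8, as found in URLs.
        return unquote(text)
    if "\\u" in text:
        # Backslash-u escapes, as found in source code or JSON.
        return text.encode("ascii").decode("unicode_escape")
    return text


variants = ["xn--fiqs8s", "%E4%B8%AD%E5%9B%BD", "\\u4E2D\\u56FD"]

# Stored as received, the strings do not compare equal...
raw_forms = set(variants)

# ...but after decoding on receipt they collapse to a single value.
decoded_forms = {to_unicode(v) for v in variants}
```

Here `raw_forms` holds three distinct strings while `decoded_forms` holds only "中国", which is the situation the searching and sorting remark refers to.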
In a similar situation, it's already been decided that the URLs in the href attribute of the <a> and <link> elements, as well as the src attributes of the <script> and <img> elements, can be IRIs and thus include internationalized domain name labels. I don't see why the same shouldn't be allowed for the value attribute of the <input> element with type=email.

As a consequence, user agents then *must* convert email addresses that contain IDN labels to the equivalent ASCII form before validating the addresses based on their ASCII form specification.

Note also that the usage of the word "punycode" in the spec is wrong - Punycode is just one function of several used in the conversion from a U-label to an A-label:
http://tools.ietf.org/html/rfc5890#section-2.3.4

-- 
Configure bugmail: https://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
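The convert-then-validate order proposed in the comment could look roughly like the Python sketch below. The names `to_ascii_address` and `is_valid_email` and the pattern `ASCII_EMAIL` are hypothetical, the pattern is a deliberately simplified stand-in for the spec's ASCII grammar, and the built-in "idna" codec (IDNA 2003) is used only as a convenient approximation of the U-label to A-label conversion, which involves more than the Punycode step alone.

```python
import re

# Simplified ASCII-only pattern (an assumption, not the HTML spec's grammar).
ASCII_EMAIL = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*$"
)


def to_ascii_address(address: str) -> str:
    """Convert the domain part of an address to its ASCII (A-label) form."""
    local, _, domain = address.rpartition("@")
    # The codec maps each non-ASCII label and leaves ASCII labels unchanged.
    return f"{local}@{domain.encode('idna').decode('ascii')}"


def is_valid_email(address: str) -> bool:
    """Validate against the ASCII pattern only after conversion."""
    try:
        return ASCII_EMAIL.fullmatch(to_ascii_address(address)) is not None
    except UnicodeError:
        # Raised by the codec for labels it cannot convert.
        return False
```

With this ordering, `is_valid_email("user@example.中国")` checks the string "user@example.xn--fiqs8s", so an address with an internationalized domain is not rejected merely for containing non-ASCII characters.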
Received on Monday, 14 May 2012 23:27:01 UTC