[Bug 15489] IDN email addresses should be converted to Punycode before validating them

From: <bugzilla@jessica.w3.org>
Date: Fri, 03 Feb 2012 06:44:39 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1RtCsx-00064L-Hh@jessica.w3.org>

Ian 'Hixie' Hickson <ian@hixie.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |ian@hixie.ch
         Resolution|                            |WORKSFORME

--- Comment #10 from Ian 'Hixie' Hickson <ian@hixie.ch> 2012-02-03 06:44:37 UTC ---
(In reply to comment #0)
> IDN email addresses should be converted to Punycode before validating
> them

Assuming you mean user input, that's what the spec says to do.

(In reply to comment #1)
> The spec currently says:
> > A valid e-mail address is a string that matches the ABNF production
> > 1*( atext / "." ) "@" ldh-str *( "." ldh-str ) where atext is defined
> > in RFC 5322 section 3.2.3, and ldh-str is defined in RFC 1034 section
> > 3.5. [ABNF] [RFC5322] [RFC1034]
> As of revision 6884 (http://html5.org/tools/web-apps-tracker?from=6883&to=6884)
> it even includes an example regular expression:
> > /^[a-zA-Z0-9.!#$%&'*+-/=?\^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/
> This makes IDN email addresses like `foo@mañana.com` invalid, even though its
> ASCII-encoded counterpart `foo@xn--maana-pta.com` validates.

Yes. Note that the regular expression is irrelevant here; it's not normative.
IDN e-mail addresses have always been invalid here. This shouldn't affect
users, since any IDN e-mail addresses they enter should get converted to ASCII
before being used as the new value (which is what is validated).
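
As a rough illustration of that order of operations (convert first, then
validate the result), here is a minimal Python sketch. It is not what any
browser actually runs. The helper names to_ascii_address and is_valid are
made up for this example, the pattern is only modeled on the non-normative
expression quoted above, and Python's built-in "idna" codec (which follows
the older IDNA 2003 rules) stands in for whatever conversion a user agent
performs.

```python
import re

# Modeled on the non-normative expression quoted above. "+" and "/" are
# listed as literals here instead of the "+-/" range in the quoted text.
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*"
)


def to_ascii_address(address):
    """Hypothetical helper: Punycode-encode the domain part of an address."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no "@" at all: leave it for the validator to reject
    try:
        # The stdlib "idna" codec encodes each dot-separated label.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return address  # not convertible: validation will fail on the raw text
    return local + "@" + ascii_domain


def is_valid(address):
    """Hypothetical helper: check the whole string against the ASCII pattern."""
    return EMAIL_RE.fullmatch(address) is not None
```

With this sketch, foo@mañana.com fails is_valid as typed, but passes once
to_ascii_address has turned it into foo@xn--maana-pta.com.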

> It’s probably not a good idea to force users to enter their IDN email addresses
> in Punycode format.

Agreed. The spec doesn't ask them to.

> How about defining that UAs should convert any IDN email
> address input to its Punycoded ASCII equivalent before validating email
> addresses (by applying this regex, for example)?

That's already what the spec suggests browsers do.

(In reply to comment #9)
> > [08:08] <Hixie> what's the use case? the value in the database would be punycoded
> > [08:09] <Hixie> since that's all the client will ever send to the server
> Why is that?

At the wire level, e-mails are sent using punycoded addresses. IDN addresses
are only a rendering-level thing.
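
To make the rendering-level point concrete, a small Python sketch of the
reverse direction: the stored value stays punycoded, and the Unicode form is
produced only for display. The helper name to_display_address is
hypothetical, and Python's built-in "idna" codec (IDNA 2003) is assumed as
the decoder.

```python
def to_display_address(address):
    """Hypothetical helper: decode a punycoded domain for display only."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # nothing that looks like a domain part
    try:
        # Stored addresses are ASCII; the "idna" codec decodes xn-- labels.
        unicode_domain = domain.encode("ascii").decode("idna")
    except UnicodeError:
        return address  # not decodable: show the stored form unchanged
    return local + "@" + unicode_domain
```

Under these assumptions, foo@xn--maana-pta.com is displayed as
foo@mañana.com, while the stored value remains the ASCII string.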

> Because IDN email addresses are considered to be invalid?

I'm not sure what this means. Invalid by whom, in what context?

Received on Friday, 3 February 2012 06:44:49 UTC