Personally, I don’t much see the point. If it resolves, it’s valid. If it doesn’t, then most apps couldn’t care less whether it’s well formed.
There are a very few applications that actually assign these things, and those certainly need to be able to understand the rules of their domains, but I’m not sure that’s a general problem.
-Shawn
From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
Sent: Thursday, November 20, 2014 9:05 AM
To: Phillips, Addison; Steven Pemberton
Cc: www-international@w3.org; Forms WG
Subject: Re: "International" email addresses [I18N-ACTION-374]
On 11/20/2014 8:37 AM, Phillips, Addison wrote:
As Shawn and John note, a regex description of IDNA is probably impossible. At best, such a regex would be an approximation.
The problem is not that it's impossible to do a rigorous description (essentially a regex) of the IDN rules for a given zone, but that the description varies along the tree, and that the knowledge of the rules that apply at each level is imperfect.
As I mentioned, there's an effort underway to define an XML format that allows one to capture any known descriptions in (essentially) a regex-like format expressed in XML that can be parsed and evaluated by a common engine.
If/when IANA's registry gets converted to this format, you should be able to do IDN validation, down to the second level at least, to any desired level of accuracy by querying the correct tables (or be able to build approximate regexes with known degrees of accuracy, because you could then test them against any published full specifications).
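As a rough illustration of "querying the correct tables", the sketch below walks a domain name from the top of the tree downward and checks each label against the table of its parent zone, reporting "unknown" as soon as no table is available. Everything in it is an assumption made for illustration: the `ZONE_TABLES` contents, the zone names, and the function names are hypothetical, and a plain list of code point ranges stands in for the much richer XML format in the draft. It also ignores normalization, Punycode, and contextual rules.

```python
# Hypothetical per-zone repertoires: zone -> inclusive code point ranges.
# Placeholder data for illustration only, not real registry policy.
ZONE_TABLES = {
    "example": [(0x61, 0x7A), (0x30, 0x39), (0x2D, 0x2D)],  # a-z, 0-9, hyphen
    "beispiel": [
        (0x61, 0x7A), (0x30, 0x39), (0x2D, 0x2D),
        (0xE4, 0xE4), (0xF6, 0xF6), (0xFC, 0xFC), (0xDF, 0xDF),  # ä ö ü ß
    ],
}


def label_allowed(label, ranges):
    """True if the label is non-empty and every character is in some range."""
    return bool(label) and all(
        any(lo <= ord(ch) <= hi for lo, hi in ranges) for ch in label
    )


def check_domain(domain):
    """Check each label against the table of its parent zone.

    Returns a list of (label, parent_zone, result) tuples, ordered from the
    second level downward. result is True or False when a table is known for
    the parent zone, and None when no table is known (rules unknown).
    """
    labels = domain.lower().split(".")
    results = []
    # labels[i] is registered under the zone formed by labels[i + 1:].
    for i in range(len(labels) - 2, -1, -1):
        zone = ".".join(labels[i + 1:])
        ranges = ZONE_TABLES.get(zone)
        if ranges is None:
            results.append((labels[i], zone, None))
        else:
            results.append((labels[i], zone, label_allowed(labels[i], ranges)))
    return results


if __name__ == "__main__":
    for name in ("müller.beispiel", "müller.example", "www.müller.beispiel"):
        print(name, check_domain(name))
```

The three-valued result mirrors the point made above: the rules differ per zone, and below the levels for which tables are published the honest answer is "don't know" rather than "invalid".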
Anyway, you can find a draft here: https://datatracker.ietf.org/doc/draft-davies-idntables/
A./