On 11/20/2014 8:37 AM, Phillips, Addison wrote:
> As Shawn and John note, a regex description of IDNA is probably
> impossible. At best, such a regex would be an approximation.
>
>
The problem is not that a rigorous description (essentially a regex) of
the IDN rules for a given zone is impossible, but that the description
varies along the tree, and that knowledge of the rules that apply at
each level is imperfect.

As I mentioned, there's an effort underway to define an XML format that
captures any known descriptions in what is essentially a regex-like
notation, which a common engine can then parse and evaluate.

If/when IANA's registry gets converted to this format, you should be
able to do IDN validation, down to the second level at least, to any
desired level of accuracy by querying the correct tables (or to build
approximate regexes with known degrees of accuracy, because you could
then test them against any published full specifications).
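
To make the "varies along the tree" point concrete, here is a rough
sketch in Python. Everything in it is an assumption for illustration:
the ZONE_RULES table, its patterns, and check_domain are made up and are
not taken from the draft or from any registry. Real tables also carry
variant and contextual rules, which a plain per-label regex cannot
express.

```python
import re

# Hypothetical per-zone rules: zone name -> pattern that a label registered
# directly under that zone must match. "" stands for the root zone.
# These patterns are invented for illustration only.
ZONE_RULES = {
    "": re.compile(r"[a-z]{2,63}"),
    "example": re.compile(
        r"[a-z0-9\u00e0\u00e2\u00e7-\u00eb\u00ee\u00ef\u00f4\u00fb\u00fc-]{1,63}"
    ),
}


def check_domain(name, rules=ZONE_RULES):
    """Check each label against the rules of the zone it is registered in.

    Returns a list of (label, zone, verdict) tuples, walking from the root
    downwards. verdict is True or False when the zone's rules are known,
    and None when no rules are known for that zone.
    """
    labels = name.lower().rstrip(".").split(".")
    results = []
    zone = ""  # start at the root
    for label in reversed(labels):
        rule = rules.get(zone)
        verdict = None if rule is None else bool(rule.fullmatch(label))
        results.append((label, zone, verdict))
        # The label just checked becomes part of the next zone name.
        zone = label if zone == "" else f"{label}.{zone}"
    return results
```

For "www.caf\u00e9.example" the first two labels are checked against the
root and "example" rules, while "www" comes back as None because nothing
is known about the rules of the "caf\u00e9.example" zone. That is the
imperfect-knowledge case, and it is why a single regex for all of IDNA
can only be an approximation.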

Anyway, you can find a draft here:
https://datatracker.ietf.org/doc/draft-davies-idntables/
A./