RE: §2.1.3 IRI/URI Canonicalization does not address IRIs with IDNs

Hi Felix and Eric,

A quick question, and please forgive my ignorance: it seems possible that


...could be completely separate domains, i.e., one person in the US buys , one in Denmark buys http://www.exå . If that situation can arise, then how can we be sure that is in fact http://www.exå ? I am assuming that a domain reseller is simply selling a domain name, which consists of a string (less any reserved characters), and hence it would be possible to buy a punycoded version of an IRI.

Many thanks

-----Original Message-----
From: [] On Behalf Of Felix Sasaki
Sent: 10 April 2008 07:01
To: Phil Archer
Cc: Eric Prud'hommeaux;;
Subject: Re: §2.1.3 IRI/URI Canonicalization does not address IRIs with IDNs

Hi Phil,

I was looking into this section in your attachment:

[ Internationalized Domain Names
    * Internationalized Domain Names (IDNs) should be converted from Punycode [RFC3492] into their UTF-8 string representations. So that, for

If you have
It is not possible to decide whether it should become http://www.exå or http://www.exåmpleß.org/ since "ss" in the Punycode string could have been originally "ss" or "ß".
So I think this canonicalization step is not feasible. I'm also not sure if it is necessary: If you get you could process it in Powder just "as is", without trying to go to the representation with non-ASCII characters. The same for http://www.exå . But maybe I am missing something?
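
To illustrate the one-way mapping, here is a minimal sketch using Python's built-in "idna" codec, which to my knowledge implements the IDNA2003 rules (RFC 3490 with Nameprep). The host names and the helper names to_ascii and to_unicode are made up for this illustration and are not taken from the POWDER draft.

```python
# Sketch only: relies on Python's standard "idna" codec (IDNA2003 behaviour).
# The host names below are illustrative assumptions, not real registrations.

def to_ascii(host: str) -> bytes:
    """Convert an internationalized host name to its ASCII (Punycode) form."""
    return host.encode("idna")

def to_unicode(ascii_host: bytes) -> str:
    """Convert an ASCII (Punycode) host name back to a Unicode string."""
    return ascii_host.decode("idna")

with_eszett = to_ascii("www.exåmpleß.org")
with_ss = to_ascii("www.exåmpless.org")

# Nameprep maps "ß" to "ss" before Punycode encoding,
# so both spellings end up as the same ASCII form.
same_ascii_form = with_eszett == with_ss

# Decoding can only ever yield the "ss" spelling; the "ß" is not recoverable.
round_tripped = to_unicode(with_eszett)

print(with_eszett)
print(same_ascii_form)
print(round_tripped)
```

If the codec behaves as described, the round trip never returns the "ß" form, which is why converting from Punycode back to a Unicode string cannot serve as a reliable canonicalization step for such names.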

Just let me know what you think. Note that the problem of the unidirectional relation between "ß" and "ss" is a problem of IDNs which will soon be addressed by a proposed IETF Working Group, see


Received on Thursday, 10 April 2008 07:46:19 UTC