- From: Smith, Kevin, VF-Group <Kevin.Smith@vodafone.com>
- Date: Thu, 10 Apr 2008 09:44:36 +0200
- To: "Felix Sasaki" <fsasaki@w3.org>, "Phil Archer" <parcher@icra.org>
- Cc: "Eric Prud'hommeaux" <eric@w3.org>, <public-powderwg@w3.org>, <public-i18n-core@w3.org>
Hi Felix and Eric,

A quick question, and please forgive my ignorance: it seems possible that http://www.xn--exmple-jua.org/ and http://www.exåmple.org/ could be completely separate domains, i.e., one person in the US buys http://www.xn--exmple-jua.org/ and one in Denmark buys http://www.exåmple.org/. If that situation can arise, how can we be sure that http://www.xn--exmple-jua.org/ is in fact http://www.exåmple.org/?

I am assuming that, to a domain reseller, a domain name is simply a string (less any reserved characters), and hence it would be possible to buy a punycoded version of an IRI.

Many thanks
Kevin

-----Original Message-----
From: public-powderwg-request@w3.org [mailto:public-powderwg-request@w3.org] On Behalf Of Felix Sasaki
Sent: 10 April 2008 07:01
To: Phil Archer
Cc: Eric Prud'hommeaux; public-powderwg@w3.org; public-i18n-core@w3.org
Subject: Re: §2.1.3 IRI/URI Canonicalization does not address IRIs with IDNs

Hi Phil,

I was looking into this section in your attachment:

[
2.1.3.4 Internationalized Domain Names

* Internationalized Domain Names (IDNs) should be converted from Punycode [RFC3492] into their UTF-8 string representations. So that, for example: http://www.xn--exmple-jua.org/ becomes http://www.exåmple.org/.
]

If you have http://www.xn--exmpless-jua.org/ it is not possible to decide whether it should become http://www.exåmpless.org/ or http://www.exåmpleß.org/, since "ss" in the Punycode string could originally have been "ss" or "ß". So I think this canonicalization step is not feasible.

I'm also not sure it is necessary: if you get http://www.xn--exmpless-jua.org/ you could process it in POWDER just "as is", without trying to go to the representation with non-ASCII characters. The same goes for http://www.exåmpless.org/.

But maybe I am missing something? Just let me know what you think.
Note that the one-way mapping from "ß" to "ss" is a known problem with IDNs, which is expected to be addressed soon by a proposed IETF Working Group; see http://www.alvestrand.no/pipermail/idna-update/2008-March/001343.html

Felix
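
For readers who want to try the conversions discussed in this thread, here is a minimal sketch in Python. It assumes Python's built-in "idna" codec, which follows the IDNA 2003 rules (RFC 3490 with Nameprep); the helper names to_ascii and to_unicode are made up for this illustration and are not part of POWDER or any specification. It shows that the Unicode and Punycode spellings of the example host map to one another, and that "ß" is folded to "ss" before encoding, so the original spelling cannot be recovered from the ASCII form.

```python
# Sketch only: relies on Python's built-in "idna" codec (IDNA 2003).
# The function names below are hypothetical helpers for this example.

def to_ascii(host: str) -> str:
    """Return the ASCII-compatible (Punycode, "xn--") form of a host name."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Return the Unicode form of an ASCII-compatible host name."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    # The two spellings from the thread are two forms of the same name.
    print(to_ascii("www.exåmple.org"))
    print(to_unicode("www.xn--exmple-jua.org"))

    # Under IDNA 2003, Nameprep maps "ß" to "ss" before Punycode encoding,
    # so both spellings yield the same ASCII form.
    sharp_s = to_ascii("www.exåmpleß.org")
    double_s = to_ascii("www.exåmpless.org")
    print(sharp_s == double_s)

    # Decoding therefore can only ever return the "ss" spelling.
    print(to_unicode(sharp_s))
```

Because the mapping is lossy in one direction, comparing host names in their ASCII form avoids having to guess the original spelling, which is in line with the "process it as is" suggestion above.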
Received on Thursday, 10 April 2008 07:46:19 UTC