RE: Universal Acceptance of IDN TLDs

From: Larry Masinter <masinter@adobe.com>
Date: Wed, 5 Mar 2014 14:39:50 +0000
To: Mark Davis ☕ <mark@macchiato.com>
CC: Andre Schappo <A.Schappo@lboro.ac.uk>, www International <www-international@w3.org>, Don Hollander <gm@aptld.org>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <6de289ecc7ae4828a5899f4dcb28d006@BL2PR02MB307.namprd02.prod.outlook.com>
The handling of %xx-encoded domain names in DNS servers would be a fallback for use in legacy systems that are not IDN-aware.

So the length-limit argument doesn’t carry much weight: it is strictly a transitional deployment enhancement for working around legacy components that extract domain names from URIs but can only process 7-bit URIs, not 8-bit IRIs.

You can deploy IDNs once all of the applications you care about work, for the users you care about, with the DNS names you want to use.

Components that handle IRIs directly and pull out domain names for further processing shouldn’t ever need the %xx encoding, although being able to decode it is still a good idea.

From: mark.edward.davis@gmail.com [mailto:mark.edward.davis@gmail.com] On Behalf Of Mark Davis ☕
Sent: Wednesday, March 05, 2014 2:16 PM
To: Larry Masinter
Cc: Andre Schappo; www International; Don Hollander; public-iri@w3.org
Subject: Re: Universal Acceptance of IDN TLDs

If you mean having the DNS system natively accept %xx for domain labels as well as Punycode, I suspect that that ship has long since sailed. (That was one of the options discussed, but was turned down because of the length limitations.)

If on the other hand, you mean that client software should accept %xx notation as well as straight Unicode and punycode, that is another story. That can be handled by a client-side mapping, permitted by either IDNA2008 or UTS46. (And I agree that it's a good idea.)

With that, I could type in my address bar any of:

  1.  xn--idna--x-l6c.blogspot.com<http://xn--idna--x-l6c.blogspot.com>
  2.  IDNA-ȿ-x.blogspot.com<http://xn--idna--x-l6c.blogspot.com>
  3.  IDNA-%C8%BF-x.blogspot.com<http://xn--idna--x-l6c.blogspot.com>
  4.  IDNA-Ȿ-X.BLOGSPOT.COM<http://xn--idna--x-lt7e.BLOGSPOT.COM>
And they'd all resolve to xn--idna--x-l6c.blogspot.com<http://xn--idna--x-l6c.blogspot.com>.

  1.  I just checked on Chrome, and all of these work.
  2.  Firefox is a bit odd: if I type in #3, it fails; *but* it converts it in the address bar, so a subsequent enter goes to the right place. #4/#5 just fail.
  3.  Don't know about other browsers.
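
[Editorial sketch] The client-side mapping described above can be illustrated roughly as follows. This is a minimal sketch, not the IDNA2008 or UTS46 algorithm: the function name normalize_host is hypothetical, percent-encoded bytes are assumed to be UTF-8, and plain lowercasing stands in for the full UTS46 mapping tables.

```python
from urllib.parse import unquote


def normalize_host(host: str) -> str:
    """Map a %xx, Unicode, or Punycode host name to its ASCII form.

    Hypothetical helper for illustration only: lowercasing is a
    simplification of the mapping step in UTS46.
    """
    # Undo percent-encoding; the escaped bytes are assumed to be UTF-8.
    decoded = unquote(host, encoding="utf-8", errors="strict")
    labels = []
    for label in decoded.split("."):
        label = label.lower()
        if label.isascii():
            # ASCII labels (including existing xn-- labels) pass through.
            labels.append(label)
        else:
            # Non-ASCII labels are Punycode-encoded and given the ACE prefix.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


for form in (
    "xn--idna--x-l6c.blogspot.com",
    "IDNA-\u023f-x.blogspot.com",
    "IDNA-%C8%BF-x.blogspot.com",
):
    print(normalize_host(form))  # each prints xn--idna--x-l6c.blogspot.com
```

A production client would use a full IDNA implementation for the mapping and validation steps rather than this shortcut.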


— Il meglio è l’inimico del bene —

On Wed, Mar 5, 2014 at 2:32 PM, Larry Masinter <masinter@adobe.com> wrote:
There’s a gap between IDN and URI, in that IRI -> URI conversion would generally prefer the %xx percent-hex URL encoding.

It would be preferable to ensure that DNS requests for %xx-encoded names are an acceptable alternative to punycode.

From: Andre Schappo [mailto:A.Schappo@lboro.ac.uk]
Sent: Tuesday, March 04, 2014 3:51 PM
To: www International
Cc: Don Hollander; public-iri@w3.org
Subject: Re: Universal Acceptance of IDN TLDs

① Is this document available online? I have looked round http://aptld.org but cannot find it.

② There are indeed barriers to the effective, real world use of IDNs. A fundamental problem is that IDNs, in general, are not properly catered for and not properly integrated into systems. One reason often quoted for treating IDNs differently is "Security". Well, I posit that any IDN security issues pale in comparison to the ubiquitous "… for further information please click here."

Here are some examples from Social Media:


If the Unicode form is entered —

#test  http://北大.中国

It is not recognised as a domain name and is not displayed as a clickable link.

If the punycode form is entered —

#test http://xn--djry4l.xn--fiqs8s

It is now recognised as a domain name and displayed as a clickable link, but it is shown as punycode instead of Unicode.

Sina Weibo

Same results —
#test# http://北大.中国

#test# http://xn--djry4l.xn--fiqs8s
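
[Editorial sketch] Showing the Unicode form of a punycode host, which the examples above say these services do not do, might look roughly like this. The function name to_display is hypothetical, and the sketch skips the validation and spoofing checks a real client would apply before displaying a decoded name.

```python
def to_display(host: str) -> str:
    """Convert xn-- labels in a host name back to Unicode for display.

    Hypothetical helper for illustration only: no IDNA validation
    or confusable-character checks are performed.
    """
    labels = []
    for label in host.split("."):
        lowered = label.lower()
        if lowered.startswith("xn--"):
            # Strip the ACE prefix and Punycode-decode the remainder.
            labels.append(lowered[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)


print(to_display("xn--djry4l.xn--fiqs8s"))  # prints 北大.中国
```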

There is also the related issue of having to Percent Encode the Unicode pathname components of a URL.
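
[Editorial sketch] For readers unfamiliar with the percent-encoding step, a minimal illustration follows. The path is a made-up example, UTF-8 is assumed as the byte encoding, and encode_path is a hypothetical name.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode the non-ASCII characters of a URL path as UTF-8."""
    # "/" is kept as-is so the path structure survives.
    return quote(path, safe="/")


encoded = encode_path("/大学/index.html")
print(encoded)           # prints /%E5%A4%A7%E5%AD%A6/index.html
print(unquote(encoded))  # prints /大学/index.html
```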

③ In my experience, another fundamental problem is the lack of IT Internationalization teaching in Schools and Universities. Certainly in England, IT Internationalisation has not yet become an accepted part of the curriculum. We need to produce students that have an appreciation/understanding of IT Internationalisation in order to, amongst other goals, properly integrate IDNs into systems/apps/websites …etc…

For several years I have been teaching a module entitled "International Computing" which covers several aspects of IT i18n. One of the topics I cover is IDNs :) And I am keeping my students up to date with the IDN new gTLDs as they are delegated to the DNS root :)

During my years teaching this module I have found few students (regardless of which country they come from) with even a basic appreciation of IT Internationalization because it is a topic that was never discussed/raised in their prior studies.

So, any initiative "to improve the use of IDN TLDs in the real world" should get Universities on board and encourage Universities/Schools to teach "IT Internationalization".


On 4 Mar 2014, at 12:14, Richard Ishida wrote:

I was contacted last week by Don Hollander, General Manager of the Asia Pacific Top Level Domain Association, who is trying to improve the use of IDN TLDs in the real world, and looking for support.

See the attached PDF (from him) outlining what are the barriers to the effective use of IDN TLDs and who can help address these issues.

He's hoping to create a community of interested stakeholders. He expects this community to include ICANN, many ccTLDs, ISOC, and hopefully commercial developers. He is also looking to set up some opportunities to meet and discuss how to move things forward.

If you are interested in getting involved, please raise your voice.

Don says "There is a HUGE population with interest in this - but it is not really the current 2Billion, but the next 2 Billion - those who aren’t yet connected."

<Addressing the issue of Universal Acceptance of IDN TLDs-1.pdf>

Received on Wednesday, 5 March 2014 14:40:25 UTC