
Re: For review: An Introduction to Multilingual Web Addresses

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Wed, 28 Mar 2007 13:50:07 +0900
Message-Id: <6.0.0.20.2.20070328134731.056be280@localhost>
To: Najib Tounsi <ntounsi@emi.ac.ma>
Cc: Richard Ishida <ishida@w3.org>, www-international@w3.org

Hi Najib,

At 02:25 07/03/28, Najib Tounsi wrote:

>Martin Duerst wrote:

>> Does such a mixture include labels that contain both Latin and RTL
>> characters, or are the scripts separated by dots? In the former
>> case, this would be even more peculiar, because such labels
>> (mixing RTL and LTR characters) are illegal in IDN.
>>   
>It is one label that mixes RTL and LTR characters. It is invalid, of course.
>I've noticed that not all browsers report it as invalid.
>Here are the tests (http://www.w3c.org.ma/Tests/IDNs/Issue2.html)

Quite interesting.
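
To make the rule under discussion concrete, here is a minimal sketch of a check for labels that mix RTL and LTR characters. It assumes Python's unicodedata bidi classes are a fair stand-in for the RandALCat and LCat tables that IDN's nameprep step uses (those tables are tied to an older Unicode version, so edge cases may differ). The function name mixes_rtl_and_ltr is made up for this illustration.

```python
import unicodedata


def mixes_rtl_and_ltr(label: str) -> bool:
    """Return True if a single label has both RTL and LTR characters.

    Approximation: bidi classes "R" and "AL" stand in for RandALCat,
    and "L" stands in for LCat. Digits and punctuation are neither.
    """
    classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(classes & {"R", "AL"})
    has_ltr = "L" in classes
    return has_rtl and has_ltr


# A Latin label with a Hebrew letter appended is a mixed label.
print(mixes_rtl_and_ltr("abc\u05d0"))
# A purely Hebrew label is not mixed.
print(mixes_rtl_and_ltr("\u05d0\u05d1"))
```

The check is applied per label, so a name whose labels are each in a single script (separated by dots) would pass it.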

>In fact, a mail I sent earlier raised these issues:
>- Arabic and Hebrew IDNs are not displayed as Firefox claims. The .museum TLD is trusted (Firefox displays its IDNs in native script), yet Arabic and Hebrew IDNs are shown in punycode. (http://www.w3c.org.ma/Tests/IDNs/Issue1.html)

Yes, that's weird.

>- IDNs with %xx in links (http://www.w3c.org.ma/Tests/IDNs/Issue3.html)
>Browsers accept links (href attribute) with IDNs in native script and in punycode, but not in escaped notation.
>Is %xx encoding a valid notation for IDNs?

Yes. RFC 3986 explicitly allows %xx escapes in host names, but
requires that the escaped bytes be UTF-8. Not all browsers
implement this yet.
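
As a rough illustration of what accepting the escaped form involves, the sketch below decodes the %xx escapes in a host name as UTF-8 and then converts the result to punycode. It assumes Python's built-in "idna" codec (which follows the 2003 IDNA rules) is acceptable for the conversion; the function name host_to_ascii and the sample host are made up for this example.

```python
from urllib.parse import unquote


def host_to_ascii(host: str) -> str:
    """Turn a host name with %xx escapes into its ASCII (punycode) form.

    The escapes are interpreted as UTF-8 bytes; invalid UTF-8 raises
    UnicodeDecodeError instead of being silently replaced.
    """
    decoded = unquote(host, encoding="utf-8", errors="strict")
    # The "idna" codec applies nameprep and punycode label by label.
    return decoded.encode("idna").decode("ascii")


# %C3%BC is the UTF-8 encoding of U+00FC (u with diaeresis).
print(host_to_ascii("b%C3%BCcher.example"))  # xn--bcher-kva.example
```

A host that is already plain ASCII passes through unchanged, so the same function can be applied to any of the three notations mentioned above.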

Regards,     Martin.



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Wednesday, 28 March 2007 05:19:45 GMT
