Re: For review: An Introduction to Multilingual Web Addresses

Hi Najib,

At 02:25 07/03/28, Najib Tounsi wrote:

>Martin Duerst wrote:

>> Does such a mixture include labels that contain both Latin and RTL
>> characters, or are the scripts separated by dots? In the former
>> case, this would be even more peculiar, because such labels
>> (mixing RTL and LTR characters) are illegal in IDN.
>>   
>It is one label that is a mixture of RTL and LTR chars. It is invalid, of course.
>I've noted that not all browsers report it as invalid.
>Here are the tests (http://www.w3c.org.ma/Tests/IDNs/Issue2.html)

Quite interesting.
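
For what it's worth, the check a browser would need here is small.
Below is a rough sketch in Python, assuming the stringprep bidi rule
(RFC 3454, section 6) that a label containing right-to-left characters
must not also contain left-to-right ones. The function name
mixes_directions is made up for illustration, and the sketch does not
cover the rest of the rule (first and last character of an RTL label).

```python
import unicodedata

def mixes_directions(label: str) -> bool:
    """Return True if a label contains both RTL and LTR characters.

    Hypothetical helper: it only tests the "no mixing" part of the
    stringprep bidi rule, using the Unicode bidirectional classes
    R/AL (right-to-left) and L (left-to-right).
    """
    classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(classes & {"R", "AL"})
    has_ltr = "L" in classes
    return has_rtl and has_ltr

# A Latin label with one Hebrew letter appended is a mixed label.
print(mixes_directions("abc\u05d0"))
```

Digits and hyphens have other bidi classes, so they do not count as
LTR characters in this sketch.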

>In fact, a mail I sent before raised these questions:
>- Arabic & Hebrew IDNs that are not displayed as claimed by Firefox. The .museum TLD is trusted (Firefox displays its IDNs in native script), but Arabic & Hebrew IDNs are shown in punycode. (http://www.w3c.org.ma/Tests/IDNs/Issue1.html)

Yes, that's weird.

>- IDNs with %xx in links (http://www.w3c.org.ma/Tests/IDNs/Issue3.html)
>Browsers accept links (href attribute) with IDNs in native script and in punycode, but not in escaped notation.
>Is %xx encoding a valid notation for IDNs?

Yes. RFC 3986 explicitly allows percent-encoding in host names, and
requires that the escaped octets be UTF-8. However, not all browsers
implement this yet.
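
To make the escaped form concrete, here is a minimal sketch in Python
of how a %xx-escaped host name could be resolved: undo the escapes as
UTF-8, then convert the result to its ASCII (punycode) form. The
function name host_to_ascii and the host b%C3%BCcher.example are
invented for the example, and Python's built-in "idna" codec follows
the IDNA 2003 rules; this is not how any particular browser does it.

```python
from urllib.parse import unquote

def host_to_ascii(escaped_host: str) -> str:
    """Turn a %xx-escaped host name into its ASCII (punycode) form.

    Hypothetical helper: the escaped octets are interpreted as UTF-8,
    as RFC 3986 requires, and invalid UTF-8 raises an error.
    """
    unicode_host = unquote(escaped_host, encoding="utf-8", errors="strict")
    # The "idna" codec converts each dot-separated label separately.
    return unicode_host.encode("idna").decode("ascii")

print(host_to_ascii("b%C3%BCcher.example"))  # xn--bcher-kva.example
```

A host name that is already plain ASCII passes through unchanged.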

Regards,     Martin.



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Wednesday, 28 March 2007 05:19:45 UTC