Re: IDN problem.... :(

From: Douglas Davidson <ddavidso@apple.com>
Date: Mon, 14 Feb 2005 10:32:03 -0800
Message-ID: <9d7e5b9ad19262c37aa741fc8cd51ff0@dumuzi.apple.com>
To: Frank Yung-Fong Tang <ytang0648@aol.com>
Cc: www-international@w3.org, Unicode Mailing List <unicode@unicode.org>

On 2005-02-14 09:52:17 -0800 Frank Yung-Fong Tang <ytang0648@aol.com> wrote:

> Isn't it true that the IDN security issue we are now experiencing also
> applies to any other identity? (Like an IM ID? Can someone use "Bill G" +
> Greek a + "tes" in some IM communication to pretend he/she is the head
> of MS?) Shouldn't this be an identity security issue at the level of the
> Unicode Standard, instead of at the IDN level? In other words, we will
> have a mess if every place accepts Unicode (as identity, say user name)
> and renders it properly as what we expect to see, if we don't start to
> work on some specification to prevent similar things from happening in
> other protocols/places.... Go back to the root: it is a deception that
> exploits the gap between the code point and human visual recognition,
> and it could happen anywhere.


Perhaps we can take inspiration from something that we already have in 
mail.  For example, when I see your address above, it looks like 
"Frank Yung-Fong Tang <ytang0648@aol.com>".  In this the first part is 
clearly intended to be the human-readable portion, and it would be 
reasonable for you to put arbitrary Unicode in it--Chinese characters, 
for example.  The second part is just as clearly intended to be the 
authoritative machine-readable address.

In IDN we have something similar, with important differences.  There 
is a human-readable version of the domain name, and there is an 
encoded ASCII version.  The most significant difference here is that 
there is a standard round-trip conversion between the two.  However, 
this standard is showing certain failings, not in the round-trip 
conversion between ASCII encoding and Unicode, but rather in the other 
portion of the loop--from Unicode to glyphs on the screen to human 
readability and back to typing in.  These failings suggest that we 
should not place quite so much reliance on this conversion standard.
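
To make the distinction concrete, here is a minimal sketch using Python's
built-in "idna" codec (which follows the IDNA 2003 rules). The helper names
to_ascii and to_unicode and the domain "paypal.example" are assumptions
chosen for illustration, not anything from the thread. It shows that the
ASCII/Unicode round trip is exact, and that two names which may look
identical on screen still encode to different ASCII forms.

```python
def to_ascii(name: str) -> str:
    """Encode a Unicode domain name to its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")


def to_unicode(ace: str) -> str:
    """Decode an ASCII-compatible domain name back to Unicode."""
    return ace.encode("ascii").decode("idna")


# Hypothetical example names. The second one replaces the Latin "a"
# after the "p" with U+0430 CYRILLIC SMALL LETTER A.
latin = "paypal.example"
mixed = "p\u0430ypal.example"

# The machine-level round trip is lossless for both names.
print(to_unicode(to_ascii(latin)) == latin)
print(to_unicode(to_ascii(mixed)) == mixed)

# The encoded forms differ even where the rendered glyphs may not.
print(to_ascii(latin))
print(to_ascii(mixed))
```

The weak link is therefore not the codec but the step the codec never
sees: a person reading the rendered glyphs.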

Perhaps we can develop a presentation form for IDN that would include 
both the human-readable Unicode and also the authoritative 
ASCII-encoded version, in a way similar to that used for email 
addresses.  This would make the Unicode available for readability, but 
it would also make it clear that the Unicode portion is not to be 
relied on as authoritative (at least by human readers) for 
distinguishing one name from another.  It would also supply the 
ASCII-encoded version for typing in, or copying and pasting--something 
that would be convenient in many cases, especially since many 
applications are not IDN-savvy, but also because some Unicode names 
will not be easy to reproduce accurately by typing.
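
One possible shape for such a presentation form, sketched in Python under
assumptions of my own: the function name present, the angle-bracket layout
borrowed from mail headers, and the example names are all hypothetical, and
the built-in "idna" codec stands in for whatever conversion an application
actually uses.

```python
def present(name: str) -> str:
    """Render a domain name as 'unicode <ascii>', in the style of a mail address.

    Names that are already plain ASCII are returned unchanged, since there
    is no separate human-readable form to show.
    """
    ace = name.encode("idna").decode("ascii")
    if ace == name:
        return name
    # Unicode part for reading, ASCII part as the authoritative form
    # for comparing, typing, or copying and pasting.
    return f"{name} <{ace}>"


print(present("example.com"))
print(present("b\u00fccher.example"))
```

With this layout a lookalike name would still display its distinct ASCII
form alongside the Unicode, giving the reader something reliable to compare.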

Douglas Davidson
Received on Monday, 14 February 2005 18:32:38 UTC
