Re: URIs and i18n

Martin Duerst 2008-07-30 05.49:

>> One of the biggest concerns is with script mixing, where
>> ASCII and several local scripts get intermingled in an IRI.
>> In my opinion, this is quite a bad thing, leads to a great
>> deal of user confusion and potential for phishing - it's one
>> of the biggest things that should be explicitly restricted (a
>> few languages exist where script mixing is required, but
>> these are finite and definable as exceptions).
> I understand the dangers of mixing several scripts (not
> necessarily including ASCII, which isn't a script anyway) in
> the same IRI component (e.g. label for a domain name). Using
> different components with different scripts isn't really much
> of a problem, except for very rare and special cases (e.g. a
> Cyrillic component that looks exactly like a Latin one in an
> otherwise Latin IRI).

A few questions/comments in this regard:

The typographic trend has long been to make the different
alphabetic scripts of the world more similar. For instance,
modern Cyrillic has letters that are shaped similarly to Roman
letters. After all, Times New Roman, for instance, contains both
Cyrillic and Roman.

This is fine, and makes for pretty Web pages, PDFs and paper
documents, because it creates unified styles regardless of the
script in use.

But precisely for these security matters, it can hardly be a
positive thing that Cyrillic looks like Latin and vice versa.
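
To make the concern concrete, here is a minimal Python sketch of how
two labels can differ in code points while looking the same in many
fonts. The label "paypal" and the helper names rough_script and
scripts_in_label are only my illustrations. Python's standard library
does not expose the Unicode Script property, so the script is guessed
from the first word of the character name, which is a rough
assumption and not a proper script lookup.

```python
import unicodedata

def rough_script(ch):
    """Guess a character's script from the first word of its Unicode name.

    Heuristic only: the standard library has no Script property lookup.
    """
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def scripts_in_label(label):
    """Return the set of guessed scripts used by the letters in one label."""
    return {rough_script(ch) for ch in label if ch.isalpha()}

latin = "paypal"
# Same label with CYRILLIC SMALL LETTER A (U+0430) replacing Latin "a".
spoof = "p\u0430yp\u0430l"

print(latin == spoof)            # the strings differ
print(scripts_in_label(latin))   # a single script
print(scripts_in_label(spoof))   # two scripts mixed in one label
```

A label that yields more than one script would be a candidate for
closer inspection, though some languages legitimately mix scripts.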

It is possible, however (and it is often done for style variation
etc.), to write Cyrillic in a way that makes the letters stand out
from Latin.

Has this been considered? Is anyone proposing that? Should not
the use of fonts which clearly distinguish the different letters
and scripts be advised in this regard?

I think the main security risk involved with IRIs and IDNs is that 
the user is unable to judge for himself what he is reading.

Leif Halvard Silli

Received on Thursday, 31 July 2008 22:43:19 UTC