RE: Feedback Requested: Unicode Security Considerations

Here are some personal comments, and some comments endorsed by the W3C's Core I18n WG (these are noted as such).

This is a very useful document.  Congratulations on pulling it together so well.


Richard Ishida

contact info: 

W3C Internationalization: 

Publication blog:

> -----Original Message-----
> From: [] On Behalf Of Mark Davis
> Sent: 18 June 2005 02:30
> To: UnicoRe Mailing List
> Subject: Feedback Requested: Unicode Security Considerations
> Importance: High
> The security subcommittee has been working on UTR#36: Unicode 
> Security Considerations and its associated data files. We 
> would welcome review comments at this point.
> Please look over the document and data files within your 
> organization, and send comments to by 
> 2005-06-27. We have a short timetable, so the earlier your 
> comments are in the better! The document is at 
> This document points at data files that are also available for review.
> However, to make things easier, we have put together a single 
> combined data file just for this review, at:
> In 
> that file, for each code point currently allowed in 
> international domain names, it gives a breakdown according to 
> the profile recommended by TR36. Here are some sample lines 
> with explanations:
>
> 00C0  ; input           # (À)  LATIN CAPITAL LETTER A WITH GRAVE
> - allow character U+00C0 on input (but it gets case-folded to
> an output character by IDNA)
>
> 00AA  ; input-lenient   # (ª)  FEMININE ORDINAL INDICATOR
> - allow character U+00AA on lenient input (but it gets
> normalized to an output character by IDNA)
>
> 0027  ; remap-to-2019   # (')  APOSTROPHE
> - remap the character U+0027 on input, to U+2019, before
> processing by IDNA
>
> 002D  ; output          # (-)  HYPHEN-MINUS
> - allow character U+002D in output (the result of IDNA)
>
> 00A1  ; prohibited ; not in XID+  # (¡)  INVERTED EXCLAMATION MARK
> - prohibit character U+00A1; a shorthand reason is in field
> 3. In this case, "not in XID+" means that it doesn't follow
> the Unicode identifier guidelines in UAX#31.
>
> The document and associated data files are 'live'; they may 
> be updated during the course of this review. We'd appreciate 
> it if you send the revision number of the file with your 
> comments. You will find it in the header, in the form 
> "$Revision: 1.2 $". (Most of the time having this won't 
> matter, but just in case...).
> The confusables.txt data file is still being worked on, and 
> is not yet ready for productive review. A separate note will 
> be sent when it is ready for review.
> Mark
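
For anyone reviewing the combined data file with a script, the sample lines quoted above could be read roughly as follows. This is only a sketch based on those five samples: it assumes fields are separated by ";", that everything after the first "#" is a comment, and that a remap target is carried in the status as "remap-to-XXXX". The function name parse_profile_line and the dictionary layout are hypothetical and are not defined by UTR#36.

```python
def parse_profile_line(line):
    """Parse one data line in the format shown in the quoted samples.

    Returns a dict, or None for blank and comment-only lines.
    The field layout is inferred from the samples (an assumption).
    """
    # Everything after the first '#' is a human-readable comment.
    data, _, comment = line.partition("#")
    fields = [f.strip() for f in data.split(";")]
    if not fields[0]:
        return None  # blank or comment-only line
    if len(fields) < 2:
        raise ValueError("expected 'code point ; status': %r" % line)

    code_point = int(fields[0], 16)   # field 1: hex code point
    status = fields[1]                # field 2: profile status
    # field 3 (optional): shorthand reason, e.g. "not in XID+"
    reason = fields[2] if len(fields) > 2 and fields[2] else None

    # Assumed convention: "remap-to-2019" means remap to U+2019.
    remap_target = None
    prefix = "remap-to-"
    if status.startswith(prefix):
        remap_target = int(status[len(prefix):], 16)
        status = "remap"

    return {
        "code_point": code_point,
        "status": status,
        "reason": reason,
        "remap_target": remap_target,
        "comment": comment.strip() or None,
    }
```

Under these assumptions, the APOSTROPHE sample yields status "remap" with a remap target of U+2019, and the INVERTED EXCLAMATION MARK sample yields status "prohibited" with the reason "not in XID+".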

Received on Wednesday, 6 July 2005 12:04:36 UTC