FW: Feedback Requested: Unicode Security Considerations

Here are some personal comments, and some comments endorsed by the W3C's Core I18n WG (these are noted as such).


This is a very useful document.  Congratulations on pulling it together so well.


Richard Ishida


> -----Original Message-----
> From: unicore-bounce@unicode.org
> [mailto:unicore-bounce@unicode.org] On Behalf Of Mark Davis
> Sent: 18 June 2005 02:30
> To: UnicoRe Mailing List
> Subject: Feedback Requested: Unicode Security Considerations
> Importance: High
> The security subcommittee has been working on UTR#36: Unicode Security 
> Considerations and its associated data files. We would welcome review 
> comments at this point.
> Please look over the document and data files within your organization, 
> and send comments to security@unicode.org by 2005-06-27. We have a 
> short timetable, so the earlier your comments are in the better! The 
> document is at http://www.unicode.org/draft/reports/tr36/tr36.html.
> This document points at data files that are also available for review.
> However, to make things easier, we have put together a single combined 
> data file just for this review, at:
> http://www.unicode.org/draft/reports/tr36/data/review.txt. In that 
> file, for each code point currently allowed in international domain 
> names, it gives a breakdown according to the profile recommended by 
> TR36. Here are some sample lines with explanations:
> 00C0  ; input           # (À)  LATIN CAPITAL LETTER A WITH GRAVE
> - allow character U+00C0 on input (but it gets case-folded to an 
> output character by IDNA)
> 00AA  ; input-lenient   # (ª)  FEMININE ORDINAL INDICATOR
> - allow character U+00AA on lenient input (but it gets normalized to 
> an output character by IDNA)
> 0027  ; remap-to-2019   # (')  APOSTROPHE
> - remap the character U+0027 on input, to U+2019, before processing by 
> 002D  ; output          # (-)  HYPHEN-MINUS
> - allow character U+002D in output (the result of IDNA)
> 00A1  ; prohibited ; not in XID+  # (¡)  INVERTED EXCLAMATION MARK
> - prohibit character U+00A1; a shorthand reason is in field 3. In this 
> case, "not in XID+" means that it doesn't follow the Unicode 
> identifier guidelines in UAX#31.
> The document and associated data files are 'live'; they may be updated 
> during the course of this review. We'd appreciate it if you send the 
> revision number of the file with your comments. You will find it in 
> the header, in the form
> "$Revision: 1.2 $". (Most of the time having this won't matter, but 
> just in case...).
> The confusables.txt data file is still being worked on, and is not yet 
> ready for productive review. A separate note will be sent when it is 
> ready for review.
> Mark
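
For anyone who wants to tally the review file by category while reading it, the sample lines quoted above could be parsed roughly as follows. This is only a sketch: it assumes every data line has the shape "code point ; status [; reason] # comment" as in the quoted samples, and the names ReviewEntry and parse_review_line are made up for illustration; they are not part of TR36 or its data files.

```python
from typing import NamedTuple, Optional


class ReviewEntry(NamedTuple):
    """One data line of the combined review file (hypothetical structure)."""
    code_point: int
    status: str                  # e.g. "input", "input-lenient", "remap", "output", "prohibited"
    remap_target: Optional[int]  # set only for "remap-to-XXXX" lines
    reason: Optional[str]        # optional field 3, e.g. "not in XID+"
    comment: str                 # text after the first "#"


def parse_review_line(line: str) -> Optional[ReviewEntry]:
    """Parse one line; return None for blank or comment-only lines."""
    # Everything after the first "#" is commentary (glyph and character name).
    data, _, comment = line.partition("#")
    fields = [f.strip() for f in data.split(";")]
    if not fields[0]:
        return None

    code_point = int(fields[0], 16)
    status = fields[1] if len(fields) > 1 else ""

    # "remap-to-2019" carries the target code point inside the status field.
    remap_target = None
    if status.startswith("remap-to-"):
        remap_target = int(status[len("remap-to-"):], 16)
        status = "remap"

    reason = fields[2] if len(fields) > 2 and fields[2] else None
    return ReviewEntry(code_point, status, remap_target, reason, comment.strip())


if __name__ == "__main__":
    samples = [
        "00C0  ; input           # (À)  LATIN CAPITAL LETTER A WITH GRAVE",
        "0027  ; remap-to-2019   # (')  APOSTROPHE",
        "00A1  ; prohibited ; not in XID+  # (¡)  INVERTED EXCLAMATION MARK",
    ]
    for sample in samples:
        print(parse_review_line(sample))
```

Splitting off the comment before splitting on ";" keeps a stray ";" or "#" in a character name or glyph from being mistaken for a field separator.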

Received on Tuesday, 12 July 2005 15:16:28 UTC