
RE: Feedback Requested: Unicode Security Considerations

From: Richard Ishida <ishida@w3.org>
Date: Wed, 6 Jul 2005 13:04:31 +0100
To: "'Mark Davis'" <mark.davis@jtcsv.com>, "'UnicoRe Mailing List'" <unicore@unicode.org>
Cc: "'GEO'" <public-i18n-geo@w3.org>
Message-Id: <20050706120430.B8C024F2C9@homer.w3.org>

Here are some personal comments, and some comments endorsed by the W3C's Core I18n WG (these are noted as such).


This is a very useful document.  Congratulations on pulling it together so well.


Richard Ishida

> -----Original Message-----
> From: unicore-bounce@unicode.org 
> [mailto:unicore-bounce@unicode.org] On Behalf Of Mark Davis
> Sent: 18 June 2005 02:30
> To: UnicoRe Mailing List
> Subject: Feedback Requested: Unicode Security Considerations
> Importance: High
>
> The security subcommittee has been working on UTR#36: Unicode 
> Security Considerations and its associated data files. We 
> would welcome review comments at this point.
>
> Please look over the document and data files within your 
> organization, and send comments to security@unicode.org by 
> 2005-06-27. We have a short timetable, so the earlier your 
> comments are in the better! The document is at 
> http://www.unicode.org/draft/reports/tr36/tr36.html.
>
> This document points at data files that are also available for review.
> However, to make things easier, we have put together a single 
> combined data file just for this review, at:
> http://www.unicode.org/draft/reports/tr36/data/review.txt. In 
> that file, for each code point currently allowed in 
> international domain names, it gives a breakdown according to 
> the profile recommended by TR36. Here are some sample lines 
> with explanations:
>
> 00C0  ; input           # (À)  LATIN CAPITAL LETTER A WITH GRAVE
> - allow character U+00C0 on input (but it gets case-folded to 
> an output character by IDNA)
>
> 00AA  ; input-lenient   # (ª)  FEMININE ORDINAL INDICATOR
> - allow character U+00AA on lenient input (but it gets 
> normalized to an output character by IDNA)
>
> 0027  ; remap-to-2019   # (')  APOSTROPHE
> - remap the character U+0027 on input, to U+2019, before 
> processing by IDNA
>
> 002D  ; output          # (-)  HYPHEN-MINUS
> - allow character U+002D in output (the result of IDNA)
>
> 00A1  ; prohibited ; not in XID+  # (¡)  INVERTED EXCLAMATION MARK
> - prohibit character U+00A1; a shorthand reason is in field 
> 3. In this case, "not in XID+" means that it doesn't follow 
> the Unicode identifier guidelines in UAX#31.
>
> The document and associated data files are 'live'; they may 
> be updated during the course of this review. We'd appreciate 
> it if you send the revision number of the file with your 
> comments. You will find it in the header, in the form 
> "$Revision: 1.2 $". (Most of the time having this won't 
> matter, but just in case...).
>
> The confusables.txt data file is still being worked on, and 
> is not yet ready for productive review. A separate note will 
> be sent when it is ready for review.
>
> Mark
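
The sample lines quoted above suggest a layout of semicolon-separated fields followed by a "#" comment. As a minimal sketch, assuming that layout holds for the whole of review.txt (the names ReviewEntry and parse_review_line are hypothetical and are not part of TR36 or its data files), such lines might be read in Python like this:

```python
from typing import NamedTuple, Optional


class ReviewEntry(NamedTuple):
    """One data line of the review file (hypothetical structure)."""
    code_point: int              # e.g. 0x00C0
    status: str                  # "input", "input-lenient", "remap", "output", "prohibited"
    remap_target: Optional[int]  # set only for "remap-to-XXXX" lines
    reason: Optional[str]        # optional field 3, e.g. "not in XID+"
    comment: str                 # text after "#", kept verbatim


def parse_review_line(line: str) -> Optional[ReviewEntry]:
    """Parse one line; return None for blank or comment-only lines."""
    # Split off the comment first, so a ";" inside the comment is not
    # mistaken for a field separator.
    data, _, comment = line.partition("#")
    fields = [f.strip() for f in data.split(";")]
    if len(fields) < 2 or not fields[0]:
        return None

    code_point = int(fields[0], 16)
    status = fields[1]
    remap_target = None
    # Assumption: a remap line encodes its target as "remap-to-<hex>".
    if status.startswith("remap-to-"):
        remap_target = int(status[len("remap-to-"):], 16)
        status = "remap"

    reason = fields[2] if len(fields) > 2 and fields[2] else None
    return ReviewEntry(code_point, status, remap_target, reason, comment.strip())


if __name__ == "__main__":
    samples = [
        "00C0  ; input           # (À)  LATIN CAPITAL LETTER A WITH GRAVE",
        "0027  ; remap-to-2019   # (')  APOSTROPHE",
        "00A1  ; prohibited ; not in XID+  # (¡)  INVERTED EXCLAMATION MARK",
    ]
    for sample in samples:
        print(parse_review_line(sample))
```

Keeping the remap target as a separate field, rather than leaving it inside the status string, is only one possible design; it makes it easier to group code points by status when reviewing the file.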
Received on Wednesday, 6 July 2005 12:04:36 UTC
