RE: New draft-yergeau-rfc2279bis-05.txt

Hi Markus,

I agree that UTF-8 should be defined based on Unicode/4.0, but...

Stringprep (RFC 3454) states that it is _only_ valid for use with
Unicode/3.2.  Which means that IETF protocols now writing Stringprep
profiles are _only_ valid for use with Unicode/3.2.  RFC 3454 actually
says it must be revised before Stringprep can be used with a later
version of Unicode.  New base tables (not just profiles) must be
published.
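
A minimal sketch of that version coupling, assuming a CPython runtime:
its standard `stringprep` module exposes the RFC 3454 tables, and
`unicodedata.ucd_3_2_0` is a frozen Unicode 3.2 database kept alongside
the runtime's current one. The helper name `stringprep_view` is
hypothetical and used only for illustration; this is not part of
RFC 3454 or of any draft discussed here.

```python
import stringprep
import unicodedata

# Frozen Unicode 3.2 database bundled with CPython; the RFC 3454 tables
# in the stringprep module are defined against this version.
ucd32 = unicodedata.ucd_3_2_0


def stringprep_view(ch):
    """Hypothetical helper: compare the Unicode 3.2 view of ch with the current one."""
    return {
        # RFC 3454 table A.1: code points unassigned in Unicode 3.2
        "unassigned_in_3_2": stringprep.in_table_a1(ch),
        # RFC 3454 table B.1: characters commonly mapped to nothing
        "mapped_to_nothing": stringprep.in_table_b1(ch),
        "category_3_2": ucd32.category(ch),
        "category_current": unicodedata.category(ch),
    }


soft_hyphen = stringprep_view("\u00ad")   # SOFT HYPHEN
d_with_curl = stringprep_view("\u0221")   # a character first assigned in Unicode 4.0

print(ucd32.unidata_version, unicodedata.unidata_version)
print(soft_hyphen)
print(d_with_curl)
```

A code point such as U+0221 still counts as unassigned in table A.1,
whatever Unicode version the rest of the system uses, which is the
practical effect of the tables being frozen at Unicode/3.2.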

Restricting IETF protocols to use of Unicode/3.2 is not a desirable
outcome of the IETF's wide support for the Stringprep approach.

Cheers,
- Ira McDonald,
  High North Inc


-----Original Message-----
From: Markus Scherer [mailto:markus.scherer@jtcsv.com]
Sent: Tuesday, June 10, 2003 1:21 PM
To: charsets
Subject: Re: New draft-yergeau-rfc2279bis-05.txt


McDonald, Ira wrote:
> Which reminds me that the recently published RFC 3454 (December 2002)
> is based on Unicode/3.2 (of course).  But there are (I believe) some
> new characters registered in Unicode/4.0.  Also, Markus Kuhn recently
> made a good point on the Linux I18N list: the character class of
> SOFT HYPHEN just changed in Unicode/4.0 (which affects Stringprep).

None of this affects the definition of UTF-8. The reference to Unicode 4 is
for the definition of the character encoding scheme and related definitions.
Unicode 4 is useful because 1. it will be a book soon and 2. its description
of all of the core UTFs is much clearer and more explicit than before.

> Since a lot of IETF WGs are doing Stringprep profiles, it would be
> desirable that they were referencing Unicode/4.0 - thus new exclusion
> tables are needed, for example.

Only for new profiles, right?

markus

-- 
Opinions expressed here may not reflect my company's positions unless
otherwise noted.

Received on Tuesday, 10 June 2003 14:59:23 UTC