Re: Fwd: I-D ACTION:draft-goldsmith-utf7-01.txt

At 11:29 05-02-97 -0800, David Goldsmith wrote:
>FYI
>
>If there are no objections I will try to advance this to RFC 
>(Experimental) status in two weeks, then register "UTF-7".

Good.  Some comments:

>Abstract
>
>...
>   This document describes a transformation format of Unicode that
>   contains only 7-bit ASCII characters and ...

This is misleading.  UTF-7 encodes UCS *characters* using only 7-bit
ASCII-valued *octets*.
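
To make the distinction concrete, here is a minimal sketch. It assumes Python and its built-in "utf-7" codec, neither of which is part of the draft, and the sample string is only an illustration. The input is a sequence of characters; the output is a sequence of octets whose values all fall in the 7-bit ASCII range.

```python
# Characters in, ASCII-valued octets out (Python's built-in "utf-7" codec
# is assumed here; the draft itself does not mention any implementation).
text = "Montréal"                  # a string of UCS characters
encoded = text.encode("utf-7")     # a bytes object, i.e. a string of octets

# Every octet has a value below 0x80, although "é" is not an ASCII character.
all_seven_bit = all(octet < 0x80 for octet in encoded)

# Decoding the octets gives back the original characters.
round_trip = encoded.decode("utf-7")

print(encoded, all_seven_bit, round_trip == text)
```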

>Overview
>
>   UTF-7 encodes Unicode characters as US-ASCII, together with...

Same remark.

>   UTF-7 should normally be used only in the context of 7 bit
>   transports, such as mail and news. In other contexts, straight
>   Unicode or UTF-8 is preferred.

Great!  Please remove "and news", however.  News is in effect 8-bit clean:
many newsgroups routinely use 8-bit charsets, and all widespread
implementations are 8-bit clean.  Even the IAB charset workshop report
(draft-weider-iab...) recognizes that.

>UTF-7 Definition
>
>   A UTF-7 stream represents 16-bit Unicode characters in 7-bit US-ASCII
>   as follows:

Suggestion: "represents ... using 7-bit ASCII-valued octets as follows"

>      Unicode is encoded using Modified Base64 by first converting
>      Unicode 16-bit quantities to an octet stream (with the most
>      significant octet first). Surrogate pairs (UTF-16) are converted
>      by treating each half of the pair as a separate 16 bit quantity
>      (i.e., no special treatment). Text with an odd number of octets is
>      ill-formed.

Since the draft refers to 10646 as well as Unicode, it might be worth
saying that UCS-4 characters outside of the range accessible through UTF-16
cannot be transformed by UTF-7.
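
The quoted rule and the limitation can be sketched as follows. This is an illustration in Python under stated assumptions, not text from the draft: the helper names utf16_units and modified_base64 are hypothetical, and the sketch covers only the Modified Base64 run, not the surrounding "+" and "-" shift characters.

```python
import base64

def utf16_units(code_point):
    """Return the 16-bit quantities UTF-16 uses for one UCS code point.

    Hypothetical helper for illustration; not defined by the draft.
    """
    if code_point > 0x10FFFF:
        # UCS-4 values beyond the range reachable through UTF-16
        # have no 16-bit representation, so UTF-7 cannot carry them.
        raise ValueError("code point not representable in UTF-16")
    if code_point < 0x10000:
        return [code_point]
    v = code_point - 0x10000
    # Surrogate pair: high half first, then low half.
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]

def modified_base64(code_points):
    """Encode code points as the Modified Base64 part of a UTF-7 run."""
    octets = bytearray()
    for cp in code_points:
        # Each half of a surrogate pair is treated as a separate
        # 16-bit quantity, with no special handling.
        for unit in utf16_units(cp):
            octets += unit.to_bytes(2, "big")  # most significant octet first
    # Modified Base64 drops the "=" padding of ordinary Base64.
    return base64.b64encode(bytes(octets)).rstrip(b"=").decode("ascii")

print(modified_base64([0xE9]))      # one 16-bit quantity
print(modified_base64([0x1F600]))   # two 16-bit quantities (surrogate pair)
```

Calling modified_base64([0x110000]) raises ValueError in this sketch, which is the case the remark above asks the draft to mention.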

>   2. Most non-European alphabet-based languages (e.g., Greek)...

The Greeks will surely be surprised to learn that they are not European :-)

Regards,


-- 
François Yergeau <yergeau@alis.com>
Alis Technologies Inc., Montréal
Tel: +1 (514) 747-2547
Fax: +1 (514) 747-2561

Received on Wednesday, 5 February 1997 18:52:43 UTC