
Re: Fwd: I-D ACTION:draft-hoffman-utf16-03.txt

From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
Date: Thu, 06 May 1999 16:47:35 +0900
To: ietf-charsets@iana.org
Message-id: <199905060747.AA00397@archlute.apsdc.ksp.fujixerox.co.jp>

Paul Hoffman / IMC wrote:
> However, please do read the draft and 
> let us know if you think it is ready (or almost ready) to be sent off for 
> RFChood.

I think it is almost ready, but I have a nit.

>4.3 Interpreting text labelled as UTF-16
>
>Text labelled with the "UTF-16" charset might be serialized in either
>big-endian or little-endian order. If the first two octets of the text
>is 0xFE followed by 0xFF, then the text can be interpreted as being
>big-endian. If the first two octets of the text is 0xFF followed by
>0xFE, then the text can be interpreted as being little-endian. ...

I think that a leading 0xFE 0xFF or 0xFF 0xFE in this case (charset = "utf-16")
is always a byte order mark and never a zero-width non-breaking space.  I would
like the draft to state this explicitly, because 3.2 says that "the character
0xFEFF in the first position of a stream MAY be interpreted as a zero-width
non-breaking space, and is not always a byte-order mark."
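
To illustrate the reading I am proposing, here is a minimal Python sketch. The function names are hypothetical and nothing here comes from the draft itself. It assumes that, for text labelled charset = "utf-16", a leading 0xFE 0xFF or 0xFF 0xFE is always consumed as a byte order mark and never kept as a U+FEFF character. What to do when neither pair is present is not covered by the passage quoted above, so the sketch leaves that case undetermined.

```python
import codecs


def detect_utf16_byte_order(data: bytes):
    """Return (byte_order, payload) for text labelled charset="utf-16".

    Assumption: a leading BOM is always a signature, so it is stripped
    from the payload rather than decoded as a character.
    """
    if data[:2] == codecs.BOM_UTF16_BE:  # 0xFE 0xFF
        return "big-endian", data[2:]
    if data[:2] == codecs.BOM_UTF16_LE:  # 0xFF 0xFE
        return "little-endian", data[2:]
    # No BOM: this sketch does not decide the byte order.
    return None, data


def decode_labelled_utf16(data: bytes) -> str:
    """Decode text labelled "utf-16", using the BOM to pick the byte order."""
    order, payload = detect_utf16_byte_order(data)
    if order == "big-endian":
        return payload.decode("utf-16-be")
    if order == "little-endian":
        return payload.decode("utf-16-le")
    raise ValueError("no byte order mark; byte order not determined here")
```

Under this reading, the octets 0xFE 0xFF 0x00 0x41 and 0xFF 0xFE 0x41 0x00 both decode to the single character "A", with no U+FEFF left at the front of the result.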

Cheers,

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
Received on Thursday, 6 May 1999 03:49:06 GMT
