Re: Registration of new charset "UTF-16"

Larry Masinter wrote:

> I sent out a poll to the HTTP working group: are there two independent
> interoperable implementations of the HTTP 'exception' that send and
> process text types that don't use CR, LF, or CRLF for end of line?
> If we can't find two independent interoperable implementations, we
> may have to remove the 'feature' before we can progress HTTP/1.1 to
> Draft Standard.

As others have said, the Netscape and Alis clients support UCS-2, and MSIE
supports it to some degree too. (At least as far as end-of-line issues are
concerned, which are relatively trivial.)

Netscape looks for the HTTP charset parameter, and recognizes the following
UCS-2-related charset names:

ISO-10646-UCS-2
csUnicode11
ISO-10646-UCS-BASIC
csUnicodeASCII
ISO-10646-Unicode-Latin1
csUnicodeLatin1
ISO-10646
ISO-10646-J-1

The first one is the "main" one. Do Alis and MS use these names too?
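
For illustration only, the name matching described above might be sketched
as follows in Python. This is not Netscape's code. The helper names
(charset_from_content_type, is_ucs2_charset) are hypothetical, and
case-insensitive comparison is an assumption based on how charset names are
normally treated.

```python
from email.message import Message

# Charset names listed in the message above, stored lowercased
# (assumption: charset names are compared case-insensitively).
UCS2_CHARSET_NAMES = frozenset(
    name.lower()
    for name in (
        "ISO-10646-UCS-2",
        "csUnicode11",
        "ISO-10646-UCS-BASIC",
        "csUnicodeASCII",
        "ISO-10646-Unicode-Latin1",
        "csUnicodeLatin1",
        "ISO-10646",
        "ISO-10646-J-1",
    )
)


def charset_from_content_type(content_type):
    """Return the lowercased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def is_ucs2_charset(content_type):
    """True if the Content-Type value names one of the charsets above."""
    charset = charset_from_content_type(content_type)
    return charset is not None and charset in UCS2_CHARSET_NAMES
```

A header value such as "text/html; charset=ISO-10646-UCS-2" would match,
while "text/html; charset=iso-8859-1" or a value with no charset parameter
would not.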

If there is no HTTP charset, we try to detect UCS-2 by looking for a leading
0xFEFF, or its byte-swapped form 0xFFFE (which indicates little-endian). An
early implementation looked for zero bytes, but this was unreliable since
some people (Gopher, if I remember correctly) actually use zero bytes in
non-UCS-2 text.
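
A minimal sketch of that kind of check, in Python, assuming the signature
occupies the first two bytes of the body. The function names are
hypothetical, and Python's UTF-16 codecs are used as a stand-in for UCS-2
decoding (they agree for text without surrogate pairs).

```python
def detect_ucs2_byte_order(data):
    """Return "big", "little", or None based on the first two bytes."""
    first = data[:2]
    if first == b"\xfe\xff":
        # Bytes FE FF: the signature 0xFEFF read in big-endian order.
        return "big"
    if first == b"\xff\xfe":
        # Bytes FF FE: read big-endian this is 0xFFFE, so the text
        # is little-endian.
        return "little"
    # No signature found. Zero bytes are deliberately not used as a hint.
    return None


def decode_ucs2(data):
    """Decode the body if a signature is present, otherwise return None."""
    order = detect_ucs2_byte_order(data)
    if order is None:
        return None
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    # Skip the two signature bytes before decoding.
    return data[2:].decode(codec)
```

For example, decode_ucs2(b"\xff\xfe" + "Hi\r\n".encode("utf-16-le")) returns
the text with its CRLF intact, while plain ASCII input returns None.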

It might be a good idea to do some more extensive UCS-2 interoperability
testing, including charset name testing and end-of-line testing. It sounds
like Makoto has already done some testing.

Erik



Received on Tuesday, 19 May 1998 11:23:36 UTC