Re: Charset reviewer appointed from Harald Tveit Alvestrand on 1998-07-27 (ietf-charsets@w3.org from July to September 1998)

From: Harald Tveit Alvestrand <Harald.Alvestrand@maxware.no>
Date: Mon, 27 Jul 1998 13:42:00 +0200
To: "Martin J. Duerst" <duerst@w3.org>, unicore@unicode.org
Cc: Multiple Recipients of Unicore <unicore@unicode.org>, kenw@sybase.com, ietf-charsets@iana.org
Message-id: <3.0.2.32.19980727134200.015698c0@dokka.maxware.no>

At 08:51 25.07.98 +0900, Martin J. Duerst wrote:
>However, please note that XML already decided to make
>the BOM mandatory for UTF-16. I told them that that was
>not something they should define, but they didn't listen.
>
>There would be a "way out" by saying that in that case,
>the BOM is part of an "intermediate layer" (no, it's
>of course not part of XML, because it's not present
>in UTF-8 or other encodings), and not part of UTF-16
>as defined above. But such a "way out" is really clumsy.

The BOM is part of the charset that UTF-16 represents.
Any application can say whatever it wants to *further restrict*
which characters can appear where; the part we couldn't tolerate
would be XML insisting upon strings that are *illegal* in the
registered UTF-16 while still calling the charset "UTF-16".

Ken Whistler wrote:

>With regards to Harald Alvestrand's summary of the open
>issues with respect to the UTF-16 registration, the only
>way I see forward, given the nature of the "charset"
>definition, is to split this request into two registrations:
>
>UTF-16   big-endian UTF-16
>UTF-16BS little-endian (byte-swapped) UTF-16

I see this as a reasonable thing to do.
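
To make the byte-order distinction concrete, here is a minimal sketch
of how the same text serializes in the two orders, and how a leading
BOM lets a receiver tell them apart. It uses Python's codec names
"utf-16-be" and "utf-16-le", which are not the labels proposed in this
thread. The helper decode_utf16 is hypothetical, and its fallback to
big-endian when no BOM is present is an assumption made for
illustration only.

```python
import codecs

text = "A"

# The same character serialized in the two byte orders.
big_endian = text.encode("utf-16-be")     # bytes 00 41
little_endian = text.encode("utf-16-le")  # bytes 41 00


def decode_utf16(data: bytes) -> str:
    """Hypothetical helper: pick the byte order from a leading BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return data[2:].decode("utf-16-le")
    # Assumption for this sketch: no BOM means big-endian.
    return data.decode("utf-16-be")


# A BOM-prefixed stream decodes correctly in either byte order.
with_bom_le = codecs.BOM_UTF16_LE + little_endian
with_bom_be = codecs.BOM_UTF16_BE + big_endian
```

Without the BOM, the byte-swapped bytes decode under the big-endian
assumption to a different character, which is why the two orders need
either a BOM or distinct labels.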

                          Harald

-- 
Harald Tveit Alvestrand, Maxware, Norway
Harald.Alvestrand@maxware.no


Received on Monday, 27 July 1998 05:08:42 UTC