- From: Harald Tveit Alvestrand <Harald.Alvestrand@maxware.no>
- Date: Mon, 27 Jul 1998 13:42:00 +0200
- To: "Martin J. Duerst" <duerst@w3.org>, unicore@unicode.org
- Cc: Multiple Recipients of Unicore <unicore@unicode.org>, kenw@sybase.com, ietf-charsets@iana.org
At 08:51 25.07.98 +0900, Martin J. Duerst wrote:
>However, please note that XML already decided to make
>the BOM mandatory for UTF-16. I told them that that was
>not something they should define, but they didn't listen.
>
>There would be a "way out" by saying that in that case,
>the BOM is part of an "intermediate layer" (no, it's
>of course not part of XML, because it's not present
>in UTF-8 or other encodings), and not part of UTF-16
>as defined above. But such a "way out" is really clumsy.
The BOM is part of the charset that UTF-16 represents.
Any application can say anything it wants to *further restrict*
which characters can appear where; the part we couldn't tolerate
would be XML insisting upon strings that are *illegal* in the registered
UTF-16 while still calling the charset "UTF-16".
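
To illustrate the distinction, here is a minimal sketch in Python, assuming
the BOM is simply the character U+FEFF encoded at the start of the stream.
The function names (has_utf16_bom, decode_requiring_bom) are hypothetical,
invented for this example; they do not come from XML or from the
registration. The second function models an application that *further
restricts* UTF-16 by demanding a BOM: everything it accepts is still legal
UTF-16.

```python
import codecs

# The BOM is just U+FEFF serialized in one of the two byte orders.
assert codecs.BOM_UTF16_BE == "\ufeff".encode("utf-16-be")
assert codecs.BOM_UTF16_LE == "\ufeff".encode("utf-16-le")


def has_utf16_bom(data: bytes) -> bool:
    """Return True if the byte string starts with a UTF-16 BOM (either order)."""
    return data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))


def decode_requiring_bom(data: bytes) -> str:
    """Hypothetical application-level rule: reject UTF-16 input without a BOM.

    This narrows what the application accepts; it does not make any
    accepted input illegal as UTF-16.
    """
    if not has_utf16_bom(data):
        raise ValueError("this application requires a leading BOM")
    # Python's "utf-16" codec reads the BOM to pick the byte order
    # and drops it from the decoded text.
    return data.decode("utf-16")
```

For example, decode_requiring_bom(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be"))
decodes normally, while the same bytes without the BOM prefix raise ValueError.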
Ken Whistler wrote:
>With regards to Harald Alvestrand's summary of the open
>issues with respect to the UTF-16 registration, the only
>way I see forward, given the nature of the "charset"
>definition, is to split this request into two registrations:
>
>UTF-16     big-endian UTF-16
>UTF-16BS   little-endian (byte-swapped) UTF-16
I see this as a reasonable thing to do.
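
As a rough sketch of what the split means at the byte level (Python; the
helper byte_swap is a hypothetical name, and the codec names "utf-16-be" and
"utf-16-le" are Python's own, not the labels proposed above): the two forms
differ only in the order of the two bytes within each 16-bit code unit.

```python
def byte_swap(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit code unit."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must have an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # second byte of each unit moves first
    swapped[1::2] = data[0::2]  # first byte of each unit moves second
    return bytes(swapped)


text = "A\u00e9"
big_endian = text.encode("utf-16-be")     # no BOM is written by this codec
little_endian = text.encode("utf-16-le")  # no BOM is written by this codec

# Byte-swapping one form yields the other, and swapping twice is a no-op.
assert byte_swap(big_endian) == little_endian
assert byte_swap(byte_swap(big_endian)) == big_endian
```

With two separate labels, a receiver would know the byte order from the
label alone and would not have to inspect the data for a BOM.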
Harald
--
Harald Tveit Alvestrand, Maxware, Norway
Harald.Alvestrand@maxware.no
Received on Monday, 27 July 1998 05:08:42 UTC