Re: Revised proposal for UTF-16 from Harald Alvestrand on 1998-07-24 (ietf-charsets@w3.org from July to September 1998)

From: Harald Alvestrand <Harald.Alvestrand@maxware.no>
Date: Fri, 24 Jul 1998 11:15:38 +0200
To: "Martin J. Duerst" <duerst@w3.org>, erik@netscape.com
Cc: Dan Kegel <dank@alumni.caltech.edu>, MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>, Chris Newman <Chris.Newman@INNOSOFT.COM>, ietf-charsets@ISI.EDU, murata@fxis.fujixerox.co.jp, Tatsuo_Kobayashi@justsystem.co.jp
Message-id: <3.0.2.32.19980724111538.021012d0@127.0.0.1>

At 14:51 24.07.98 +0900, Martin J. Duerst wrote:
>What I think we should worry about is whether and how UTF-16 should be used
>in traditional protocol headers, based on MIME encoded words. Several
>solutions are possible:
>
>- Discourage or disallow UTF-16 in such headers (there are other
>  cases, in particular Korean Email, where there are differences
>  between the encoding used in the header and in the body).
>

This is reasonable.

>- Use a different specification for these headers (headers would
>  probably be in big-endian without a BOM, and nothing else,
>  bodies could tolerate little-endian and/or recommend/mandate
>  the BOM). The difference is justified because headers need
>  additional encoding/decoding anyway, and the user expectations
>  for their legibility are somewhat lower.

This means that there are two almost-equal character sets.
Since they're not completely equal, they have to have different names.
That this seems attractive is an example of why I don't think mandating
the BOM is likely to be a Good Idea for all cases of UTF-16.
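
A minimal sketch of why the two variants are not interchangeable, assuming
Python's standard codecs stand in for the "header" form (big-endian, no BOM)
and one possible "body" form (BOM followed by little-endian data). The sample
text and the helper names decode_header and decode_body are hypothetical and
only illustrate the point; they are not taken from any draft.

```python
import codecs

text = "Blåbær"  # arbitrary sample string (assumption)

# "Header" variant from the quoted proposal: big-endian, no BOM.
header_bytes = text.encode("utf-16-be")

# One possible "body" variant: BOM followed by little-endian data.
body_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def decode_header(data: bytes) -> str:
    """Decode strictly as big-endian, with no BOM handling."""
    return data.decode("utf-16-be")


def decode_body(data: bytes) -> str:
    """Honor a leading BOM if present; otherwise assume big-endian."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    return data.decode("utf-16-be")


# The same text yields different octets under the two variants.
print(header_bytes.hex())
print(body_bytes.hex())

# The lenient body decoder reads both forms correctly...
print(decode_body(header_bytes) == text)
print(decode_body(body_bytes) == text)

# ...but the strict header decoder misreads body-style data.
print(decode_header(body_bytes) == text)
```

A receiver that implements only the strict form cannot read data in the
lenient form, which is why the two would need different charset labels.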

>- Use exactly the same specifications for both headers and bodies.

This is reasonable.

               Harald A

-- 
Harald Tveit Alvestrand, Maxware, Norway
Harald.Alvestrand@maxware.no



Received on Friday, 24 July 1998 02:19:44 UTC