RE: Revised proposal for UTF-16

At 08:36 AM 5/26/98 +0200, Harald Alvestrand wrote:
>Question: For what data element size do we expect the BOM to be used?
>For long pieces of text, it's pretty obvious.
>But what about databases? Structured values? ASN.1 SET OFs?
>On all strings, the first string (whatever that means) or no string?
>
>I'm not worried about wasting space, but about clarity on when to use it.

Won't messages coded in UTF-16 usually have a clear
beginning, and be long enough that the two extra bytes
of overhead are not a problem?
For example, UTF-16 used in, say, some future HTTP would be
quite happy with this.
Maybe it's enough to worry about the issue only when messages
coded in UTF-16 touch the Internet, and not worry about
database internals; presumably people writing databases that
are not connected to the Internet can keep their byte order
straight without the IETF's help.

I think either of two ways can get us the clarity we crave:
1. Mandate a certain byte order for UTF-16 messages that
hit the Internet.
2. Mandate a BOM at the start of each UTF-16 message
that hits the Internet.

#1 is probably bitter medicine for those whose native
byte order is not the one chosen.  (Please correct me if I'm wrong.)
#2 is probably palatable to all concerned.
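
To make the difference concrete, here is a rough Python sketch of
what a receiver would do under each option. The function names are
made up for illustration, and the choice of big-endian for #1 is
only an assumption for the example; nothing above says which order
would win.

```python
import codecs

# Option 1 (assumed here to be big-endian purely for illustration):
# the byte order is fixed by the spec, so nothing is added on the wire.
def decode_fixed_order(data):
    return data.decode("utf-16-be")

# Option 2, sender side: prepend a BOM in whatever order the sender uses.
def encode_with_bom(text, byteorder="big"):
    if byteorder == "big":
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Option 2, receiver side: the first two bytes say which order follows.
def decode_with_bom(data):
    if data[:2] == codecs.BOM_UTF16_BE:
        return data[2:].decode("utf-16-be")
    if data[:2] == codecs.BOM_UTF16_LE:
        return data[2:].decode("utf-16-le")
    # A mandated BOM means a missing one is an error, not a guess.
    raise ValueError("no BOM at start of UTF-16 message")
```

Under #2 either sender's native order round-trips, at the cost of
two bytes per message; under #1 there is no overhead, but senders
with the other native order have to swap bytes before sending.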

Apologies to all if I'm out of line here.  
Actually, I tried to unsubscribe several times
a few years ago, and this exchange is my vengeance
upon the listserv for not letting me go :-)
- Dan

Received on Tuesday, 26 May 1998 00:14:53 UTC