Re: draft-hoffman-utf16-01.txt available

I see good reasons for having the BOM with charset="UTF-16".
I see no reason for having a BOM with charset="UTF-16BE" or
charset="UTF-16LE".

I think that if we have all three labels, and each of them more
or less says "use a BOM or not, as you like", we have the same
mess as before, just with more labels.

I think there are people who believe in the BOM, and others
who think it's a bad idea. My guess is that it's very difficult
to change that. But what we can do is make the sender's position
clear to the receiver. Basically, then, BOM-lovers would use
charset="UTF-16", and BOM-haters would use charset="UTF-16BE" or
charset="UTF-16LE". We would have several different things, but
we would know which is which.
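
A rough sketch of how a receiver might act on such labelling, in
Python and for illustration only. The function name decode_labelled
is hypothetical, and rejecting "UTF-16" data that lacks a BOM is an
assumption made for the sketch, not something the draft states:

```python
import codecs


def decode_labelled(data: bytes, charset: str) -> str:
    """Decode UTF-16 data according to the charset label alone."""
    label = charset.upper()
    if label == "UTF-16":
        # Sender is a "BOM-lover": the BOM tells us the byte order.
        if data.startswith(codecs.BOM_UTF16_BE):
            return data[2:].decode("utf-16-be")
        if data.startswith(codecs.BOM_UTF16_LE):
            return data[2:].decode("utf-16-le")
        # Assumption: treat a missing BOM as an error rather than guessing.
        raise ValueError("charset=UTF-16 but no BOM found")
    if label == "UTF-16BE":
        # Sender is a "BOM-hater": byte order comes from the label,
        # and nothing is stripped from the start of the data.
        return data.decode("utf-16-be")
    if label == "UTF-16LE":
        return data.decode("utf-16-le")
    raise ValueError("unsupported charset label: " + charset)
```

With this split, a leading U+FEFF in "UTF-16BE" or "UTF-16LE" data
is kept as an ordinary character instead of being read as a
byte-order signature.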

We wouldn't have to change XML, only add a clarification that
"UTF-16" in the XML spec means only the case charset="UTF-16",
and not the others.

What do you think about it?

Regards,   Martin.


At 19:40 99/02/01 -0800, Paul Hoffman / IMC wrote:
> At 11:39 AM 2/2/99 +0900, MURATA Makoto wrote:
> >I have a question.  I know that many people would like to make the 
> >BOM optional.  But what is the reason for making it optional?  
> >If we can say that the BOM is mandatory and is merely an artifact for 
> >encoding, this RFC becomes much simpler.
> 
> But it's pretty clear that not everyone would follow it, because some UTF-16
> editing software does BOMs and other software doesn't. So instead of making it
> mandatory and having many creators ignore it, it seems better to deal with
> the reality of today. I don't think that the wording is all that confusing.
> 
> --Paul Hoffman, Director
> --Internet Mail Consortium


#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org

Received on Tuesday, 2 February 1999 11:54:42 UTC