Re: Reviewed charmod fundamentals

Quoting Elliotte Rusty Harold:

> At 3:08 PM -0800 3/5/04, Tim Bray wrote:
> >This is controversial.  I think in general this is reasonable, with 
> >the single exception of doing what XML did and blessing both UTF-8 
> >and UTF-16.  The problem with a single encoding is that it forces 
> >people to choose between being Java/C# friendly (UTF-16) and C/C++ 
> >friendly (UTF-8).  Later on, you in fact seem to agree with this 
> >point.  Furthermore it's trivially easy to distinguish between UTF-8 
> >and UTF-16 if you specify a BOM.  But I think that if I were 
> >defining the next CSS or equivalent I'd like to be able to say 
> >"UTF-8 or UTF-16" without feeling guilty.
>
> Speaking as a Java programmer, I do not find UTF-8 to be less Java 
> friendly than UTF-16. Both UTF-8 and UTF-16 need to be passed through 
> a Reader on input and a Writer on output for any sort of robustness 
> to apply.  Which one I choose to use is almost never based on Java's 
> internal storage format for Strings.

Similarly, speaking as a C++ programmer, I do not find UTF-16 to be less C++
friendly than UTF-8.

However, I agree with Tim that letting the author or producing application
choose between UTF-8 and UTF-16 (and hence requiring the consuming application
to distinguish and handle both) is good practice and should be allowed by the
charmod rules.
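
To illustrate how small that burden on the consuming application is when a BOM
is present, here is a minimal sketch in Python. The names sniff_bom and decode
are hypothetical, not taken from charmod or any library. Falling back to UTF-8
when no BOM is found is an assumption of the sketch, not a rule from the draft,
and UTF-32 is deliberately ignored since only UTF-8 and UTF-16 are under
discussion.

```python
import codecs
from typing import Tuple


def sniff_bom(data: bytes) -> Tuple[str, int]:
    """Return (encoding name, BOM length) from the leading bytes.

    Hypothetical helper: it only distinguishes UTF-8 from UTF-16.
    """
    if data.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "utf-8", len(codecs.BOM_UTF8)
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    # No BOM: assume UTF-8 (an assumption of this sketch).
    return "utf-8", 0


def decode(data: bytes) -> str:
    """Strip the BOM, if any, and decode with the detected encoding."""
    encoding, bom_length = sniff_bom(data)
    return data[bom_length:].decode(encoding)
```

A consumer would call decode() on the raw bytes before doing anything else.
Without a BOM the two encodings cannot be told apart this cheaply, which is why
Tim's "if you specify a BOM" qualifier matters.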

Jon Hanna
"…it has been truly said that hackers have even more words for
equipment failures than Yiddish has for obnoxious people." - jargon.txt

Received on Monday, 8 March 2004 07:04:39 UTC