- From: Jon Hanna <jon@hackcraft.net>
- Date: Mon, 8 Mar 2004 12:04:36 +0000
- To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- Cc: Tim Bray <tbray@textuality.com>, "www-tag@w3.org" <www-tag@w3.org>
Quoting Elliotte Rusty Harold <elharo@metalab.unc.edu>:

> At 3:08 PM -0800 3/5/04, Tim Bray wrote:
>
> >This is controversial. I think in general this is reasonable, with
> >the single exception of doing what XML did and blessing both UTF-8
> >and UTF-16. The problem with a single encoding is that it forces
> >people to choose between being Java/C# friendly (UTF-16) and C/C++
> >friendly (UTF-8). Later on, you in fact seem to agree with this
> >point. Furthermore it's trivially easy to distinguish between UTF-8
> >and UTF-16 if you specify a BOM. But I think that if I were
> >defining the next CSS or equivalent I'd like to be able to say
> >"UTF-8 or UTF-16" without feeling guilty.
>
> Speaking as a Java programmer, I do not find UTF-8 to be less Java
> friendly than UTF-16. Both UTF-8 and UTF-16 need to be passed through
> a Reader on input and a Writer on output for any sort of robustness
> to apply. Which one I choose to use is almost never based on Java's
> internal storage format for Strings.

Similarly, speaking as a C++ programmer, I do not find UTF-16 to be less
C++ friendly than UTF-8.

However I agree with Tim's argument that allowing a choice of UTF-8 or
UTF-16 to be made by an author or producing application (and hence
mandating that the two be differentiated and handled by the consuming
application) is a good practice and should be allowed by the charmod
rules.

-- 
Jon Hanna
<http://www.hackcraft.net/>
"…it has been truly said that hackers have even more words for
equipment failures than Yiddish has for obnoxious people." - jargon.txt
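
For illustration only, here is a minimal sketch of how a consuming application might differentiate UTF-8 from UTF-16 when a BOM is present, as discussed above. It is not code from this thread: the function names `sniff_bom` and `decode_with_bom` are hypothetical, and the fallback to UTF-8 when no BOM is found is an assumption rather than anything charmod specifies. Only the two encodings under discussion are considered (a UTF-32 little-endian BOM, for example, would be misread as UTF-16).

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    return None


def decode_with_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode data using its BOM; fall back to `default` (an assumption) if absent."""
    encoding = sniff_bom(data)
    if encoding is None:
        return data.decode(default)
    # Strip the BOM itself before decoding: 3 bytes for UTF-8, 2 for UTF-16.
    bom_length = 3 if encoding == "utf-8" else 2
    return data[bom_length:].decode(encoding)
```

A producer that writes `codecs.BOM_UTF16_LE + text.encode("utf-16-le")` and one that writes `codecs.BOM_UTF8 + text.encode("utf-8")` would both be read back correctly by `decode_with_bom`, which is the kind of consumer-side handling the argument above depends on.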
Received on Monday, 8 March 2004 07:04:39 UTC