- From: Tim Bray <tbray@textuality.com>
- Date: Fri, 5 Mar 2004 15:08:32 -0800
- To: Tim Bray <tbray@textuality.com>
- Cc: www-tag@w3.org <www-tag@w3.org>
- Message-Id: <0606CBCE-6EFA-11D8-95ED-000A95A51C9E@textuality.com>
> http://www.w3.org/TR/2004/WD-charmod-20040225
>
> I'm sending a bunch of corrections but almost all are editorial, or
> minor errors of fact, and not worthy of the TAG's time. I really only
> found one thing

Now that I've sent 'em off to the feedback address, I've changed my mind and think that there may be two more issues in here with architectural weight:

=======================================================

C016 [S] When designing a new protocol, format or API, specifications SHOULD mandate a unique character encoding.

This is controversial. I think in general this is reasonable, with the single exception of doing what XML did and blessing both UTF-8 and UTF-16. The problem with a single encoding is that it forces people to choose between being Java/C# friendly (UTF-16) and C/C++ friendly (UTF-8). Later on, you in fact seem to agree with this point. Furthermore, it's trivially easy to distinguish between UTF-8 and UTF-16 if you specify a BOM. But I think that if I were defining the next CSS or equivalent I'd like to be able to say "UTF-8 or UTF-16" without feeling guilty.

========================================================

I don't see anywhere that it recommends that if you're using UTF-16 you always use a BOM, and that seems like a basic good practice, particularly if you're going to allow either UTF-8 or UTF-16.
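
To illustrate how cheap the BOM check is, here is a minimal Python sketch for a hypothetical format that allows only UTF-8 or UTF-16. The function names `sniff_encoding` and `decode_text` are made up for this example, and falling back to UTF-8 when no BOM is present is an assumption of the sketch, not something charmod or XML specifies in this form.

```python
import codecs


def sniff_encoding(data: bytes) -> str:
    """Pick a Python codec name by looking only at a leading BOM.

    Assumes the format permits nothing but UTF-8 and UTF-16, so the
    UTF-16 LE BOM (FF FE) cannot be confused with the UTF-32 LE BOM.
    """
    if data.startswith(codecs.BOM_UTF8):
        # EF BB BF: UTF-8 with a signature; "utf-8-sig" strips it.
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        # FE FF or FF FE: the "utf-16" codec reads the BOM itself
        # to determine byte order, then drops it from the output.
        return "utf-16"
    # Assumption for this sketch: no BOM means UTF-8.
    return "utf-8"


def decode_text(data: bytes) -> str:
    """Decode bytes using the encoding chosen by sniff_encoding."""
    return data.decode(sniff_encoding(data))
```

With this scheme, UTF-16 input that lacks a BOM would be misread as UTF-8 (and would usually fail to decode), which is the reason for always requiring a BOM on UTF-16 when both encodings are allowed.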
Received on Friday, 5 March 2004 18:08:37 UTC