Re: Reviewed charmod fundamentals

>  http://www.w3.org/TR/2004/WD-charmod-20040225
>
> I'm sending a bunch of corrections but almost all are editorial, or 
> minor errors of fact, and not worthy of the TAG's time.  I really only 
> found one thing

Now that I've sent 'em off to the feedback address, I've changed my 
mind and think that there may be two more issues in here with 
architectural weight:

=======================================================

C016 [S] When designing a new protocol, format or API, specifications 
SHOULD mandate a unique character encoding.

This is controversial.  I think in general this is reasonable, with the 
single exception of doing what XML did and blessing both UTF-8 and 
UTF-16.  The problem with a single encoding is that it forces people to 
choose between being Java/C# friendly (UTF-16) and C/C++ friendly 
(UTF-8).  Later on, you in fact seem to agree with this point.  
Furthermore, it's trivially easy to distinguish between UTF-8 and 
UTF-16 if you specify a BOM.  So I think that if I were defining the 
next CSS or equivalent, I'd like to be able to say "UTF-8 or UTF-16" 
without feeling guilty.
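
As a rough sketch of the BOM check (Python, standard library only): the 
function names sniff_encoding and decode are made up for illustration, 
and the fallback to UTF-8 when no BOM is present is an assumption, not 
something the charmod draft says. It also assumes the format allows only 
UTF-8 or UTF-16, so no attempt is made to recognize UTF-32.

```python
import codecs

def sniff_encoding(data: bytes) -> str:
    """Pick a Python codec name from a leading BOM, if any.

    Assumption: only UTF-8 and UTF-16 are permitted, and input
    without a BOM is treated as UTF-8.
    """
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the EF BB BF signature while decoding.
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM to choose byte order
        # and removes it from the decoded text.
        return "utf-16"
    return "utf-8"

def decode(data: bytes) -> str:
    """Decode bytes using whichever encoding the BOM indicates."""
    return data.decode(sniff_encoding(data))
```

The check costs at most three bytes of lookahead, which is the sense in 
which telling the two encodings apart is cheap.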

========================================================

I don't see anywhere that the document recommends always using a BOM 
with UTF-16, and that seems like a basic good practice, particularly 
if you're going to allow either UTF-8 or UTF-16.
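
On the producing side, a minimal sketch of "always use a BOM" might look 
like this (Python again; encode_utf16_with_bom and its big_endian 
parameter are hypothetical names chosen for the example). It prepends 
the BOM explicitly because Python's "utf-16-le" and "utf-16-be" codecs 
do not write one themselves.

```python
import codecs

def encode_utf16_with_bom(text: str, big_endian: bool = False) -> bytes:
    """Encode text as UTF-16 with an explicit leading BOM.

    The byte order is chosen by the caller; little-endian is only
    an assumed default for this sketch.
    """
    if big_endian:
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

Output written this way can be read back with the generic "utf-16" 
codec, which uses the BOM to pick the byte order.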

Received on Friday, 5 March 2004 18:08:37 UTC