[whatwg] Internal character encoding declaration

From: Henri Sivonen <hsivonen@iki.fi>
Date: Mon, 13 Mar 2006 16:43:12 +0200
Message-ID: <CE2ED34E-352F-432C-A328-859FCE6E349C@iki.fi>
On Mar 13, 2006, at 16:12, Lachlan Hunt wrote:

> Henri Sivonen wrote:
>> Authors are advised not to use the UTF-32 encoding or legacy  
>> encodings. (Note: I think UTF-32 on the Web is harmful and utterly  
>> pointless,
>
> I agree about it being pointless, but why is it considered harmful?

Opportunity cost: The time that is spent implementing something  
pointless could be better spent doing something else--like  
implementing something useful.

Backwards incompatibility: Using UTF-32 instead of UTF-8 makes pages  
incompatible with older UAs for no good reason.

Size: UTF-32 takes more bytes to transfer than UTF-8--slow load, bad  
user experience.
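
As a rough illustration of the size point, the sketch below compares the encoded length of one sample string in both encodings. The sample text and the variable names are made up for this example and are not from the thread; the `-be` codec variant is used so that no byte order mark is counted.

```python
# Illustrative sketch: byte counts for the same text in UTF-8 and UTF-32.
# The sample string is arbitrary; markup-heavy pages are mostly ASCII like this.
sample = "<p>Internal character encoding declaration</p>"

utf8_bytes = sample.encode("utf-8")
# "utf-32-be" writes no byte order mark, so only the text itself is counted.
utf32_bytes = sample.encode("utf-32-be")

print("UTF-8 :", len(utf8_bytes), "bytes")
print("UTF-32:", len(utf32_bytes), "bytes")
```

UTF-32 always spends four bytes per code point, while UTF-8 spends one byte on each ASCII character, so an ASCII-only sample like this one comes out four times larger in UTF-32.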

>>  I'd like to have some text in the spec that justifies whining
>> about legacy encodings.
>
> What are your reasons for whining about legacy encodings and what  
> would you like the spec to say?

Using a legacy encoding that user agents are not guaranteed to  
support introduces incompatibility for no good reason. (I do not  
consider laziness or unwillingness to use UTF-8 good reasons.)

Even with well-supported legacy encodings, form submission is a problem.  
The "same as incoming" policy (submitting form data in the encoding of  
the page itself), combined with an encoding that cannot encode all of  
Unicode, leads to data loss.
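
A minimal sketch of that data loss, assuming a page served as ISO-8859-1 and a user who types a euro sign into a form field. The function name `submit_as` is hypothetical, and replacing unencodable characters with "?" is an assumption made for the example; what a real user agent does with such characters varies.

```python
# Illustrative sketch: a form value round-tripped through the page's encoding.
# submit_as is a hypothetical helper, not an API from any browser or library.
def submit_as(text: str, page_encoding: str) -> bytes:
    # Assumption: characters the encoding cannot represent become "?".
    return text.encode(page_encoding, errors="replace")

user_input = "10 \u20ac"  # "10 " followed by the euro sign

# ISO-8859-1 has no euro sign, so the character is lost on submission.
legacy_round_trip = submit_as(user_input, "iso-8859-1").decode("iso-8859-1")

# UTF-8 can encode all of Unicode, so the value survives intact.
utf8_round_trip = submit_as(user_input, "utf-8").decode("utf-8")

print(repr(legacy_round_trip))
print(repr(utf8_round_trip))
```

The server receiving the legacy-encoded submission has no way to recover the original character, which is the kind of loss described above.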

I would like the spec to say that if the page has forms, using an  
encoding other than UTF-8 is trouble. And even for pages that don't  
have forms, using an encoding that is not known to be extremely well  
supported introduces incompatibility for no good reason.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
Received on Monday, 13 March 2006 06:43:12 UTC
