RE: Problem in publishing multilingual HTML document on web in UTF-8 encoding

From: Paul Nelson (ATC) <paulnel@winse.microsoft.com>
Date: Mon, 5 Jun 2006 02:45:33 -0700
Message-ID: <49C257E2C13F584790B2E302E021B6F9100C7F97@winse-msg-01.segroup.winse.corp.microsoft.com>
To: "L. David Baron" <dbaron@dbaron.org>, <www-html@w3.org>

What do you suggest to ensure that page authors declare the correct
charset (if any) on their pages?

Paul 

-----Original Message-----
From: www-html-request@w3.org [mailto:www-html-request@w3.org] On Behalf
Of L. David Baron
Sent: Monday, June 05, 2006 2:04 PM
To: www-html@w3.org
Subject: Re: Problem in publishing multilingual HTML document on web in
UTF-8 encoding

On Thursday 2006-06-01 20:04 -0700, Paul Nelson (ATC) wrote:
> Second, I know that we have autodetection for codepage of a 
> document...just in case the user never set that in the page. The 
> autodetection has worked well for a number of years.

It might work well for a browser that has majority market share (so that
most authors test their pages in it) and that doesn't change very often.

It might not work so well if you ever want to change the algorithm.  For
example, detecting an encoding you didn't previously support might cause
a page that used to work to be detected as the newly supported encoding.
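
A minimal sketch of that failure mode, assuming a toy detector that
simply returns the first candidate encoding that decodes without error
and otherwise falls back to Latin-1. This is an illustration only, not
any browser's actual algorithm; the names detect, OLD_CANDIDATES and
NEW_CANDIDATES are hypothetical.

```python
from typing import Sequence

def detect(data: bytes, candidates: Sequence[str], fallback: str = "latin-1") -> str:
    """Return the first candidate that decodes strictly, else the fallback."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        return name
    return fallback

# Hypothetical detector versions: the newer one also recognises GBK.
OLD_CANDIDATES = ["utf-8"]
NEW_CANDIDATES = ["utf-8", "gbk"]

# A page authored in Latin-1 with no declared charset.
page = "créateur".encode("latin-1")  # b'cr\xe9ateur'

old_guess = detect(page, OLD_CANDIDATES)
new_guess = detect(page, NEW_CANDIDATES)

print(old_guess, repr(page.decode(old_guess)))
print(new_guess, repr(page.decode(new_guess)))
```

With the old candidate list the page falls through to Latin-1 and reads
as the author intended. With the new list the byte pair 0xE9 0x61 also
happens to be a valid GBK sequence, so the same unchanged page is now
detected as GBK and displays differently.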

It also makes it harder for browsers to interoperate.  If the character
encoding autodetection rules that pages depend on are not documented and
freely implementable, then it's much harder for others to implement them.

-David

-- 
L. David Baron                                <URL: http://dbaron.org/ >
           Technical Lead, Layout & CSS, Mozilla Corporation
Received on Monday, 5 June 2006 09:45:30 GMT
