Re: Auto-detect and encodings in HTML5

Larry Masinter wrote on 2009-06-01 00:45:
> Changing the default charset from *something
> well known* to *something else* would be a bad
> idea -- that would be "default charset switching".
> But changing the charset from "unknown, please guess"
> to "UTF-8" doesn't seem like it is "default
> charset switching", it's "default charset 
> setting".


> Default charset setting may not be
> a good reason for a version indicator, but
> it's a supporting reason.


> If there were other reasons for having a version
> indicator (e.g., to support authoring requirements),
> the version indicator could also indicate default
> charset UTF-8.


Maciej Stachowiak wrote on Sunday, May 31, 2009 3:35 PM:

>> I think it would be pretty poor if some indicator of the document  
>> version (e.g. the doctype or as suggested by someone else a version  
>> parameter in the Content-Type header) changed the default charset.  
>> There are two reasons I say this:
>> 1) It goes against our desire to allow for gradual adoption. If  
>> changing your doctype declaration could have the side effect of  
>> changing your charset from Windows-1252 ("Windows Latin-1") to UTF-8,  
>> that would be a serious risk of breaking upgraded documents.

How so? Wouldn't this rather /encourage/ gradual adoption by
attracting authors to it? One would probably find that authors
switch doctype /only/ to get this effect, even if they did not
otherwise rework their pages. Why would a Windows Latin-1
document be switched to the HTML 5 doctype if doing so otherwise
had no effect? In fact, this change could prevent switches made
purely because the new doctype is "cool".

The HTML 5 doctype saves authors from typing. This effect would 
save many of them from typing the charset as well.

Such a change would also be very much in line with the "support 
world languages" principle. [1]

>> 2) Doctype and Content-type parameter are both opt-in mechanisms. But  
>> there's already explicit ways to opt in to UTF-8: the charset  
>> parameter on Content-type, or a <meta> tag in the document. Explicit  
>> opt-in seems better to me than implicit, since it's more likely the  
>> author will be making a change intentionally. [...]

What do you mean by saying that DOCTYPE is an opt-in? The draft 
says that "A DOCTYPE is a mostly useless, but required, header."
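
To make the mechanisms under discussion concrete, here is a rough
Python sketch of the order in which an encoding could be chosen:
the charset parameter on Content-Type, then a <meta> declaration,
then a default. The function pick_encoding and its flag
doctype_sets_utf8 are hypothetical names invented for this
illustration; they come from no draft and no browser, and the
regular expressions are a simplification of real sniffing.

```python
import re


def pick_encoding(content_type, body, doctype_sets_utf8=False):
    """Hypothetical helper: choose an encoding for an HTML byte string."""
    # 1. Explicit opt-in: charset parameter on the Content-Type header.
    if content_type:
        m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.I)
        if m:
            return m.group(1).lower()
    # 2. Explicit opt-in: a <meta> charset declaration near the start.
    m = re.search(rb'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)',
                  body[:1024], re.I)
    if m:
        return m.group(1).decode("ascii").lower()
    # 3. Nothing declared: fall back to a default. The flag models the
    #    proposal that the HTML 5 doctype implies UTF-8 (an assumption
    #    made for this sketch only).
    if doctype_sets_utf8 and re.match(rb'\s*<!doctype\s+html\s*>', body, re.I):
        return "utf-8"
    return "windows-1252"


# A Windows Latin-1 page whose author only swapped in the new doctype.
page = b"<!DOCTYPE html><title>caf\xe9</title>"

legacy = pick_encoding("text/html", page)
proposed = pick_encoding("text/html", page, doctype_sets_utf8=True)

print(page.decode(legacy))  # decodes cleanly as Windows-1252
try:
    page.decode(proposed)
except UnicodeDecodeError:
    print("the same bytes are not valid UTF-8")
```

With the flag off, the page decodes as before; with it on, the
lone 0xE9 byte is rejected as UTF-8, which is the breakage Maciej
describes. An explicit charset in either place wins in both cases.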


leif halvard silli

Received on Monday, 1 June 2009 01:47:04 UTC