Re: Auto-detect and encodings in HTML5

On Mon, 01 Jun 2009 19:44:23 +0200, Larry Masinter <masinter@adobe.com>  
wrote:
> Chris, in your note below you claim that the "current de facto" value  
> was "Win1252" which seems to contradict what I thought was claimed in  
> another message that the "de facto" default was "unknown" (which was my  
> understanding, i.e., that browsers used a wide variety of heuristics to  
> determine charset).

If the heuristics fail, the final fallback is typically windows-1252. See  
also the section "Determining the character encoding" in HTML5.
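
To make the fallback concrete, here is a rough sketch of such a chain. It is  
not the algorithm from the HTML5 draft: the function name sniff_encoding, the  
order of the checks, and the 1024-byte prescan window are assumptions for  
illustration, and real browsers apply further heuristics before giving up.

```python
import codecs
import re
from typing import Optional

# Hypothetical helper, not taken from any browser or from the HTML5 draft.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_encoding(body: bytes, http_charset: Optional[str] = None) -> str:
    """Return a lowercase encoding label for an HTML byte stream."""
    # 1. An explicit charset from the HTTP Content-Type header (assumed to
    #    win in this simplified sketch).
    if http_charset:
        return http_charset.strip().lower()

    # 2. A byte order mark at the very start of the document.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    if body.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"

    # 3. A <meta> declaration near the top of the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. Nothing worked: the typical final fallback.
    return "windows-1252"
```

With no declaration at all, sniff_encoding(b"<p>hi</p>") ends up at step 4  
and returns "windows-1252".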


> I'm interested in reducing ambiguity and making web transactions more  
> reliable, and associating a new version indicator (DOCTYPE) with a more  
> constrained default (charset default UTF8, rather than 'unknown') is  
> reasonable, while I also would be opposed to making an incompatible  
> change with actual current behavior.

Isn't that contradictory?

If people want a better encoding, why can't they simply specify it along  
with the DOCTYPE? Or specify it at the HTTP level? Letting the DOCTYPE  
have more side effects than it already has seems harmful.


-- 
Anne van Kesteren
http://annevankesteren.nl/

Received on Monday, 1 June 2009 18:14:35 UTC