RE: Auto-detect and encodings in HTML5 from M.T. Carrasco Benitez on 2009-06-01 (public-html@w3.org from June 2009)

From: M.T. Carrasco Benitez <mtcarrascob@yahoo.com>
Date: Mon, 1 Jun 2009 13:33:56 -0700 (PDT)
To: Anne van Kesteren <annevk@opera.com>, Chris Wilson <Chris.Wilson@microsoft.com>, Maciej Stachowiak <mjs@apple.com>, Larry Masinter <masinter@adobe.com>
Cc: Travis Leithead <Travis.Leithead@microsoft.com>, Erik van der Poel <erikv@google.com>, "public-html@w3.org" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>, Richard Ishida <ishida@w3.org>, Ian Hickson <ian@hixie.ch>, Harley Rosnow <Harley.Rosnow@microsoft.com>
Message-ID: <28423.98140.qm@web32405.mail.mud.yahoo.com>

[Larry]
> I'm interested in reducing ambiguity and making web transactions more reliable

+1

> I also would be opposed to making an incompatible change with actual current behavior.

+1

> Yes, supplying explicit charset is preferable

More: one should move toward making it mandatory in the HTTP header. Anything else should be deprecated, but we live in an imperfect world ...

> New behavior: IF you see, say, <doctype html5> THEN assume default charset is UTF8 rather than applying heuristics to guess charset.

UTF8 should be the last option in a set of rules; e.g.,

 - Get it from the HTTP header
 - If not, get it from META
 - If not, ...
 - If not, UTF8
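
To make the ordering above concrete, here is a rough sketch in Python. It is only an illustration under assumptions: the function names, the regular expression for META, and the 1024-byte scan window are all mine and not taken from any spec or browser. The elided "..." step is deliberately left as a comment rather than filled in.

```python
import re
from email.message import Message
from typing import Optional

# Hypothetical, simplified pattern for a charset declared in a META element.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased name, or None


def charset_from_meta(body: bytes, window: int = 1024) -> Optional[str]:
    """Look for a META charset in the first `window` bytes (assumed limit)."""
    match = META_CHARSET.search(body[:window])
    return match.group(1).decode("ascii").lower() if match else None


def resolve_charset(content_type: Optional[str], body: bytes) -> str:
    """Apply the rules in order; UTF-8 is only the last resort."""
    return (
        charset_from_header(content_type)  # 1. HTTP header
        or charset_from_meta(body)         # 2. META
        # 3. "..." (further rules not specified in this message)
        or "utf-8"                         # 4. default
    )


print(resolve_charset("text/html; charset=ISO-8859-1", b"<meta charset='utf-8'>"))
print(resolve_charset("text/html", b'<meta charset="windows-1252">'))
print(resolve_charset(None, b"<p>no declaration</p>"))
```

Note that in this sketch the HTTP header wins over META when both are present, and UTF-8 is used only when nothing earlier in the chain yields an answer.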

Tomas

Received on Monday, 1 June 2009 20:34:34 UTC