
Re: Default charset in HTML5

From: Nick <halbtaxabo-temp4@yahoo.com>
Date: Thu, 9 Mar 2017 13:46:52 +0000 (UTC)
To: "Michael[tm] Smith" <mike@w3.org>, Nick <halbtaxabo-temp4@yahoo.com>
Cc: "www-validator@w3.org" <www-validator@w3.org>
Message-ID: <53672439.2120582.1489067212112@mail.yahoo.com>
>Michael[tm] Smith <mike@w3.org>:

>Nick <halbtaxabo-temp4@yahoo.com>, 2017-03-09 09:35 +0000:

>> Archived-At: <http://www.w3.org/mid/399303649.2033694.1489052134241@mail..yahoo.com>
>> 
>> If an HTML5 document doesn't specify a charset, the validator flags an error like this:
>> 
>> "Error: The character encoding was not declared. Proceeding using windows-1252"
>> 
>> and then proceeds to flag further errors ("Unmappable byte sequence")
>> when it encounters UTF-8 encodings of characters not in the windows-1252
>> set. Isn't UTF-8 the default for HTML5?

>It’s not the default if by that you mean you don’t need to declare it.

>Per the Encoding spec, conforming documents are required to both use UTF-8
>as their encoding and also are required to explicitly specify UTF-8 as the
>encoding—either using a Content-Type header or a <meta> element.

>So it’s non-conforming for a document to not declare an encoding, but
>browsers are still required to process documents that don’t declare one.

>And for legacy backward-compat, if a document doesn’t declare an encoding,
>then browsers are required to parse it using windows-1252 as the encoding.

>  —Mike

>-- 
>Michael[tm] Smith https://sideshowbarker.net/

"if a document doesn’t declare an encoding,
then browsers are required to parse it using windows-1252"

Really? Which current standards document says that?
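
For reference, here is a minimal Python sketch of the behaviour the validator message describes. It assumes Python's cp1252 codec is a reasonable stand-in for windows-1252; the helper name decode_as and the sample string are made up for illustration. Python's codec leaves five bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) unassigned, which is roughly what an "Unmappable byte sequence" report corresponds to. A browser's windows-1252 decoder may treat those bytes differently.

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, or describe the first bad byte."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte the codec could not map.
        return f"unmappable byte 0x{data[exc.start]:02X} at offset {exc.start}"


# Hypothetical page content: curly quotes, stored as UTF-8 bytes.
sample = "\u201cDefault\u201d".encode("utf-8")

# Decoded with the encoding the bytes were written in: round-trips cleanly.
print(decode_as(sample, "utf-8"))

# Decoded with the fallback encoding: the closing quote's UTF-8 bytes
# end in 0x9D, which Python's cp1252 table does not assign.
print(decode_as(sample, "cp1252"))

# Many other UTF-8 sequences do map, but to the wrong characters (mojibake).
print(decode_as("\u00e9".encode("utf-8"), "cp1252"))
```

So an undeclared UTF-8 document read as windows-1252 either fails on some bytes or silently turns into mojibake, depending on which characters it contains.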

Nick
Received on Thursday, 9 March 2017 13:48:20 UTC
