Re: [HTML5] 2.8 Character encodings from Bil Corry on 2009-08-03 (public-html-comments@w3.org from August 2009)

From: Bil Corry <bil@corry.biz>
Date: Mon, 03 Aug 2009 14:16:51 -0500
To: "Dr. Olaf Hoffmann" <Dr.O.Hoffmann@gmx.de>
CC: public-html-comments@w3.org
Message-ID: <4A7737A3.2060708@corry.biz>

Dr. Olaf Hoffmann wrote on 8/1/2009 10:08 AM: 
> Bil Corry:
>> Dr. Olaf Hoffmann wrote on 7/31/2009 1:10 PM:
>>> With the still open questions I mainly try to find out whether it is
>>> possible to specify, that a 'HTML5' document has an encoding like
>>> 'ISO-8859-1' and not 'Windows-1252'.
>> You do it the same way as you would for any character set, by specifying
>> the content encoding as ISO-8859-1.  Typically this is done via the
>> Content-Type header:
>>
>>  Content-Type: text/html; charset=ISO-8859-1
>>
>> That header means, "This HTML document is in the ISO-8859-1 character set."
>>  By inference, it also means that it isn't Windows-1252, or UTF-8, etc.
> 
> This I know, and it is true for formats other than 'HTML5'; the current draft
> of 'HTML5' has a specific rule that this means 'Windows-1252' and
> not 'ISO-8859-1' - and this seems to supersede what the server indicates,
> if a viewer is able to identify it as an 'HTML5' document.

I started to reply, but realized this thread is just going in circles.

At issue: you are claiming the HTML5 charset rules will create problems for authors -- can you provide some real-world examples?  I would be very interested to see some of your documents where your ISO-8859-1 encoding is broken by the HTML5 charset rules.
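
To make the question concrete: as far as I know, the two encodings only disagree on bytes 0x80 through 0x9F, which ISO-8859-1 maps to C1 control characters and Windows-1252 maps mostly to printable characters such as curly quotes and the euro sign.  Below is a minimal Python sketch of how one might check whether a given document contains any such bytes.  The names differs, is_affected and DIFFERING are mine and purely illustrative, and the sketch assumes Python's "cp1252" codec is an acceptable stand-in for Windows-1252 (it raises an error on the five bytes that Windows-1252 leaves unassigned, which I count as a difference here).

```python
def differs(byte_value: int) -> bool:
    """Return True if this byte decodes differently under the two encodings."""
    raw = bytes([byte_value])
    as_latin1 = raw.decode("iso-8859-1")  # always succeeds: one byte, one code point
    try:
        as_cp1252 = raw.decode("cp1252")
    except UnicodeDecodeError:
        # Python's cp1252 codec has no mapping for a few bytes; treat as a difference.
        return True
    return as_latin1 != as_cp1252


def is_affected(document: bytes) -> bool:
    """Return True if the raw bytes contain anything the two encodings disagree on."""
    return any(differs(b) for b in document)


# Every byte value on which the two decodings disagree.
DIFFERING = [b for b in range(256) if differs(b)]
```

A document served as ISO-8859-1 for which is_affected() returns False would be rendered identically under either rule, so the interesting real-world cases are the ones where it returns True.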



- Bil

Received on Monday, 3 August 2009 19:17:46 UTC