RE: several messages about handling encodings in HTML

From: Brian Smith <brian@briansmith.org>
Date: Fri, 29 Feb 2008 07:00:54 -0800
To: "'HTML WG'" <public-html@w3.org>
Message-ID: <004101c87ae3$e25e60e0$6401a8c0@T60>

> Section "8.2 Parsing HTML documents" is indeed exclusively 
> for user agent implementors and conformance checker
> implementors. For authors and authoring tool implementors,
> you want section "8.1 Writing HTML documents" and section
> "3.7.5.4. Specifying the document's character encoding"
> (which is linked to from 8.1). These give the flipside of
> these requirements, the authoring side.

* Section 8.1 says that any document may start with a BOM. However, some encodings do not allow a BOM at the beginning (UTF-16BE/UTF-16LE), and, obviously, some encodings cannot encode the BOM at all. The statement should be changed to say that a BOM is allowed only if the document's encoding allows it.

* 3.7.5.4 (The META element) is not the correct place to define encoding requirements for authors. It is counter-intuitive to have to look in the definition of the META element to find out that you can use the BOM or the Content-Type header to specify the encoding. The encoding requirements should be in section 8, and it should be emphasized that the encoding should be given in the Content-Type header (the "transport layer") whenever possible. The fact that the encoding is determined from the Content-Type first, then the BOM, then the XML declaration, and finally <META> is relevant for content authors as well as parser implementers.
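
To make the precedence order above concrete, here is a rough Python sketch. It is only an illustration, not the algorithm from the draft: the function name determine_encoding, the 1024-byte prescan window, and the regular expressions are all assumptions made for this example, and a real parser handles many more cases.

```python
import codecs
import re

# BOM byte sequences and the encoding label each one signals.
# A BOM-bearing UTF-16 stream is labeled plain "utf-16" here, since the
# UTF-16BE/UTF-16LE labels do not allow a leading BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]

# Illustrative patterns only; real headers and markup are messier.
CHARSET_PARAM = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)
XML_DECL = re.compile(
    rb'^<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def determine_encoding(content_type, body):
    """Return (encoding, source), checking sources in precedence order."""
    # 1. Transport layer: the charset parameter of the Content-Type header.
    if content_type:
        match = CHARSET_PARAM.search(content_type)
        if match:
            return match.group(1).lower(), "content-type"

    # 2. A byte order mark at the very start of the document.
    for bom, name in BOMS:
        if body.startswith(bom):
            return name, "bom"

    # Only look at the start of the document (assumed window size).
    head = body[:1024]

    # 3. The XML declaration, if the document starts with one.
    match = XML_DECL.search(head)
    if match:
        return match.group(1).decode("ascii").lower(), "xml-declaration"

    # 4. A <meta> element carrying a charset.
    match = META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii").lower(), "meta"

    return None, "unknown"
```

For example, determine_encoding("text/html; charset=UTF-8", body) reports the Content-Type value even when the body also carries a BOM or a <META> declaration, which is exactly why authors need to know about this ordering.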

- Brian
Received on Friday, 29 February 2008 15:01:11 GMT