
Re: autodetecting character encoding

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Sat, 08 Feb 2003 14:58:01 +0100
To: Nick Kew <nick@webthing.com>
Cc: Darin McGrew <mcgrew@stanfordalumni.org>, www-validator@w3.org
Message-ID: <3ebb09c7.297806643@smtp.bjoern.hoehrmann.de>

* Nick Kew wrote:
>But there's no requirement on HTML documents to start with those four
>bytes: they can be preceded by whitespace or an SGML comment.  Neither
>does HTML have a BOM to deal with multibyte character encodings, which
>I think is the key feature in XML that enables autodetection.

The BOM is allowed (but not required) in HTML; see HTML 4.01, section 5.2.1.
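
As a rough illustration of what BOM-based detection could look like, here is a minimal Python sketch. The function name sniff_bom and the set of encodings checked are assumptions made for this example and are not taken from the validator or from the HTML specification. A missing BOM tells you nothing, so a real detector would still have to fall back to the HTTP charset parameter or a META declaration.

```python
import codecs

# Hypothetical lookup table. Longer BOMs come first because the UTF-32 LE
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # No BOM: the encoding has to be determined some other way.
    return None, 0
```

Calling sniff_bom on the first few bytes of a document gives the encoding name and the number of bytes to skip before parsing, or (None, 0) when the document does not start with a BOM.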
Received on Saturday, 8 February 2003 08:57:35 UTC
