
Re: UTF-8 with Unicode line separator and BOM

From: Terje Bless <link@tss.no>
Date: Wed, 18 Oct 2000 22:39:15 +0200
To: W3C Validator <www-validator@w3.org>
cc: Masayasu Ishikawa <mimasa@w3.org>
Message-ID: <20001018224510-r01010600-9ed16454@>

On 18.10.00 at 02:47, Masayasu Ishikawa <mimasa@w3.org> wrote:

>Terje Bless <link@tss.no> wrote:
>>>That's why the validator correctly reports errors (apart from BOM).
>>So there isn't any reason that it should be barfing on the BOM?
>Actually this is a "crack" between the First Edition (REC-xml-19980210)
>and the Second Edition (REC-xml-20001006) of XML 1.0, IMHO.
>There was no mention of the BOM in UTF-8 [in REC-xml-19980210]. Appendix F
>of REC-xml-20001006, however, does mention the case when the BOM is used
>in UTF-8.  Appendix F was completely rewritten in REC-xml-20001006. So
>[...] the validator should not be barfing on the BOM.

IOW: SP has not yet been updated to recognize the BOM, as it was only really
standardized, let's see, two weeks ago. And since this is still version 1.0 of
XML, it's impossible to tell whether a document is written for "XML 1.0 First
Edition" or "XML 1.0 Second Edition", so you have to try sniffing for the
BOM in all XML 1.0 documents and -- until SP is updated (if it's ever
updated) -- manually suppress the error?
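
For what it's worth, the sniffing step could look roughly like the sketch
below. It is only an illustration in Python, not what the validator actually
does: the function name strip_utf8_bom is made up, and it assumes the raw
document bytes are available before they are handed to the parser.

```python
import codecs
from typing import Tuple


def strip_utf8_bom(data: bytes) -> Tuple[bytes, bool]:
    """Return (payload, had_bom) for bytes that may start with a UTF-8 BOM.

    Hypothetical helper: the UTF-8 BOM is the three bytes EF BB BF, which
    the standard library exposes as codecs.BOM_UTF8.
    """
    if data.startswith(codecs.BOM_UTF8):
        # Drop the BOM so a parser that does not know about it never sees it.
        return data[len(codecs.BOM_UTF8):], True
    return data, False


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + b'<?xml version="1.0"?><doc/>'
    payload, had_bom = strip_utf8_bom(raw)
    print(had_bom, payload)
```

The flag is returned alongside the payload so the caller can still note that
a BOM was present instead of silently discarding that information.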

As a cat owner, I know this for a fact...
Nothing says "I love you" like a decapitated gopher on your front porch.
Received on Wednesday, 18 October 2000 16:45:16 UTC
