Re: Should the UTF-8 BOM trump overriding via HTTP or by users?

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 9 Jun 2011 13:36:34 +0200
To: John Cowan <cowan@mercury.ccil.org>
Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, www-international <www-international@w3.org>
Message-ID: <20110609133634955663.8edb6d16@xn--mlform-iua.no>

Leif Halvard Silli, Wed, 8 Jun 2011 04:47:52 +0200:
> John Cowan, Tue, 7 Jun 2011 13:41:56 -0400:
>> Leif Halvard Silli scripsit:
>>> ]]
>>> In the interests of interoperability, however, the following rule is 
>>> recommended.
>>> 	*	If an XML entity is in a file, the Byte-Order Mark and encoding 
>>> declaration are used (if present) to determine the character encoding.
>>> [[
>> Did you paste the wrong quotation?  That explicitly refers to XML entities
>> in files; i.e. without HTTP metadata.
> The quote appears under the heading "F.2 Priorities in the Presence of 
> External Encoding Information". Perhaps section '2.11 End-of-Line 
> Handling' gives a hint, it says: "XML parsed entities are often stored 
> in computer files […]". Because, when a parsed file is stored, it has 
> to include encoding info, which this section suggest to reuse.

* To make it 100% clear: I do believe the above quote speaks about a 
served file. This interpretation is in fact supported by the discussion 
of Appendix F in RFC 3023: http://tools.ietf.org/html/rfc3023#section-3.2

* I have updated, and summarized, the findings so far in the bug: 
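
* For illustration only, the rule quoted above (Byte-Order Mark and 
encoding declaration, if present, determine the character encoding) 
could be sketched roughly as follows. The function name 
sniff_xml_encoding and its "external" parameter are my own hypothetical 
names, not anything from the XML spec or RFC 3023. The sketch assumes 
that in-document information wins over an externally supplied label, 
which is the very point under discussion in this thread, and it only 
reads declarations written in an ASCII-compatible encoding.

```python
import codecs
import re
from typing import Optional

# BOM table; the UTF-32 entries come first because the UTF-32 LE BOM
# starts with the same two bytes (FF FE) as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." inside an XML declaration at the very start
# of the data (ASCII-compatible encodings only).
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes, external: Optional[str] = None) -> str:
    """Hypothetical helper: guess the encoding of an XML entity.

    Assumed precedence: BOM, then encoding declaration, then the
    externally supplied label (e.g. an HTTP charset), then UTF-8.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    if external:
        return external.lower()
    return "utf-8"
```

With this ordering, a file that starts with EF BB BF is reported as 
UTF-8 even when the "external" argument says otherwise. Swapping the 
order of the checks would model the opposite reading, where the HTTP 
label overrides the BOM.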

Leif Halvard Silli
Received on Thursday, 9 June 2011 11:37:04 UTC
