Re: Should the UTF-8 BOM trump overriding via HTTP or by users? from Leif Halvard Silli on 2011-06-09 (www-international@w3.org from April to June 2011)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 9 Jun 2011 13:36:34 +0200
To: John Cowan <cowan@mercury.ccil.org>
Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, www-international <www-international@w3.org>
Message-ID: <20110609133634955663.8edb6d16@xn--mlform-iua.no>

Leif Halvard Silli, Wed, 8 Jun 2011 04:47:52 +0200:
> John Cowan, Tue, 7 Jun 2011 13:41:56 -0400:
>> Leif Halvard Silli scripsit:
> 
>>> ]]
>>> In the interests of interoperability, however, the following rule is 
>>> recommended.
>>>  * If an XML entity is in a file, the Byte-Order Mark and encoding 
>>> declaration are used (if present) to determine the character encoding.
>>> [[
> 
>> Did you paste the wrong quotation?  That explicitly refers to XML entities
>> in files; i.e. without HTTP metadata.
> 
> The quote appears under the heading "F.2 Priorities in the Presence of 
> External Encoding Information". Perhaps section '2.11 End-of-Line 
> Handling' gives a hint, it says: "XML parsed entities are often stored 
> in computer files […]". Because, when a parsed file is stored, it has 
> to include encoding info, which this section suggests reusing.

* To make it 100% clear: I do believe the above quote speaks about a 
served file. This interpretation is in fact supported by the discussion 
of Appendix F in RFC 3023: http://tools.ietf.org/html/rfc3023#section-3.2
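
For illustration only, here is a rough Python sketch of the quoted 
rule for files: look at the Byte-Order Mark and the encoding 
declaration, if present, and nothing else. The function name 
sniff_file_encoding is hypothetical. The sketch also makes assumptions 
that the quoted text does not spell out: the BOM is consulted before 
the declaration, only UTF-8 and UTF-16 BOMs are recognized, and the 
declaration is only found when the bytes are ASCII-compatible.

```python
import codecs
import re

# BOM signatures this sketch recognizes (assumption: UTF-8 and UTF-16 only).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." inside an XML declaration at the very start of the
# data. Assumption: the bytes are ASCII-compatible, so this will not find a
# declaration in, say, BOM-less UTF-16 or EBCDIC input.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_file_encoding(data: bytes) -> str:
    """Guess the encoding of an XML entity stored in a file.

    Only in-band information is used: the BOM, then the encoding
    declaration. Checking the BOM first is an assumption of this sketch.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # XML 1.0 treats an entity with neither BOM nor declaration (and no
    # external encoding information) as UTF-8.
    return "utf-8"
```

Whether a result like this should also win over an HTTP charset 
parameter or a user override is the open question of this thread; the 
sketch deliberately takes no position on that.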


* I have updated, and summarized, the findings so far in the bug: 
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12897#c10

-- 
Leif Halvard Silli

Received on Thursday, 9 June 2011 11:37:04 UTC