- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Tue, 7 Jun 2011 16:40:05 +0200
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, www-international <www-international@w3.org>
John Cowan, Tue, 7 Jun 2011 01:52:54 -0400:
> Bjoern Hoehrmann scripsit:
>
>> Anyone who wants the BOM to take precedence over the HTTP Content-Type
>> header, or the charset parameter within it, is welcome to make an I-D
>> to that effect that updates RFC 2616 and RFC 4288 and possibly others.
>> Trying to sneak in such changes through backdoors is unacceptable. So,
>> if "HTML5" has rules as you suggest, that is most likely an error.
>
> I fully expect that by 2017 HTML5 will have defined its own version of
> Unicode, its own version of MIME, its own version of HTTP, and its own
> version of TCP/IP. Compatibility with anything else will no longer be
> an issue.

Your exegesis of what XML 1.0 says on second guessing (note the fragment
URI) when there is external encoding info would be very welcome: [1]

]]
F.2 Priorities in the Presence of External Encoding Information

The second possible case occurs when the XML entity is accompanied by
encoding information, as in some file systems and some network
protocols. When multiple sources of information are available, their
relative priority and the preferred method of handling conflict should
be specified as part of the higher-level protocol used to deliver XML.
In particular, please refer to [IETF RFC 3023] or its successor, which
defines the text/xml and application/xml MIME types and provides some
useful guidance. In the interests of interoperability, however, the
following rule is recommended.

  * If an XML entity is in a file, the Byte-Order Mark and encoding
    declaration are used (if present) to determine the character
    encoding.
[[

[1] http://www.w3.org/TR/xml/#sec-guessing-with-ext-info
-- 
Leif Halvard Silli
Received on Tuesday, 7 June 2011 14:40:46 UTC