Re: Feedback about the BOM article

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Tue, 11 Dec 2012 05:52:47 +0100
To: www-international@w3.org
Cc: Henri Sivonen <hsivonen@iki.fi>
Message-id: <20121211055247257452.2df7c26f@xn--mlform-iua.no>
Henri made me think that the article should also discuss default 
behaviour and redundancy. See below.

Henri Sivonen, Mon, 10 Dec 2012 18:16:11 +0200:
> “However, bear in mind that it is always a good idea to declare the
> encoding of your page using the meta element, in addition to the BOM,
> so that the encoding is apparent to people visually inspecting the
> file. ”
> 
> I disagree: Either the <meta> declaration is redundant or it is wrong
> and misleads a person who is inspecting the file.

So, FIRSTLY, I believe Henri was saying here that the BOM article 
should change the *justification* for recommending the <meta>: 
Inspection is a weak reason (it smacks of pedantry and "pedagogics"), 
whereas redundancy is a real issue, especially in an article that 
speaks so much about deleting the BOM: once the BOM is deleted and no 
other encoding declaration remains, HTML parsers are permitted to fall 
back to a non-UTF-8 legacy encoding. Inspection could be mentioned, 
but only as a secondary reason.
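
To illustrate the redundancy point, here is a minimal Python sketch of 
what a document still declares about itself once the BOM is gone. The 
function name, the 1024-byte window and the regular expression are my 
own simplifications for illustration; this is not the prescan 
algorithm that HTML parsers actually implement.

```python
import codecs
import re

# Simplified pattern for <meta charset=...> and the http-equiv form.
# Real HTML parsers use a more involved prescan; this is only a sketch.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)

def in_document_encoding(data: bytes):
    """Return the encoding the document itself declares, or None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    match = META_CHARSET.search(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # Nothing declared in the document: without outside information
    # (e.g. HTTP), a parser may fall back to a legacy encoding.
    return None

page = b"<!DOCTYPE html><html><head><title>x</title></head></html>"
with_bom = codecs.BOM_UTF8 + page
with_meta = page.replace(b"<head>", b'<head><meta charset="utf-8">')
with_both = codecs.BOM_UTF8 + with_meta
```

With both the BOM and the <meta> present, deleting the BOM from 
`with_both` leaves `with_meta`, which still declares UTF-8; deleting 
it from `with_bom` leaves `page`, which declares nothing.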

Wikipedia about redundancy: [1]

   ]] In engineering, redundancy is the duplication of critical
      components or functions of a system with the intention of
      increasing reliability of the system, usually in the case
      of a backup or fail-safe. [[

SECONDLY, in this context, the article should also explain that XML 
has this redundancy built into the format itself, since XML parsers 
default to UTF-8: "In the absence of [other] information […] it is a 
fatal error […] for an entity which begins with neither a Byte Order 
Mark nor an encoding declaration to use an encoding other than 
UTF-8". [2]
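
The XML default can be observed with a small Python sketch using the 
standard library's xml.etree.ElementTree (which is backed by expat). 
The helper name `try_parse` and the sample documents are hypothetical, 
chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def try_parse(data: bytes):
    """Return the root element's text, or None on a parse error."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None

text = "<p>\u00e9</p>"  # "é" inside a single element

# No BOM, no encoding declaration: the parser assumes UTF-8.
utf8_doc = text.encode("utf-8")

# Same text as ISO-8859-1, still undeclared: not valid UTF-8 input.
latin1_doc = text.encode("iso-8859-1")

# ISO-8859-1 with an explicit encoding declaration.
latin1_declared = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>' + latin1_doc
)
```

Here `try_parse(utf8_doc)` and `try_parse(latin1_declared)` should 
both yield the text, while `try_parse(latin1_doc)` should fail, since 
the undeclared bytes are not valid UTF-8.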

SUMMARY: I propose that the article recommend the META element as a 
redundancy measure for HTML documents, as well as for XHTML documents 
in case they get read as HTML. Inspection might be mentioned, but only 
as a secondary reason. And the fact that UTF-8 is the default encoding 
of XML files should be mentioned.

[1] http://en.wikipedia.org/wiki/Redundancy_(engineering)
[2] http://www.w3.org/TR/REC-xml/#charencoding

-- 
leif halvard silli
Received on Tuesday, 11 December 2012 05:53:41 GMT