
RE: For review: Character encodings in HTML and CSS

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 11 Feb 2010 10:53:14 +0100
To: CE Whitehead <cewcathar@hotmail.com>
Cc: ishida@w3.org, www-international@w3.org
Message-ID: <20100211105314214595.58ed6e5d@xn--mlform-iua.no>
CE Whitehead, Wed, 10 Feb 2010 17:20:04 -0500:
> Also regarding the notepad BOM, is there anyway to get that thing out 
> with an escape sequence, has anyone discovered that--
> or maybe I could take it out by re-editing the file in word at the 
> very end???
> and then saving as a utf-8 text file??

The NCR for the BOM is '&#xfeff;'. Whether it would work is another 
matter. Probably not, because an NCR is ordinary markup text and, unlike 
the BOM bytes at the very start of a file, does not indicate any 
encoding.  But anyhow: if you try to validate such a document, you will 
see that it is not valid to type '&#xfeff;' (or any other NCR) before 
the !DOCTYPE declaration. 
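
To make the difference between the BOM bytes and the NCR concrete, here 
is a minimal Python sketch. The helper name strip_utf8_bom is 
hypothetical (it is not from this thread), and the sketch assumes the 
file really is UTF-8; it simply drops the three BOM bytes if they are 
present, which is one way to get rid of what Notepad adds.

```python
import codecs

# The UTF-8 BOM is the byte sequence EF BB BF, i.e. U+FEFF encoded as UTF-8.
UTF8_BOM = codecs.BOM_UTF8


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (hypothetical helper)."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


# A file as Notepad might save it: BOM bytes followed by the markup.
with_bom = UTF8_BOM + b"<!DOCTYPE html><title>x</title>"

# The NCR is eight ASCII characters of markup, not the BOM bytes,
# so stripping leaves it untouched.
ncr = b"&#xfeff;"
```

When reading text rather than bytes, Python's "utf-8-sig" codec skips a 
leading BOM on decoding, so open(path, encoding="utf-8-sig") is an 
alternative to stripping the bytes by hand.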

> Can one declare all character sets used in a document in the http header?

Did you mean "any" and not "all"? Did you mean "charset" (singular) and 
not "character sets"? 

An HTML file can declare only one encoding - referred to in HTML code 
and HTTP headers as "charset".  When you use the META element to define 
the encoding/charset (or "encoding char(acter )set", as I would call 
it), you are in fact using HTTP vocabulary directly in HTML - note 
the term http-equiv:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> 

So, yes, HTTP can declare any encoding charset that an HTML document 
could possibly have - but only one per document. (Note that HTML5 
proposes '<meta charset="utf-8">' as a less HTTP-ish way to define the 
encoding charset - see Richard's article ...)
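
Since the META element carries the same value as the HTTP header, both 
can be read with the same logic. The following is only a sketch using 
the Python standard library; the names charset_from_content_type and 
MetaCharsetParser are made up for illustration, and the parser only 
looks at the first META declaration it finds.

```python
from email.message import Message
from html.parser import HTMLParser


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the parameter lowercased, or None.
    return msg.get_content_charset()


class MetaCharsetParser(HTMLParser):
    """Record the first charset declared by a META element (illustrative)."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 style: <meta charset="utf-8">
            self.charset = attributes["charset"].lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTTP vocabulary in HTML: same syntax as the HTTP header value.
            self.charset = charset_from_content_type(attributes.get("content"))
```

Either way there is a single result per document: 
charset_from_content_type("text/html; charset=utf-8") and a parser fed 
the META element above both yield "utf-8".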

Richard, perhaps you should point out, if you haven't done so already, 
that an HTML/XML document only has one encoding.

leif halvard silli
Received on Thursday, 11 February 2010 09:53:48 UTC
