- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 11 Feb 2010 10:53:14 +0100
- To: CE Whitehead <cewcathar@hotmail.com>
- Cc: ishida@w3.org, www-international@w3.org
CE Whitehead, Wed, 10 Feb 2010 17:20:04 -0500:

> Also regarding the notepad BOM, is there anyway to get that thing out
> with an escape sequence, has anyone discovered that--
> or maybe I could take it out by re-editing the file in word at the
> very end???
> and then saving as a utf-8 text file??

The NCR for BOM is '&#xFEFF;'. One thing is whether it would work. Probably not, because when you use NCRs then you don't indicate any encoding. But anyhow: if you try to validate such a document, then you will see that it is not valid to type '&#xFEFF;' (or any other NCR) before the !DOCTYPE declaration.

> OUT of CURIOSITY
>
> Can one declare all character sets used in a document in the http header?

Did you mean "any" and not "all"? Did you mean "charset" (singular) and not "character sets"?

A HTML file can only declare one encoding - referred to in HTML code and HTTP headers as "charset". When you use the META element to define the encoding/charset (or "encoding char(aracter )set", as I would call it), then you are in fact using HTTP vocabulary directly in HTML - note the term http-equiv:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

So, yes, HTTP can declare any encoding charset that HTML documents could possibly have - which is only one per document. (Note that HTML5 proposes "<meta charset="utf-8">" as a less HTTP-ish way to define the encoding charset - see Richard's article ...)

Richard, perhaps you should point out, if you haven't done so already, that a HTML/XML document only has one encoding.
--
leif halvard silli
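
On the question of getting the Notepad BOM out of a file: since an escape sequence will not do it, one option is to strip the three BOM bytes (EF BB BF) from the start of the saved file. The following is only a minimal sketch in Python, not something proposed in this thread; the function names `strip_utf8_bom` and `strip_bom_from_file` are made up for illustration, and it assumes the file really is UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(path: str) -> bool:
    """Rewrite the file at path without its BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched. Hypothetical helper, assumes a UTF-8 file.
    """
    with open(path, "rb") as f:
        original = f.read()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        with open(path, "wb") as f:
            f.write(cleaned)
        return True
    return False
```

Calling `strip_bom_from_file` on a page saved from Notepad should leave the !DOCTYPE declaration as the very first bytes of the file; the encoding then still has to be declared through the HTTP header or the META element.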
Received on Thursday, 11 February 2010 09:53:48 UTC