
RE: For review: Character encodings in HTML and CSS

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Fri, 12 Feb 2010 03:24:45 +0100
To: CE Whitehead <cewcathar@hotmail.com>
Cc: xn--mlform-iua@xn--mlform-iua.no, ishida@w3.org, www-international@w3.org
Message-ID: <20100212032445260985.c4cc6b45@xn--mlform-iua.no>
CE Whitehead, Thu, 11 Feb 2010 14:46:47 -0500:
>> Date: Thu, 11 Feb 2010 10:53:14 +0100

> However, according to
> http://www.w3.org/International/O-HTTP-charset
>  
> it's possible to set the header simultaneously for several documents
> which might have very different character encodings (I mean charsets here).

An HTTP header belongs to one "serve" (one response) of a document, 
according to my understanding. But you can tell the server to use a 
default charset for all documents with the same suffix. I could not see 
anything else about "several documents" on that page.
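
To illustrate the per-suffix default, here is a minimal sketch in 
Python. The two mapping tables and the function name content_type_for 
are hypothetical; a real server would read such defaults from its own 
configuration. The point is only that each response still gets exactly 
one Content-Type header, chosen by suffix.

```python
import os

# Hypothetical per-suffix defaults (assumption, not taken from any real server).
MEDIA_TYPE_BY_SUFFIX = {".html": "text/html", ".txt": "text/plain"}
DEFAULT_CHARSET_BY_SUFFIX = {".html": "UTF-8", ".txt": "ISO-8859-1"}


def content_type_for(path):
    """Return the Content-Type header value for one response."""
    suffix = os.path.splitext(path)[1].lower()
    media_type = MEDIA_TYPE_BY_SUFFIX.get(suffix, "application/octet-stream")
    charset = DEFAULT_CHARSET_BY_SUFFIX.get(suffix)
    if charset is None:
        # No default configured for this suffix: send no charset parameter.
        return media_type
    return f"{media_type}; charset={charset}"
```

So two documents with different suffixes can get different charsets, 
but each one is still labeled with a single charset per response.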

> Also
> 
> when I go to test my http header declarations, I get the following:
> 
> "Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7[CRLF]"
> 
> that's two charsets, right?

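No. Those are just two charset *labels* (plus a wildcard): a list of 
preferences, not a statement about how any document is encoded. 
Secondly, Accept-Charset is a request header which your web browser 
sends to the web server, so that the server can serve you what your 
browser accepts/prefers.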
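
As a rough sketch of how a server might read that header, the Python 
below splits the value into (label, q) pairs. The function name 
parse_accept_charset is my own, and the parsing is simplified: it 
assumes well-formed input and treats a missing q as 1.0.

```python
def parse_accept_charset(value):
    """Split an Accept-Charset value into (label, q) pairs, best first."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        label = parts[0]
        if not label:
            continue
        q = 1.0  # a label without an explicit q counts as fully preferred
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val.strip())
        prefs.append((label, q))
    # sorted() is stable, so labels with equal q keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


prefs = parse_accept_charset("ISO-8859-1,UTF-8;q=0.7,*;q=0.7")
```

For the header quoted above, this yields ISO-8859-1 with q=1.0, then 
UTF-8 and the wildcard with q=0.7 each.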

> Elsewhere, I've seen it recommended that I encode my 
> documents as ansi, and then just use the Latin-1 char set (ISO 
> 8859-1) with no escapes (assuming I can do this),

ISO-8859-1 apparently is synonymous with ANSI/Windows-1252 on the Web - 
but I don't know the details.
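
For what it is worth, the difference can be checked with Python's own 
codecs, assuming its "iso-8859-1" and "cp1252" tables match the 
published mappings. The helper name differing_bytes is hypothetical. 
Bytes that cp1252 leaves undefined are decoded with errors="replace" 
so the loop does not stop on them.

```python
def differing_bytes():
    """List the byte values that decode differently in the two encodings."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("iso-8859-1")
        # A few bytes are undefined in cp1252; "replace" maps them to U+FFFD.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diffs.append(value)
    return diffs


diffs = differing_bytes()
```

The two only disagree in the range 0x80 to 0x9F, where ISO-8859-1 has 
control characters and Windows-1252 has printable ones such as the 
euro sign and curly quotes.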

> and then declare my  encoding as utf-8 anyway. Will this work out o.k.? 

If your document actually only contains ASCII characters - or if all 
non-ASCII characters are escaped, then it should work. But I don't see 
why it should work if it has unescaped non-ASCII characters, unless 
there were a mislabeling going on ...
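
A small Python sketch of the three cases, with made-up sample strings: 
each is saved as ISO-8859-1 and then read back as if it were UTF-8, 
which is what a UTF-8 label asks the browser to do.

```python
def decodes_as_utf8(data):
    """Return True if the bytes are valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample strings are assumptions for illustration only.
ascii_only = "cafe".encode("iso-8859-1")
escaped = "caf&#233;".encode("iso-8859-1")    # non-ASCII written as an escape
unescaped = "caf\u00e9".encode("iso-8859-1")  # raw e-acute, one byte: 0xE9

results = [decodes_as_utf8(b) for b in (ascii_only, escaped, unescaped)]
```

The first two are pure ASCII bytes and so are valid UTF-8 as well. The 
third contains the lone byte 0xE9, which is not a valid UTF-8 sequence 
on its own.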

> It will certainly eliminate the BOM in my files!

The BOM is not a requirement for HTML documents. But I'm not into those 
details either.
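
If the BOM itself is the worry, it can be removed from a UTF-8 file 
without changing the encoding. A minimal sketch in Python, where the 
function name strip_utf8_bom is hypothetical:

```python
import codecs


def strip_utf8_bom(data):
    """Drop a leading UTF-8 BOM (bytes EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Most editors also offer a "UTF-8 without BOM" save option, which has 
the same effect.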

> (I can give the source if you think  this is a practice to recommend.)

I recommend actually saving the document as UTF-8 if you want to label 
it as UTF-8.
-- 
leif halvard silli
Received on Friday, 12 February 2010 02:25:19 GMT
