RE: For review: Character encodings in HTML and CSS

CE Whitehead, Thu, 11 Feb 2010 14:46:47 -0500:
>> Date: Thu, 11 Feb 2010 10:53:14 +0100

> However, according to
> http://www.w3.org/International/O-HTTP-charset
>  
> it's possible to set the header simultaneously for several documents
> which might have very different character encodings (I mean charsets here).

An HTTP header belongs to one "serve" of a document (a single 
response), according to my understanding. But you can tell the server 
to use a default charset for all documents with the same suffix. I 
could not see anything else about "several documents" on that page.
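
To illustrate what I mean by a per-suffix default, here is a rough 
Python sketch. The two tables and the name content_type_for are made up 
for the example; a real server does this through its own configuration, 
not through code like this. The point is only that each response still 
gets its own header, filled in from a default that depends on the suffix.

```python
import os

# Hypothetical per-suffix defaults, as a server administrator might set them.
DEFAULT_CHARSETS = {
    ".html": "utf-8",
    ".txt": "iso-8859-1",
}

# Hypothetical suffix-to-media-type table for the example.
MEDIA_TYPES = {
    ".html": "text/html",
    ".txt": "text/plain",
}


def content_type_for(path):
    """Build the Content-Type header value for one response."""
    suffix = os.path.splitext(path)[1].lower()
    media_type = MEDIA_TYPES.get(suffix, "application/octet-stream")
    charset = DEFAULT_CHARSETS.get(suffix)
    if charset is None:
        # No default configured for this suffix: send no charset parameter.
        return media_type
    return f"{media_type}; charset={charset}"
```

Two documents with different suffixes can thus be served with different 
charsets, but each one only ever gets a single charset parameter.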

> Also
> 
> when I go to test my http header declarations, I get the following:
> 
> "Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7[CRLF]"
> 
> that's two charsets, right?

No. Those are just two charset *labels*. Besides, Accept-Charset is a 
request header: your web browser sends it to the web server, so that 
the server can serve you what your browser accepts/prefers.
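
As a sketch of how such a header value can be read, assuming the usual 
comma-separated "label;q=weight" syntax, the following Python splits it 
into labels and weights. The function name parse_accept_charset is my 
own, and the parsing is deliberately simplistic.

```python
def parse_accept_charset(value):
    """Split an Accept-Charset value into (label, q) pairs."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        label, _, params = item.partition(";")
        q = 1.0  # a label without an explicit q counts as q=1
        for param in params.split(";"):
            name, _, number = param.strip().partition("=")
            if name.strip().lower() == "q" and number:
                q = float(number)
        result.append((label.strip(), q))
    return result


# The header value quoted above, without the header name and the CRLF.
header = "ISO-8859-1,UTF-8;q=0.7,*;q=0.7"
preferences = parse_accept_charset(header)
```

Here preferences holds three entries: ISO-8859-1 with weight 1.0, and 
UTF-8 and the "*" wildcard with weight 0.7 each. These are the browser's 
preferences, not a statement about how any document is encoded.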

> Elsewhere, I've seen it recommended elsewhere that I encode my 
> documents as ansi, and then just use the Latin-1 char set (ISO 
> 8859-1) with no escapes (assuming I can do this),

On the Web, the label ISO-8859-1 apparently is treated as synonymous 
with ANSI/Windows-1252 - but I don't know the details.
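
For what it is worth, the two encodings agree on most bytes and differ 
mainly in the range 0x80-0x9F. The sketch below uses Python's own codecs 
to show this; it says nothing about what any particular browser does 
with the label.

```python
# Three bytes from the range where the two encodings differ.
data = bytes([0x80, 0x93, 0x94])

as_latin1 = data.decode("iso-8859-1")    # C1 control characters
as_cp1252 = data.decode("windows-1252")  # euro sign and curly quotes

# From 0xA0 upwards, both encodings map bytes to the same characters.
upper = bytes(range(0xA0, 0x100))
same_upper_range = upper.decode("iso-8859-1") == upper.decode("windows-1252")
```

So a document that uses the euro sign or curly quotes as single bytes is 
really Windows-1252, whatever the label says.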

> and then declare my  encoding as utf-8 anyway. Will this work out o.k.? 

If your document actually only contains ASCII characters - or if all 
non-ASCII characters are escaped, then it should work. But I don't see 
why it should work if it has unescaped non-ASCII characters, unless 
there were a mislabeling going on ...
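
A small Python check of that reasoning, with made-up sample strings: 
bytes saved as Latin-1 are tested for whether they also decode as UTF-8. 
The helper name decodes_as_utf8 is only for the example.

```python
def decodes_as_utf8(data):
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# All three samples are saved as ISO-8859-1.
ascii_only = "cafe".encode("iso-8859-1")
escaped = "caf&#233;".encode("iso-8859-1")    # non-ASCII written as an escape
unescaped = "caf\u00e9".encode("iso-8859-1")  # raw e-acute, the single byte 0xE9

results = [decodes_as_utf8(b) for b in (ascii_only, escaped, unescaped)]
```

The first two samples contain only ASCII bytes, which are the same in 
both encodings, so labeling them UTF-8 does no harm. The third is not 
valid UTF-8, and a browser told to expect UTF-8 would typically show a 
replacement character instead of the accented letter.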

> It will certainly eliminate the BOM in my files!

A BOM is not a requirement for HTML documents. But I'm not into those 
details either.
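
As far as I can tell, you do not need to switch to Latin-1 to get rid 
of it: a file can be UTF-8 without a BOM. A minimal Python sketch, with 
a made-up sample string and a helper name (strip_utf8_bom) of my own:

```python
import codecs

text = "<!DOCTYPE html><title>x</title>"

with_bom = text.encode("utf-8-sig")  # UTF-8 preceded by the bytes EF BB BF
without_bom = text.encode("utf-8")   # plain UTF-8, no BOM


def strip_utf8_bom(data):
    """Remove a leading UTF-8 BOM if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Stripping the BOM from with_bom gives exactly the bytes of without_bom, 
and both are still UTF-8.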

> (I can give the source if you think  this is a practice to recommend.)

I recommend actually encoding the document as UTF-8 if you want to 
label it as UTF-8.
-- 
leif halvard silli

Received on Friday, 12 February 2010 02:25:19 UTC