RE: Character set question

From: Liam Quinn <liam@htmlhelp.com>
Date: Wed, 7 Mar 2001 16:26:53 -0500 (EST)
To: Thanasis Kinias <tkinias@asu.edu>
cc: <www-validator@w3.org>
Message-ID: <Pine.LNX.4.30.0103071611230.1146-100000@localhost.localdomain>
On Wed, 7 Mar 2001, Thanasis Kinias wrote:

> If one is only using ASCII characters and the server is sending a charset
> value in the header Content-Type field (whether it's sending UTF-8, Latin-1,
> or Windows 1252), all is OK vis-à-vis the standards - unless I'm really
> misunderstanding "may" in the recommendation.

No, you're not misunderstanding the recommendation.
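
A rough sketch of why the label doesn't matter for pure-ASCII content: the bytes decode to the same text under all three charsets mentioned above. The helper name decodes_identically is made up for illustration, and the codec names are the ones Python's standard library uses, not something taken from the recommendation.

```python
def decodes_identically(data, charsets=("utf-8", "iso-8859-1", "windows-1252")):
    """Return True if the bytes decode to the same text under every charset.

    Hypothetical helper for illustration only.
    """
    decoded = set()
    for charset in charsets:
        try:
            decoded.add(data.decode(charset))
        except UnicodeDecodeError:
            # The bytes are not even valid in this charset, so the
            # label clearly matters for this content.
            return False
    return len(decoded) == 1


ascii_only = b"Character set question"
latin1_text = "vis-\u00e0-vis".encode("iso-8859-1")  # contains a non-ASCII byte

print(decodes_identically(ascii_only))
print(decodes_identically(latin1_text))
```

Once a byte outside the ASCII range shows up, the check fails, which is the point at which the declared charset has to match the actual encoding.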

> At any rate, there isn't a compelling reason _not_ to specify with a <meta>.

It's not too severe a problem, but the "Netscape charset burp" [1] is
enough reason for me to avoid specifying the charset with a <meta> tag, as
long as I can specify the charset in the HTTP header.

[1] http://ppewww.ph.gla.ac.uk/%7Eflavell/charset/ns-burp.html
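
For illustration, a minimal sketch of what "specifying the charset in the HTTP header" amounts to: the charset travels as a parameter on the Content-Type field. The function names here are hypothetical, and the parsing leans on Python's standard email.message module rather than on any particular server's configuration.

```python
from email.message import Message


def content_type_header(media_type="text/html", charset="ISO-8859-1"):
    """Build a Content-Type field value with an explicit charset parameter."""
    return f"{media_type}; charset={charset}"


def charset_from_header(value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


header = content_type_header()
print("Content-Type:", header)
print(charset_from_header(header))
print(charset_from_header("text/html"))  # no charset parameter present
```

How the server is actually told to send this field depends on the server software, so that part is left out.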

> Liam also wrote (in response to Bertilo):
> > But it will cause links containing "#" to fail in IE4 for Windows.  So
> > ISO-8859-1 is still preferred when you don't need characters outside
> > ISO-8859-1.
> That's _bizarre_, but I guess not altogether surprising.  That answers the
> question I guess.  Is that also a problem with XHTML docs with implicit
> (default) UTF-8 encoding?

I can't say, as I haven't tested this.

> On this subject, must one then specify a charset with XHTML docs served as
> text/html, even if it is the default UTF-8?

According to the standard, I would say no, since XHTML is XML.  But if
you're serving your XHTML as text/html, then I assume you're concerned
about HTML compatibility, in which case you'd want to specify the charset
no matter what it is.  (Appendix C of the XHTML 1.0 Recommendation
addresses this, but it's not normative.)

Liam Quinn
Received on Wednesday, 7 March 2001 16:26:21 UTC