Re: Fallback to UTF-8

From: Nikita The Spider The Spider <nikitathespider@gmail.com>
Date: Thu, 1 May 2008 13:56:48 -0400
Message-ID: <35e76ac10805011056o6be6633dtc7eed6fdde5c817e@mail.gmail.com>
To: "W3C Validator Community" <www-validator@w3.org>

On Sun, Apr 27, 2008 at 9:43 PM, olivier Thereaux <ot@w3.org> wrote:
> http://www.ietf.org/rfc/rfc2854.txt
> " Section 3.7.1, defines that "media subtypes of the 'text' type are
> defined to have a default charset value of 'ISO-8859-1'"."
> (ditto RFC 2616)
> This is the inconsistency at the core of the issue, isn't it.

I agree, and I'm surprised that this topic hasn't received more
attention in this debate, because it seems like a source for a
definitive (if unpopular) choice for a default encoding when one isn't
specified. Jukka asked, "...when validating something assumed to be
HTML, shouldn't HTML specs trump any other spec?" Jukka, I suppose you
meant this rhetorically, but it isn't clear to me that the answer is
yes. If the answer is no (HTML should not be able to trump HTTP), then
as I see it the validator has no choice other than to obey the HTTP
spec (for "text/" documents, anyway). If the answer is yes, well, then
one has the current debate.

It's an interesting data point that RFC 2376 "XML Media Types" also
feels that it can override the HTTP spec. It says, "This example shows
text/xml with the charset parameter omitted. In this case, MIME and
XML processors must assume the charset is "us-ascii", the default
charset value for text media types specified in [RFC-2046]. The
default of "us-ascii" holds even if the text/xml entity is transported
using HTTP."
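
To make the conflict concrete, here is a minimal Python sketch of the two defaults quoted in this thread. The function name, the constants and the precedence order (explicit charset first, then the RFC 2376 rule for text/xml, then the RFC 2616 rule for other text types) are assumptions made for illustration; this is not how the validator actually decides.

```python
from email.message import Message

# Defaults as quoted in this thread; the constant names are illustrative.
HTTP_TEXT_DEFAULT = "iso-8859-1"  # RFC 2616 section 3.7.1, text/* over HTTP
XML_TEXT_DEFAULT = "us-ascii"     # RFC 2376, text/xml with charset omitted


def assumed_charset(content_type_header):
    """Return (charset, source) for a Content-Type header value.

    Hypothetical helper: an explicit charset parameter always wins,
    otherwise the spec default for the media type is reported.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header

    explicit = msg.get_content_charset()  # lowercased, or None if absent
    if explicit is not None:
        return explicit, "charset parameter"

    if msg.get_content_type() == "text/xml":
        # RFC 2376 says this default holds even when transported over HTTP.
        return XML_TEXT_DEFAULT, "RFC 2376"

    if msg.get_content_maintype() == "text":
        return HTTP_TEXT_DEFAULT, "RFC 2616"

    return None, "no default defined"


for header in ("text/html", "text/xml", "text/html; charset=UTF-8"):
    print(header, "->", assumed_charset(header))
```

Under these assumptions, text/html and text/xml served with identical headers end up with different assumed encodings, which is exactly the kind of inconsistency being discussed.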

Is there an RFC about how RFCs interrelate?

Whole-site HTML validation, link checking and more
Received on Thursday, 1 May 2008 18:23:25 UTC
