Re: Requesting a revision of RFC3023

From: MURATA Makoto <murata@hokkaido.email.ne.jp>
Date: Wed, 24 Sep 2003 22:59:23 +0900
To: "Addison Phillips [wM]" <aphillips@webmethods.com>
Cc: "WWW-Tag" <www-tag@w3.org>, ietf-xml-mime@imc.org
Message-Id: <20030924223233.091E.MURATA@hokkaido.email.ne.jp>


Thanks for your contribution to this discussion.

> The problem with changing rfc3023 is that there are a number of
> implementations out there that adhere to the exact letter of the involved
> RFCs (3023/2045/2046/etc.). I seem to recall that there are implementations
> that require the charset parameter or which forcibly filter the data to
> ASCII (converting all 8th-bit bytes to the '?' character) and thus there are
> many implementations that, to get the right results with these, forcibly
> emit charset parameters.
> Therefore, unless absolutely forbidden, implementations would still have to
> support the use of charset with both media types. And I don't see how we can
> forbid the use of the charset parameter given the need for
> interoperability with extant sensitive systems.

I think that this is a good reason to keep the charset parameter.

> <snip>
>        Conformant with [RFC2046], if a text/xml entity is received with
>        the charset parameter omitted, MIME processors and XML processors
>        MUST use the default charset value of "us-ascii"[ASCII].  In cases
>        where the XML MIME entity is transmitted via HTTP, the default
>        charset value is still "us-ascii".
> </snip>

This issue was discussed when RFC 2376 was developed.  I recall
that my then-co-author (E. Whitehead) proposed exactly the same thing, but
that proposal was not agreed upon.

As I understand it, the MIME people in the IETF would like to keep the charset
parameter of text/* authoritative, since a number of mail programs rely only on
the charset parameter for text/*.

However, as people have correctly pointed out, omission of the charset
parameter of text/xml is typically caused by the fact that authors
cannot change the configuration of WWW servers.  For this reason, the W3C
recommendations for CSS and HTML say something similar to your suggestion,
but the IETF RFC for CSS does not say anything about the default.  The
IETF RFC 2854 for HTML says the default is US-ASCII (MIME) or
ISO-8859-1 (HTTP), but also says that "the actual default is either a
corporate character encoding or character encodings widely deployed in a
certain national or regional community".
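
To make the rules discussed above concrete, here is a minimal sketch of how
a receiver might pick the charset for a text/xml or text/html entity: an
explicit charset parameter is treated as authoritative, and otherwise the
default named in this thread is used (us-ascii for text/xml even over HTTP,
and US-ASCII or ISO-8859-1 for text/html depending on the transport).  The
names DEFAULT_CHARSETS and effective_charset are hypothetical, chosen only
for this illustration; they come from no RFC or library.

```python
from email.message import Message

# Defaults as stated in this thread, keyed by (media type, received via HTTP?).
# The table and its name are assumptions made for this sketch.
DEFAULT_CHARSETS = {
    ("text/xml", False): "us-ascii",
    ("text/xml", True): "us-ascii",     # the quoted RFC 3023 text: still us-ascii over HTTP
    ("text/html", False): "us-ascii",   # RFC 2854, MIME case
    ("text/html", True): "iso-8859-1",  # RFC 2854, HTTP case
}


def effective_charset(content_type_header, via_http=False):
    """Return the charset a strict receiver would assume, or None if unknown."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # An explicit charset parameter is authoritative for text/*.
    declared = msg.get_content_charset()  # lowercased value, or None if absent
    if declared is not None:
        return declared
    # Parameter omitted: fall back to the per-type default, if we know one.
    return DEFAULT_CHARSETS.get((msg.get_content_type(), via_http))


if __name__ == "__main__":
    print(effective_charset("text/xml"))
    print(effective_charset('text/xml; charset="UTF-8"', via_http=True))
    print(effective_charset("text/html", via_http=True))
```

Note that this models only the letter of the RFCs; it does not model the
"actual default" behaviour that RFC 2854 acknowledges in practice.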


MURATA Makoto <murata@hokkaido.email.ne.jp>
Received on Wednesday, 24 September 2003 10:03:09 UTC