- From: Francois Yergeau <yergeau@alis.ca>
- Date: Sun, 7 Jul 1996 21:32:42 -0500
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> Date: Fri, 05 Jul 1996 17:25:38 -0700 > From: "Roy T. Fielding" <fielding@liege.ICS.UCI.EDU> > > I have already covered these questions ad-nauseum. > > 1) HTTP has *always* used a default charset value of ISO-8859-1. This is wrong. Repeating a falsehood ad nauseam doesn't make it any truer. The default has stopped being ISO 8859-1 ever since the very first non-Latin-1 document was transmitted - unlabelled. Servers have never sent charset, and thus have always "defaulted" to whatever was in the document being sent, Latin-1 or not. Clients have always defaulted to whatever was the (generally) unique encoding they could deal with, Latin-1 or not. Proxies have never paid any attention, and have never defaulted to anything. > All implementations to the contrary had KNOWN failure conditions > and did not work as intended except within locally controlled > environments. All implementations that assume Latin-1 have KNOWN failure conditions and do not work as intended when presented with unlabelled non-Latin-1 documents, that is outside of locally controlled Latin-1 environments. ISO 8859-1 is just a locally useful charset, whose relative importance is shrinking daily. > 2) The HTTP version defines the communication capability of the > immediately adjacent client or server -- it NEVER indicates that > feature capabilities of the user agent. Who said anything else? The version number indicates what protocol features can be used. A server receiving 1.0 knows it can reply with MIME-like headers followed by a blank line and an entity, which it cannot do with with 0.9. My point was that a client sending 1.1 should guarantee that it can deal gracefully with a charset parameter (behave no worse than without charset), something that is not true today with 1.0, despite the language in RFC 1945. > 3) None of the issues you have raised involve a technical problem > with the HTTP/1.1 protocol -- they are POLITICAL problems that > are an artifact of historical reality, a reality which the IETF > is not capable of changing. By attempting to enforce a default charset with no justification in current practice or technical, the IETF would be making a political statement: the western hemisphere wants a free ride, let the rest of the world deal with the charset labelling problem. You are right that Latin-1 as a default is an historical artefact; it *was* the default when it was the only encoding, but has not been ever since. > 4) Labelling the charset with its real value if it is different than > iso-8859-1 *always* works, both in old an new practice,... You're seeing only one side of the coin. The world does not revolve around ISO 8859-1. Today, lots of people *need* to set their browser to a default other than Latin-1, because most of the stuff they read is in languages not representable in Latin-1 and is unlabelled. When they get a document - unlabelled - in Latin-1 (or anything else but their current default), they see garbage. No, it doesn't always work. > 5) Whether or not a client is capable of understanding the charset > parameter is NOT a function of the protocol version -- ALL HTTP/1.0 > clients MUST understand charset, even if HTTP/1.0 is not a "standard", > because that is part of the HTTP/1.0 definition (see RFC 1945). See above. True in theory, but not (yet) in practice. Hence a server cannot count on the 1.0 label to mean that the client will grok charset; I know, I've tried, got burned and had to hack something based on User-Agent:. > I see no point in continuing this discussion unless you can demonstrate > a real problem that needs to be solved and can be solved within the > constraints of HTTP/1.1. I have described the problem above. It lies outside of a Latin-1-centric world, so perhaps you have not encountered it before, but it nevertheless exists. It really makes interoperability a pipedream in places where Latin-1 is irrelevant, and even more where multiple encodings are used for one language. As for the constraints of HTTP/1.1, there has been a lot of loose talk about backward compatibility, but no definite problem case cited, except for a potential one with proxies. The latter was, I think, fixed if only origin servers are required to send charset. If that is not the case, please say so. I'd rather hear this than "Let's not discuss it". Finally, why is HTTP/1.1 constrained to mandate English and ISO 8859-1 as defaults in the new Warning: header? I have not seen any justification, so let's please fix that. Has anyone noticed that mandating a default language amounts to telling those who do not speak that language that they cannot write complete 1.1 servers, lest they hire a translator? Regards, -- Francois Yergeau <yergeau@alis.com> Alis Technologies Inc., Montreal Tel : +1 (514) 747-2547 Fax : +1 (514) 747-2561
Received on Sunday, 7 July 1996 18:41:28 UTC