- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Thu, 27 Mar 2008 12:06:43 +0900
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>, ietf-http-wg@w3.org
At 02:04 08/03/27, Frank Ellermann wrote:
>
>Martin Dürst wrote:

>>| <META HTTP-EQUIV="Content-Type"
>>|       CONTENT="text/html; charset=ISO-2022-JP">
>>|
>>| This is not foolproof, but will work if the encoding
>>| scheme is such that ASCII-valued octets stand for
>>| ASCII characters only at least until the META element
>>| is parsed.
>
>> [This is very, very widely used. As far as it's HTML,
>> it's nothing HTTP should be concerned, but it is highly
>> relevant for HTTP because it is dead straight against
>> any default on the charset parameter in HTTP.]
>
>Wait a moment, it is dead straight against any default that
>is *NOT* ASCII, or rather against a default not containing
>ASCII as proper subset.

I think we have to be careful about what an HTTP default means.
What a US-ASCII default on the HTTP level means is essentially
that whenever I get something like:

Content-Type: text/foo

I should treat this exactly as if I got:

Content-Type: text/foo; charset=US-ASCII

Now the latter means: This is US-ASCII, nothing less and nothing
more. Given that, the browser won't look inside the document
anymore for any additional information. But this is not what
happens in practice, and not what we want.

It could be that by default above, you meant: something to
go back to if *all* else fails. That would mean that the
"default" is only applied if there is no other information
available anywhere about the encoding. If that's what we want
to say, simply saying "default" is definitely not good enough.
And I doubt it is actually what happens in practice, because
if no information is found, it's usually the browser menu
setting (based on the browser's user interface language or
the user's choice) that kicks in, before a final default has
any chance to be used.

>Arguably it also tells us that the "default" does not mean
>much for HTTP. It is interesting for HTTP header fields.

If "default" doesn't mean much, then we shouldn't call it default.
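
To make the two readings of "default" concrete, here is a rough
sketch in Python. It is an illustration only, not a description of
any actual browser: the function names, the 1024-octet prescan
limit, the regular expression, and the exact lookup order are all
assumptions made for this example.

```python
import re
from typing import Optional

# Hypothetical prescan pattern: finds charset=... inside a META tag.
# Only works while ASCII-valued octets stand for ASCII characters.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"')
    return None


def meta_charset(body: bytes, limit: int = 1024) -> Optional[str]:
    """Look for a META charset in the first `limit` octets (assumed limit)."""
    match = META_CHARSET.search(body[:limit])
    return match.group(1).decode("ascii") if match else None


def resolve_strict_default(content_type: str,
                           default: str = "US-ASCII") -> str:
    """Reading 1: a missing charset is *exactly* charset=<default>.
    The document body is never consulted."""
    return header_charset(content_type) or default


def resolve_last_resort(content_type: str, body: bytes,
                        user_setting: Optional[str] = None,
                        last_resort: str = "US-ASCII") -> str:
    """Reading 2: the default applies only if all else fails.
    Assumed order: HTTP header, META, user/browser setting, last resort."""
    return (header_charset(content_type)
            or meta_charset(body)
            or user_setting
            or last_resort)


if __name__ == "__main__":
    page = (b'<META HTTP-EQUIV="Content-Type" '
            b'CONTENT="text/html; charset=ISO-2022-JP">')
    print(resolve_strict_default("text/html"))
    print(resolve_last_resort("text/html", page))
```

Under the first reading the META declaration is ignored; under the
second, the final default is reached only when neither the header,
the document, nor the user's setting says anything.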

>For the text/* [i20] issue we might be free to pick ASCII
>instead of Latin-1 if that's better for MIME compatibility,
>especially for text/plain, naturally for text/xml, and no
>problem for text/html.

Many (including me) are advising against text/xml, because
at least according to the books, the US-ASCII default for
text/xml is supposed to be a real default, i.e. there is no
chance for internal encoding information to take effect for
text/xml.

>The main problem I have with the "Latin-1 default" is that
>it blocks a future "UTF-8 default" (talking about HTTP/1.1)

There is no need for such a default. Practice already works
without a default.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp

Received on Thursday, 27 March 2008 03:07:55 UTC