- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Wed, 11 Dec 1996 17:40:32 +0100 (MET)
- To: Drazen Kacar <Drazen.Kacar@public.srce.hr>
- cc: Larry Masinter <masinter@parc.xerox.com>, Chris.Lilley@sophia.inria.fr, www-international@w3.org, Alan_Barrett/DUB/Lotus.LOTUSINT@crd.lotus.com, bobj@netscape.com, wjs@netscape.com, erik@netscape.com, Ed_Batutis/CAM/Lotus@crd.lotus.com
On Fri, 6 Dec 1996, Drazen Kacar wrote:

> Larry Masinter wrote:
>
> > # That implies that sending
> > #    Accept-Charset: utf-8
> > # should generate a 406 response if the document is only available
> > # in, say, Latin-1 and the server cannot convert that to UTF-8.
> >
> > I think Latin-1 is a special case. From
> > draft-ietf-http-v11-spec-07.txt:
> >
> > # The ISO-8859-1 character set can be assumed to be acceptable to
> > # all user agents.
>
> Come on, that was a political compromise. An ISO 8859-5 terminal
> can't represent ISO-8859-1 with q=1.0. The user agent can do the
> necessary translations, but what actually gets displayed is not the
> same as on an ISO 8859-1 terminal.

That wasn't a political compromise; it was a historical coincidence. At
some point, if you wanted to join the Internet, your computer had to
understand ASCII in some way or another. The Web, because it started at
CERN in Geneva, was ISO-8859-1 from the beginning, and so to join the Web,
you had to understand ISO-8859-1. The Web wasn't really designed for dumb
terminals anyway.

This historical coincidence is something I can accept. What is impossible
for me to accept are such <FLAME>brainless stupidities</FLAME> as
specifying that ISO-8859-1 can be used in HTTP 1.1 warnings in raw form,
while anything else has to be encoded according to RFC 1522 (an
illustration follows at the end of this message).

RFC 1522 is designed for 7-bit channels; if you have an 8-bit channel,
there is no reason to use it. If you are using RFC 1522 anyway, there is
no reason to give special preference to ISO-8859-1. And if you have 8 bits
available, there is no reason to use them all up for ISO-8859-1, removing
any extensibility. To everybody in the i18n business it is clear that if
you are going to use 8 bits, you had better use them for UTF-8.

I see four ways (in rough order of preference) to get out of this problem:

(1) Specify UTF-8 as the only thing to be used.
(2) Specify RFC 1522 for everything outside UTF-8 (which includes ASCII).
(3) Specify RFC 1522 for everything outside ASCII.
(4) Specify RFC 1522 for everything outside Latin-1 and UTF-8.

Comment on (4): strictly speaking, Latin-1 and UTF-8 are not compatible;
every byte sequence is legal Latin-1, so the two can never be told apart
with absolute certainty. In practice, however, and for string lengths that
will typically appear in warnings, they can be distinguished easily (a
sketch of such a check follows at the end of this message).

The above problem is a very clear example of bad design. I hope the HTTP
1.1 draft can still be changed. If not, that would be a very clear reason
for raising objections directly with the IETF or whoever is responsible.

Regards,    Martin.
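For concreteness, the negotiation failure described in the passage quoted
at the top of this message would look roughly like this on the wire (the
host name and path are made up for illustration):

    GET /report.html HTTP/1.1
    Host: www.example.org
    Accept-Charset: utf-8

    HTTP/1.1 406 Not Acceptable

The server holds the document only in Latin-1 and cannot convert it, and
the client's Accept-Charset does not list ISO-8859-1, so under the quoted
reading of the draft the request fails outright.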
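To make the raw-versus-encoded asymmetry concrete, the two hypothetical
Warning headers below carry the same French warn-text: first as raw
ISO-8859-1 bytes (permitted as-is under the draft's rule), then as the
RFC 1522 encoded-word that the same text in any other charset (UTF-8 is
used here as the example) would be forced into. The warn-code and host
name are made up for illustration:

    Warning: 99 proxy.example.org "Réponse différée"
    Warning: 99 proxy.example.org "=?UTF-8?Q?R=C3=A9ponse_diff=C3=A9r=C3=A9e?="

The second form is both longer and unreadable without decoding, even
though the channel it travels over is perfectly capable of carrying the
bytes directly.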
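On the comment to (4): here is a minimal sketch of the distinguishing
check, written in present-day Python purely for illustration (the function
name is made up). The idea is to validate the bytes strictly as UTF-8 and
fall back to Latin-1 when validation fails:

    def decode_warn_text(raw: bytes) -> str:
        # Treat the bytes as UTF-8 if they validate strictly; otherwise
        # fall back to ISO-8859-1, under which every byte value is defined.
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return raw.decode("iso-8859-1")

    # Latin-1 "résumé": the byte 0xE9 ('é') is followed by 's', which is
    # not a UTF-8 continuation byte, so strict UTF-8 decoding fails and
    # the Latin-1 branch is taken.
    assert decode_warn_text(b"r\xe9sum\xe9") == "résumé"

    # The same word encoded as UTF-8 validates and decodes as UTF-8.
    assert decode_warn_text("résumé".encode("utf-8")) == "résumé"

A wrong guess requires Latin-1 text whose accented characters happen to
line up into well-formed multi-byte UTF-8 sequences (for instance 'Ã',
byte 0xC3, immediately followed by '©', byte 0xA9), which is vanishingly
unlikely at warn-text lengths; that is exactly why option (4) is workable
in practice.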
Received on Wednesday, 11 December 1996 11:41:17 UTC