- From: Glenn Adams <glenn@stonehand.com>
- Date: Wed, 3 May 95 22:28:19 -0400
- To: erik@netscape.com
- Cc: Multiple recipients of list <html-wg@oclc.org>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
From: erik@netscape.com (Erik van der Poel)
Date: Wed, 03 May 95 18:25:40 -0700

>We could say that the default encoding scheme is base line "ISO-2022".
>...
>How does this compromise sound? Brain dead or what?

ISO 2022 actually does allow you to include info about the charsets used in a document, but then we would be tied to 2022, and might end up excluding other charsets (Big5? KOI8?). Which might be a good thing? It might be better to stick to the charset parameter, as defined by MIME.

I'm not arguing for 2022. If it were up to me, I'd specify UTF-8 as the default. The point is:

1. We need language in the RFC that specifies what to do in the default case, where the CHARSET parameter is not present in the Content-Type response.
2. If we specify 8859-1, then that may make ISO-2022-JP people unhappy.
3. If we specify ISO-2022-JP, it will make the majority of users unhappy.
4. If we specify ISO-2022-EU (to give a name to the default state of ISO 2022 I previously described -- here E means Europe and U means US), then we essentially achieve a superset of simple 8859-1. That is, we specify an 8-bit code environment which starts out with 8859-1 being designated and invoked into GL and GR, and which, at the same time, allows for code switching to any other code set (including BIG5 and KOI8) via the standard 2022 mechanisms.

It's "How do we get from the current situation to one where the charsets are labelled?" This is the pressing issue that I think Amanda is also concerned about.

You bang on server providers and client providers to support interpretation of the CHARSET parameter. That *is* the prescribed mechanism.

Alternatively, give the user the option of choosing between a number of default encodings.

All I'm saying is, let's exercise some caution before blindly getting our servers to append the charset parameter to the content-type line. If we decide that we're willing to accept any pain and suffering caused by introducing the charset parameter blindly, then that's OK too. As long as we consciously decide to do so.

I recall quite clearly when the ARPANET converted from NCP to TCP/IP. I was the host manager at a site which had 4 hosts connected to the ARPANET (that was a large number of connected hosts in those days). The way it worked was that a date was chosen as the cutoff for the switchover; if you didn't have TCP/IP up by then, you were just out of luck. We should do this for the CHARSET parameter.

Do you have any other suggestions? A phased-in plan perhaps.

Let's just agree to do it and do it.

Regards,
Glenn Adams
Received on Wednesday, 3 May 1995 19:29:25 UTC