- From: Erik van der Poel <erik@netscape.com>
- Date: Fri, 15 May 1998 10:16:26 -0700
- To: Chris Newman <Chris.Newman@INNOSOFT.COM>
- Cc: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>, Larry Masinter <masinter@parc.xerox.com>, ietf-charsets@ISI.EDU, murata@fxis.fujixerox.co.jp, Tatsuo_Kobayashi@justsystem.co.jp
Chris,

Thanks for engaging in this discussion.

Chris Newman wrote:
> On Fri, 15 May 1998, MURATA Makoto wrote:
> > Makoto:
> > This character set is not permitted for use with MIME text/* media
> > types. However, the MIME-like mechanism of HTTP may use this
> > character set for text/*.
>
> I prefer this one or anything along these general lines.
>
> > Erik:
> > This charset is not suitable for use with text/* media types in
> > protocols that are sensitive to the line break issues described in
> > section 4.1.1 of RFC 2046 (MIME). However, this charset is suitable for
> > use with text/* media types in other protocols. See also section 19.4.1
> > of RFC 2068 (HTTP).
>
> HTTP is the only protocol which uses MIME but doesn't follow the text
> rules. I hope there will never be another. The rules for text media
> types are not email centric. They were put in so that:
> (1) Any text media type could be displayed directly to the user without
> interpretation (treated as text/plain).

Sure, let's just present the binary ones and zeroes directly to the
users. They can understand it! ;-)

Somewhat more seriously, what you wrote is quite ASCII-centric and/or
Latin-centric. Languages like Japanese do not use the ASCII character
codes.

Nevertheless, you do have a point. If we sent raw UTF-16, and the stupid
UA just blurted that out onto the screen, even American users would be
baffled. Now we wouldn't want that, would we? I mean at least the
American users should be spared, since they invented ARPANet and all.

Oops, perhaps that went too far. I'm just kidding here. I assume you
realize that. ObSmiley :-)

> (2) Text has a canonical form so that signatures can be more easily
> verified.

Hmmm... Perhaps UTF-16 could also have some sort of canonical form
defined for it, so that signatures would be possible.

> (3) Even if the charset is unknown, dumping the message to the screen
> might be useful.

Again, ASCII-centric and Latin-centric thinking.

> (4) It can normally be sent unencoded through line-oriented protocols
> with line length limits.

Yup, there are such protocols. E.g. SMTP. Would it be a good idea to
list such protocols in some document?

> I wouldn't be surprised if there are other good reasons for the text rules
> that I'm not familiar with.

Me too. I haven't given the MIME specs a good read lately, but they
probably have some more info.

Actually, the 4 points that you raise above should probably be in some
RFC, no? It would be a waste to leave them in this obscure mailing list
archive. Then again, maybe these points are already documented in MIME.

> > Larry:
> > In accordance with the rules on end-of-line convention and 'text/',
> > UTF-16 is inappropriate for use with 'text' media types. Those media
> > types which might be deployed with UTF-16 might consider registering an
> > 'application' type as well.
>
> This omits mention of the HTTP exception, which seems important to some.

Yeah, perhaps it's a waste of my time to fight this anti-HTTP sentiment.
It's a shame, though, that we seem to be leaning towards using UTF-16
with application/* rather than text/*. It's inconsistent. And it's a
sudden change from the text/html that everybody's already used to. Sigh.

Larry's version seems fine to me. Perhaps a reference to the sections of
MIME and HTTP that explain all this could be added to the end of the
document? He says, giving it one more try.

Erik
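
The line break issue that this thread keeps coming back to can be seen in a few lines of Python. This is only an illustrative sketch: the sample string and the variable names are made up for the example, and it assumes big-endian UTF-16 without a byte order mark. It shows that the two-byte CRLF sequence that line-oriented protocols look for does not appear in UTF-16 data, because CR and LF each become a 16-bit code unit.

```python
# Illustrative sketch; the sample text is hypothetical.
text = "line one\r\nline two"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")  # assumption: big-endian, no BOM

CRLF = b"\r\n"

# In UTF-8 (as in ASCII) the line break is the byte pair 0D 0A.
crlf_in_utf8 = CRLF in utf8_bytes

# In UTF-16 the line break is 00 0D 00 0A, so the bytes 0D and 0A are
# never adjacent and a byte-oriented scan for CRLF finds nothing.
crlf_in_utf16 = CRLF in utf16_bytes
utf16_line_break = "\r\n".encode("utf-16-be")

# Every ASCII character also carries a NUL byte, which many
# line-oriented transports do not pass through unencoded.
nul_count = utf16_bytes.count(b"\x00")

print(crlf_in_utf8, crlf_in_utf16, utf16_line_break, nul_count)
```

A transport that splits on CRLF or enforces line length limits would therefore treat such a body as one long line containing NUL bytes, which is the kind of sensitivity the proposed registration wording refers to.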
Received on Friday, 15 May 1998 11:04:33 UTC