- From: Yung-Fong Tang <ftang@netscape.com>
- Date: Tue, 14 Aug 2001 12:28:58 -0700
- To: John Cowan <cowan@mercury.ccil.org>
- CC: Bjoern Hoehrmann <derhoermi@gmx.net>, www-international@w3.org, phoffman@imc.org
John Cowan wrote:

> Bjoern Hoehrmann scripsit:
>
> > If you consider 0x00 0x0d 0x00 0x0a or 0x0d 0x00 0x0a 0x00 in the UTF-16
> > data, then this paragraph applies, since it refers to the _decoded_ form
> > of the data; RFC 2046 doesn't make restrictions on the encoded
> > form of the data. What do I miss?
>
> No, it's the encoded form that is being restricted. The whole point of
> this is so that naive processors that understand only ASCII and text/plain
> can at least figure out where the line breaks are, since for local presentation
> purposes (not for retransmission) it may be necessary to convert the
> standard line break (0xD 0xA) into something else. Charsets such as
> EBCDIC and UTF-16 (in all their flavors) break this rule and can't be used
> in MIME text/* emails.

If it is the "encoded form" that is being restricted, then you will still
not see 0x00 0x0d 0x00 0x0a in the "encoded form" if you use the following,
right?

Content-Type: text/plain; charset=UTF-16
Content-transfer-encoding: base64

> > >CR and LF here refer to the *octets* 0xD and 0xA respectively, as
> > >explained in section 4.1.2, not to the characters.
> >
> > This sections deals with the Charset Parameter and deals with US-ASCII
> > but I can't read such a statement and I'm not sure if it would apply if
> > there were.
>
> See RFC 822 for formal definitions of CR, LF, and CRLF, where it is
> made clear that they are octet based.
>
> --
> John Cowan cowan@ccil.org
> One art/there is/no less/no more/All things/to do/with sparks/galore
> --Douglas Hofstadter
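
For concreteness, here is a minimal Python sketch of the octet sequences under discussion. The sample text and the variable names are illustrative assumptions, not anything taken from the thread; it only shows which octets appear before and after base64, and does not settle what RFC 2046 permits.

```python
import base64

# A line break (CR LF) encoded as UTF-16, big-endian and little-endian.
crlf_be = "\r\n".encode("utf-16-be")   # octets 00 0d 00 0a
crlf_le = "\r\n".encode("utf-16-le")   # octets 0d 00 0a 00

# Hypothetical two-line body, encoded as UTF-16 (big-endian, no BOM).
body = "line one\r\nline two\r\n".encode("utf-16-be")

# In the raw UTF-16 octets, 0x0D and 0x0A are never adjacent for this text,
# so an ASCII-only processor scanning for 0x0D 0x0A finds no line breaks.
has_ascii_crlf = b"\r\n" in body

# After base64, the transfer-encoded form is pure ASCII: the 0x00 octets
# and the 00 0d 00 0a sequence are gone from what goes on the wire.
encoded = base64.encodebytes(body)

print(crlf_be.hex(" "), "|", crlf_le.hex(" "))
print("raw body contains 0x0D 0x0A:", has_ascii_crlf)
print("base64 form contains 0x00:", b"\x00" in encoded)
```

Decoding the base64 form with `base64.decodebytes(encoded)` gives back the original UTF-16 octets, including 00 0d 00 0a for each line break.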
Received on Tuesday, 14 August 2001 15:31:50 UTC