Re: UTF-16 and MIME text/*

Bjoern Hoehrmann scripsit:
> * John Cowan wrote:
> >Bjoern Hoehrmann scripsit:
> >
> >>    RFC 2781 registers all UTF-16 charsets (UTF-16BE, UTF-16LE and
> >> UTF-16) as not suitable for use in MIME content types under the
> >> "text" top-level type. Why?
> >
> >Because a MIME processor, when encountering something of type text/*,
> >is allowed to assume that any 0x0A byte means "LF" and any 0x0D byte means "CR",
> >and to transmute them to some other kind of line ending.  UTF-16
> >of whatever flavor violates this rule.
> 
> Could you please give me some reference where MIME allows applications
> to _transmute_ them? Someone pointed out to me that RFC 2781 is in
> error here, and I tried hard to find something in MIME that clearly
> states that RFC 2781 is correct in this regard.

The word "transmute" was ill-chosen.  RFC 2046, section 4.1.1, is quite
clear:

# The canonical form of any MIME "text" subtype MUST always represent a
# line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
# MIME "text" MUST represent a line break.  Use of CR and LF outside of
# line break sequences is also forbidden.
#
# This rule applies regardless of format or character set or sets
# involved.

CR and LF here refer to the *octets* 0xD and 0xA respectively, as
explained in section 4.1.2, not to the characters.
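
To make the octet-level point concrete, here is a minimal Python sketch.
The helper name stray_line_octets and the sample string are my own, purely
for illustration, and the check is only my reading of the rule quoted
above, not anything taken from the RFC text.  It lists the offsets of
0x0D and 0x0A octets that are not part of a CRLF octet pair:

```python
def stray_line_octets(data: bytes) -> list[int]:
    """Return offsets of 0x0D / 0x0A octets that are not part of a CRLF pair."""
    offsets = []
    i = 0
    while i < len(data):
        if data[i] == 0x0D and i + 1 < len(data) and data[i + 1] == 0x0A:
            i += 2  # a CRLF octet pair: skip both octets
            continue
        if data[i] in (0x0D, 0x0A):
            offsets.append(i)  # bare CR or LF octet
        i += 1
    return offsets


# Hypothetical sample: "a", a line break, then the character U+010A.
text = "a\r\n\u010a"

utf8 = text.encode("utf-8")        # 61 0D 0A C4 8A
utf16 = text.encode("utf-16-be")   # 00 61 00 0D 00 0A 01 0A

print(stray_line_octets(utf8))     # no stray octets
print(stray_line_octets(utf16))    # three stray octets

# The reverse problem: the single character U+0D0A serializes in
# UTF-16BE to exactly the octets 0D 0A, which is not a line break.
crlf_lookalike = "\u0d0a".encode("utf-16-be")
```

In the UTF-16BE form, the CR and LF characters are each split from their
partner by a 0x00 octet, and U+010A contributes a 0x0A octet that has
nothing to do with line breaks at all.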

-- 
John Cowan                                   cowan@ccil.org
One art/there is/no less/no more/All things/to do/with sparks/galore
	--Douglas Hofstadter

Received on Saturday, 11 August 2001 20:36:54 UTC