
Re: UTF-16 and MIME text/*

From: John Cowan <cowan@mercury.ccil.org>
Date: Sat, 11 Aug 2001 20:37:02 -0400 (EDT)
To: Bjoern Hoehrmann <derhoermi@gmx.net>
CC: John Cowan <cowan@mercury.ccil.org>, www-international@w3.org, phoffman@imc.org
Message-Id: <E15VjFW-0004xB-00@mercury.ccil.org>

Bjoern Hoehrmann scripsit:
> * John Cowan wrote:
> >Bjoern Hoehrmann scripsit:
> >
> >>    RFC 2781 registers all UTF-16 charsets (UTF-16BE, UTF-16LE and
> >> UTF-16) as not suitable for use in MIME content types under the
> >> "text" top-level type. Why?
> >
> >Because a MIME processor, when encountering something of type text/*,
> >is allowed to assume that any 0x0A byte means "LF" and any 0x0D byte means "CR",
> >and to transmute them to some other kind of line ending.  UTF-16
> >of whatever flavor violates this rule.
> 
> Could you please give me some reference where MIME allows applications
> to _transmute_ them? Someone pointed out to me that RFC 2781 is in
> error here, and I tried hard to find something in MIME that clearly
> states that RFC 2781 is correct in this regard.

The word "transmute" was ill-chosen.  RFC 2046, section 4.1.1, is quite
clear:

# The canonical form of any MIME "text" subtype MUST always represent a
# line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
# MIME "text" MUST represent a line break.  Use of CR and LF outside of
# line break sequences is also forbidden.
#
# This rule applies regardless of format or character set or sets
# involved.

CR and LF here refer to the *octets* 0x0D and 0x0A respectively, as
explained in section 4.1.2, not to the characters.
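
To make the octet-level point concrete, here is a minimal Python sketch. The
helper name stray_cr_lf_offsets and the sample string are assumptions made up
for illustration; they come from no RFC or library. The helper scans a byte
string for 0x0D or 0x0A octets that are not part of a 0x0D 0x0A pair, which is
what the rule quoted above forbids in MIME "text".

```python
CR, LF = 0x0D, 0x0A

def stray_cr_lf_offsets(data: bytes) -> list:
    """Return offsets of CR or LF octets that are not part of a CRLF pair.

    Hypothetical helper for illustration only.
    """
    offsets = []
    i = 0
    while i < len(data):
        if data[i] == CR and i + 1 < len(data) and data[i + 1] == LF:
            i += 2  # a well-formed CRLF line break; skip both octets
            continue
        if data[i] in (CR, LF):
            offsets.append(i)  # a bare CR or LF octet
        i += 1
    return offsets

# 'a', a CRLF line break, then U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE)
text = "a\r\n\u010a"

utf8 = text.encode("utf-8")       # 61 0D 0A C4 8A
utf16 = text.encode("utf-16-be")  # 00 61 00 0D 00 0A 01 0A

print(stray_cr_lf_offsets(utf8))
print(stray_cr_lf_offsets(utf16))
```

In the UTF-8 form the only CR and LF octets form the CRLF pair. In the
UTF-16BE form the line break itself becomes 00 0D 00 0A, so the CR and LF
octets are no longer adjacent, and U+010A contributes a further 0x0A octet
that has nothing to do with a line break.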

-- 
John Cowan                                   cowan@ccil.org
One art/there is/no less/no more/All things/to do/with sparks/galore
	--Douglas Hofstadter
Received on Saturday, 11 August 2001 20:36:54 GMT
