- From: <ned.freed@mrochek.com>
- Date: Fri, 22 Nov 2002 10:52:04 -0800 (PST)
- To: Marcin Hanclik <mhanclik@poczta.onet.pl>
- Cc: ietf-charsets@iana.org
> Dear Sirs,
> I am writing to you as to the experts in internationalization and ISO-10646
> issues.
> I would be very grateful if you could help me with the following issue
> described below.
> Generally the question refers to MIME encoding of text part.
> Particularly to the following case:
> Content-Type: text/plain; charset="iso-10646-ucs-2"
> Content-Transfer-Encoding: ...

This, I'm afraid, is an illegal combination of elements. Specifically, any
material with a top-level media type of "text" has to represent a line break
(carriage return/line feed) as the literal octet sequence 0x0D 0x0A.
iso-10646-ucs-2 clearly does not do this, and as such is a charset that's not
suited for use with MIME text. This requirement is spelled out in RFC 2046
section 4.1.1.

> Data
> Data after decoding: 0xFF 0xFE 0x66 0x00 0x65 0x00
> Outlook Express decodes it to "fe" string. But there are people, who say
> that this is robustness of Outlook Express and that the string is not
> properly encoded, because in the time when <charset="iso-10646-ucs-2"> was
> specified/assigned with IANA the byte order mark (BOM) did not exist.

I don't know whether there are specific rules for handling revisions to
iso-10646-ucs-2. I suspect not. However, the general rule is that additions
to a charset repertoire are expected and allowed. See RFC 2279 section 3.

The BOM is something of a special case. But given the far more egregious
violation going on here, I really don't think this is particularly important
in the overall scheme of things.

Ned
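
To make the line-break point concrete, here is a minimal Python sketch. It is an illustration only, and it assumes Python's "utf-16-le" and "utf-16" codecs as stand-ins for iso-10646-ucs-2 (they agree for characters in the Basic Multilingual Plane). The variable names are made up for the example. It checks whether the octet pair 0x0D 0x0A appears in the encoded text, then decodes the six octets quoted in the question.

```python
# Illustrative sketch: "utf-16-le" / "utf-16" are assumed stand-ins for
# iso-10646-ucs-2; all names below are hypothetical.

CRLF = b"\r\n"  # the literal octets 0x0D 0x0A

# The same text, one line ending in CR LF, in two encodings.
ascii_bytes = "fe\r\n".encode("us-ascii")
ucs2_bytes = "fe\r\n".encode("utf-16-le")

# In US-ASCII the line break is the adjacent octet pair 0x0D 0x0A.
print(CRLF in ascii_bytes)

# With two octets per character the pair is split up (0x0D 0x00 0x0A 0x00),
# so the adjacent sequence 0x0D 0x0A never occurs.
print(CRLF in ucs2_bytes)

# The octets from the question: a little-endian BOM (0xFF 0xFE) followed by
# two 16-bit code units. The "utf-16" codec consumes the BOM to pick the
# byte order and then decodes the rest.
decoded = bytes([0xFF, 0xFE, 0x66, 0x00, 0x65, 0x00]).decode("utf-16")
print(decoded)
```

The first check succeeds and the second fails, and the quoted octets decode to the two characters "fe" once the BOM is used to select the byte order.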
Received on Friday, 22 November 2002 14:01:41 UTC