- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Sat, 15 Mar 2008 09:27:18 +0900
- To: Julian Reschke <julian.reschke@gmx.de>, Brian Smith <brian@briansmith.org>
- Cc: "'HTTP Working Group'" <ietf-http-wg@w3.org>
(a side issue only)

At 21:57 08/03/14, Julian Reschke wrote:
>It seems the only way to improve RFC-2047 would be by introducing a new encoding that is sane. Such as:
>
>"Any octet sequence starting with EF BB BF (the UTF-8 BOM) is to be interpreted as Unicode, encoded in UTF-8."

If we are speaking about RFC 2047 itself, then indeed no special sentinel (such as a UTF-8 BOM) would be needed. Any byte with the most significant bit set would be enough as a marker.

Also, even for HTTP, mixing iso-8859-1 and UTF-8 might be fine in practice, because it's very easy to distinguish them.

But all this would only make the mess bigger. It's much better to sort this out on a per-header basis (unless we can confirm that the current cruft isn't used at all in practice, which would then allow us to go for UTF-8 in all cases where we need something more than US-ASCII).

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
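
To illustrate the remark above that iso-8859-1 and UTF-8 are easy to tell apart, here is a minimal sketch in Python. The helper name `decode_header_octets` is hypothetical and not taken from any specification or library. It assumes the simple heuristic "try strict UTF-8 first, fall back to iso-8859-1", which works because most iso-8859-1 text containing non-ASCII bytes is not valid UTF-8. It is only a heuristic: a few iso-8859-1 byte sequences (for example C3 BC) are also valid UTF-8 and would be read as UTF-8.

```python
def decode_header_octets(raw):
    """Hypothetical helper: guess the charset of a header value.

    Returns a (text, charset) pair. This is a heuristic sketch,
    not behavior defined by RFC 2047 or by HTTP.
    """
    # No byte has its most significant bit set: plain US-ASCII.
    if raw.isascii():
        return raw.decode("ascii"), "us-ascii"
    try:
        # Strict UTF-8 decoding rejects most iso-8859-1 byte sequences.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte sequence is decodable as iso-8859-1, so this cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    for sample in (b"plain", "D\u00fcrst".encode("utf-8"), b"D\xfcrst"):
        print(sample, decode_header_octets(sample))
```

The fallback order matters: iso-8859-1 accepts any octet sequence, so it has to be tried last.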
Received on Monday, 17 March 2008 03:57:04 UTC