RE: ban the use and implementation of UTF-7

From: Paul Cotton <Paul.Cotton@microsoft.com>
Date: Fri, 22 Dec 2006 10:21:13 -0800
To: Martin Duerst <duerst@it.aoyama.ac.jp>, "Roy T. Fielding" <fielding@gbiv.com>, W3C TAG <www-tag@w3.org>
CC: Mark Davis <mark.davis@icu-project.org>, Deborah Goldsmith <goldsmit@apple.com>, "chris.newman@innosoft.com" <chris.newman@innosoft.com>, "mrc@washington.edu" <mrc@washington.edu>, "www-international@w3.org" <www-international@w3.org>, "ietf-charsets@iana.org" <ietf-charsets@iana.org>, Misha Wolf <Misha.Wolf@reuters.com>
Message-ID: <4D66CCFC0B64BA4BBD79D55F6EBC22571FD4B78849@NA-EXMSG-C103.redmond.corp.microsoft.com>

> But as far as the browsers are concerned, if the TAG can come
> up with a finding that e.g. also gives some more details and
> examples about the security issues you mention, then we might
> also be able to point to this document from anything on the
> IETF or IANA side.

Here is a publicly available description of this problem:
http://archives.neohapsis.com/archives/fulldisclosure/2006-10/0296.html

/paulc

Paul Cotton, Microsoft Canada
17 Eleanor Drive, Ottawa, Ontario K2E 6A3
Tel: (613) 225-5445 Fax: (425) 936-7329
mailto:Paul.Cotton@microsoft.com





> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of
> Martin Duerst
> Sent: December 15, 2006 2:26 AM
> To: Roy T. Fielding; W3C TAG
> Cc: Mark Davis; Deborah Goldsmith; chris.newman@innosoft.com;
> mrc@washington.edu; www-international@w3.org; ietf-charsets@iana.org;
> Misha Wolf
> Subject: Re: ban the use and implementation of UTF-7
>
>
> Hello Roy,
>
> As you can see at
> http://lists.w3.org/Archives/Public/www-international/2006OctDec/0144,
> Mark Davis, one of the authors, essentially agrees with you.
> In a followup on the ietf-charsets mailing list, Deborah Goldsmith,
> the other author of the UTF-7 spec, also agrees.
>
> The only place I'm aware of where (a variant of!) UTF-7 is used is
> for IMAP folder name internationalization. See e.g.
> http://www.ietf.org/rfc/rfc2192.txt for details.
> In hindsight, using a UTF-7 variant in the protocol seems
> unnecessary, but the original idea (mostly by Mark Crispin,
> as far as I understand it) was that it could be used as is
> on the server side, even on totally un-internationalized
> operating systems.
>
> As for the browsers, I think they just added UTF-7 at one time
> because the name looked similar to UTF-8 and UTF-16, and it was
> difficult to predict exactly how these encodings would deploy.
> And as in any software, it's difficult to get rid of something,
> but security reasons are about the best you can come up with
> for cleaning up.
>
> As for the IANA charset registry
> (http://www.iana.org/assignments/character-sets), Ned and
> I (who are currently the expert reviewers) as well as the
> other list participants have been talking about cleaning it
> up. We don't yet have an exact idea of what needs
> to be done, but being able to attach security warnings or
> similar comments to an entry might be one possible way to
> proceed. The problem might be that RFC 2152
> (http://www.ietf.org/rfc/rfc2152.txt) might have to be updated.
>
> But as far as the browsers are concerned, if the TAG can come
> up with a finding that e.g. also gives some more details and
> examples about the security issues you mention, then we might
> also be able to point to this document from anything on the
> IETF or IANA side.
>
> Regards,     Martin.
>
> At 07:13 06/12/15, Roy T. Fielding wrote:
> >
> >Over the years I have seen a number of security exploits that make
> >use of broken browsers that sniff character encodings in combination
> >with UTF-7 encoded tags or javascript commands.  I have never actually
> >seen anyone use UTF-7 for anything legitimate (other than testing).
> >
> >Is there some reason why WWW clients need to support UTF-7?
> >
> >It seems completely unnecessary given the now ubiquitous use of 8-bit
> >clean transports and the presence of UTF-8, which IIRC was defined
> >long after UTF-7.  However, the wider community may be aware of
> >some reason why browsers should support it, so I'd like to hear
> >your comments.
> >
> >If there is no need for UTF-7, I'd like the TAG to consider it an
> >issue for the sake of asking browsers to remove its implementation
> >and banning its use by servers.
> >
> >I know this won't solve any problems for deployed clients, and
> >wouldn't be an issue at all if servers used the same algorithm for
> >escaping characters that clients used to interpret them, but in the
> >long term it will simplify some checks for XSS attacks and I don't
> >think it will harm the Web.  That is, unless there is some significant
> >body of content out there that is encoded as UTF-7.
> >
> >Cheers,
> >
> >Roy T. Fielding                            <http://roy.gbiv.com/>
> >Chief Scientist, Day Software              <http://www.day.com/>
> >
> >
>
>
> #-#-#  Martin J. Dürst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
>
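
The class of problem Roy describes above (markup that survives server-side escaping because it is written in UTF-7, then comes alive in a client that decodes the page as UTF-7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not taken from the linked advisory: the payload string is a hypothetical example, html.escape stands in for whatever escaping a server applies, and Python's built-in "utf-7" codec stands in for a browser that has sniffed or been told the wrong encoding.

```python
import html

# Hypothetical attacker-supplied input: "<script>alert(1)</script>" written
# in UTF-7, where '<' is "+ADw-" and '>' is "+AD4-". It is pure ASCII and
# contains none of the characters an HTML escaper looks for.
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Server side (assumption): escape the input for an ASCII/UTF-8 page.
# Nothing is changed, because there is no literal '<', '>', '&' or quote.
escaped = html.escape(payload)

# Client side (assumption): the same bytes are interpreted as UTF-7,
# standing in for a browser that sniffs the encoding.
decoded = escaped.encode("ascii").decode("utf-7")

print(escaped == payload)  # the escaper saw nothing to escape
print(decoded)             # the markup the client would actually parse
```

The mismatch is the one Roy points out: the server escapes according to one interpretation of the bytes and the client parses according to another, so a check that looks only for literal angle brackets misses the payload.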
Received on Friday, 22 December 2006 18:21:56 GMT
