- From: Jamie Lokier <jamie@shareable.org>
- Date: Fri, 28 Mar 2008 14:17:18 +0000
- To: Stefan Eissing <stefan.eissing@greenbytes.de>
- Cc: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Stefan Eissing wrote:
>
> On 28.03.2008, at 10:45, Jamie Lokier wrote:
>
> >Stefan Eissing wrote:
> >>>1) Change the character encoding on the wire to UTF-8
> >>
> >>-1
> >>[...]
> >So, in the case of receiving RFC2047 _or_ binary UTF-8, HTTP
> >implementations using character strings internally will actually pass
> >character sequences which aren't the intended "meaningful" characters,
> >except for those in the US-ASCII subset.
> >
> >In that respect, binary UTF-8 on the wire doesn't change anything from
> >the present situation with RFC2047 :-)
>
> You are correct that the information would still be there. And it is
> tempting to shoot for UTF-8.
...
> And everyone will keep their fingers crossed
> that they do not encounter an intermediary that makes some "security
> filtering" on HTTP headers and screws it up.

I'm thinking the same applies to RFC2047, if that becomes actually
implemented in practice (it currently isn't).

Surely the security issues with RFC2047 decoding among different
implementations are _much_ more likely than those of binary UTF-8?

E.g. =?iso-8859-1?q?=00?= will be a string terminator for some
components of some implementations, rejected by others, and passed
through as ASCII by many. Expect "security filtering" to have opinions
about such sequences for good reasons, as soon as HTTP recipients start
decoding RFC2047 in the headers and reacting badly to such sequences.

UTF-8 has similar issues, but they are relatively well defined. With
RFC2047, it's more open-ended.

-- 
Jamie
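The contrast drawn above can be sketched briefly. Assuming a recipient that decodes encoded words the way Python's standard-library RFC 2047 decoder does (used here purely for illustration, not as anything the thread prescribes), the encoded word =?iso-8859-1?q?=00?= yields a bare NUL, while a malformed UTF-8 sequence such as the overlong encoding of NUL is uniformly rejected:

```python
from email.header import decode_header

# The encoded word discussed above: a NUL smuggled in as ISO-8859-1.
raw = "=?iso-8859-1?q?=00?="

# decode_header() returns a list of (bytes, charset) pairs for encoded words.
parts = decode_header(raw)
decoded = "".join(
    b.decode(cs or "us-ascii") if isinstance(b, bytes) else b
    for b, cs in parts
)
assert decoded == "\x00"  # a string terminator for C-based components

# By contrast, malformed UTF-8 has one well-defined outcome: the overlong
# two-byte encoding of NUL (0xC0 0x80) is simply invalid and is rejected.
try:
    b"\xc0\x80".decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True
assert utf8_rejected
```

The point of the sketch: the RFC 2047 path hands each implementation a decoded string whose handling (truncate at NUL, reject, pass through) varies, whereas the UTF-8 failure mode is pinned down by the encoding's own validity rules.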
Received on Friday, 28 March 2008 14:18:00 UTC