- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 16 Jan 2009 07:09:36 +0000 (UTC)
- To: Maciej Stachowiak <mjs@apple.com>
- Cc: Boris Zbarsky <bzbarsky@MIT.EDU>, public-html <public-html@w3.org>
On Thu, 14 Aug 2008, Maciej Stachowiak wrote:
> On Aug 14, 2008, at 1:33 PM, Ian Hickson wrote:
> > On Wed, 13 Aug 2008, Boris Zbarsky wrote:
> > > >
> > > > I don't understand the security risk. Could you elaborate on what
> > > > the threat is?
> > >
> > > The obvious threat is that someone writes (or wrote awhile back)
> > > something, tests (or tested) in their browser, it doesn't render as
> > > HTML (or didn't back when they tested), then we render it as HTML.
> > >
> > > Obvious examples that come up are image types in IE, or a whole slew
> > > of stuff in Netscape 4 (think old site that no one has bothered to
> > > update, and yes such things still exist: we get people complaining
> > > that they can't document.open('application/postscript') in current
> > > Gecko).
> >
> > Fair enough.
> >
> > The risk of implementing this as Firefox does, of course, is lack of
> > compatibility with pages that are expecting HTML handling. To gain
> > some level of compatibility we have to, at a minimum, strip leading
> > and trailing space characters, and ignore any content after the first
> > semicolon.
> >
> > Now the question is, are other browser vendors willing to change to
> > this?
> >
> > I've changed the spec for now, but I would really appreciate
> > confirmation from WebKit, Opera, and IE representatives that this
> > change is one that the majority of browser vendors are willing to
> > implement.
>
> WebKit doesn't match either Firefox or IE currently (we always use
> text/html as you said). I would prefer to go with the IE behavior or
> something close to it. I think the security risk of defaulting unknown
> types to text/html is very small. There may be sites that have not been
> updated since the Netscape 4 days, but it's unlikely any have enough
> regular users to be targeted by security attacks. On the other hand, it
> seems the compatibility risk is real, since Firefox must do trickier
> parsing to catch some types that must indeed be treated as text/html.
>
> Admittedly, this opinion is not informed by extensive testing.

I tried reverse-engineering what IE does here, but I lost patience with the
weird behavior I was seeing before I managed to get a coherent picture, so
I left the spec as is (more or less matching Gecko's behavior).

As far as I can tell, IE does an ASCII case-insensitive comparison against
the string "text/plain", without trimming spaces or doing anything with
semicolons. If it finds a match, it does the PLAINTEXT thing. Otherwise it
does the HTML thing, unless the type is a known image/* type, in which case
it throws an exception.

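As a rough illustration only, the IE behavior described above might be sketched like this in Python. The function name, the return values, the use of ValueError, and the set of "known" image types are all assumptions made for the example; the observations above do not say which image types IE recognizes, what exception it throws, or whether the image check is itself case-insensitive.

```python
# Hypothetical set: the description only says "a known image/* type".
KNOWN_IMAGE_TYPES = frozenset({"image/gif", "image/jpeg", "image/png"})


def ascii_lower(s: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ie_document_open_mode(type_arg: str) -> str:
    """Sketch of the observed IE handling of document.open(type)."""
    # No trimming and no semicolon handling: the whole string is compared.
    folded = ascii_lower(type_arg)
    if folded == "text/plain":
        return "plaintext"
    # Assumption: the image check uses the same case folding.
    if folded in KNOWN_IMAGE_TYPES:
        # Placeholder for whatever exception IE actually throws.
        raise ValueError("image types are rejected: " + type_arg)
    return "html"
```

Under this sketch, "text/plain; charset=utf-8" and " text/plain" both fall through to HTML handling, because neither is an exact match.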
The spec behavior is to drop anything after a semicolon, trim spaces, and
do a case-insensitive match against "text/html". If it finds a match, it
does the HTML thing. Otherwise it does the PLAINTEXT thing.
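For comparison, here is a minimal sketch of the spec behavior just described, again in Python. The function name and return values are made up for the example, and treating "spaces" as the five ASCII whitespace characters (space, tab, LF, FF, CR) is an assumption, as is doing the comparison with ASCII-only case folding.

```python
# Assumption: "spaces" means space, tab, line feed, form feed, carriage return.
SPACE_CHARACTERS = " \t\n\f\r"


def ascii_lower(s: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def spec_document_open_mode(type_arg: str) -> str:
    """Sketch of the spec handling of document.open(type)."""
    # Drop the first semicolon and everything after it.
    essence = type_arg.split(";", 1)[0]
    # Trim leading and trailing space characters.
    essence = essence.strip(SPACE_CHARACTERS)
    # Case-insensitive match against "text/html"; anything else is plain text.
    if ascii_lower(essence) == "text/html":
        return "html"
    return "plaintext"
```

With this sketch, " Text/HTML ; charset=utf-8" is handled as HTML, while "application/postscript" and any other unrecognized type are handled as plain text.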
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 16 January 2009 07:10:13 UTC