
Re: Opera reparses as HTML when XML parse fails

From: Henri Sivonen <hsivonen@iki.fi>
Date: Fri, 16 Dec 2011 18:08:06 +0200
Message-ID: <CAJQvAueiH1E_F4L0TQtRbS5dyPPo9a2NkH1gy5UGGgQ2rv+7yg@mail.gmail.com>
To: Noah Mendelsohn <nrm@arcanedomain.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>, Sam Ruby <rubys@intertwingly.net>, Norm Walsh <ndw@nwalsh.com>
On Wed, Dec 14, 2011 at 5:55 AM, Noah Mendelsohn <nrm@arcanedomain.com> wrote:
> Now, application/xhtml+xml content would not give any trouble if the XHTML
> code on these sites was well-formed, but unfortunately, mistakes are easily
> made..."

I think the solution Opera has chosen is a bad one. For text/html, we
went through a lot of trouble to change things so that an attacker who
is able to force a premature end of the input stream can't trigger
reparsing that would lead to parts of the page that weren't meant to
be scripts to be interpreted as scripts.

AFAICT, Opera's solution introduces such dangerous reparsing to
application/xhtml+xml. Also, reparsing is bad for performance and can
cause the side effects of scripts to happen twice.
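
As a rough illustration of the truncation risk described above, here is a
minimal Python sketch. It is not Opera's implementation: the function
scripts_with_html_fallback, the ScriptCollector class and the sample
document are all hypothetical, and Python's xml.etree and html.parser
modules merely stand in for a browser's XML and HTML parsers. The sample
hides script-like text inside an XML processing instruction, which an XML
parser treats as inert data up to "?>", while html.parser ends the same
construct at the first ">".

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element seen as HTML."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Data may arrive in several chunks, so append to the current script.
        if self.in_script:
            self.scripts[-1] += data


def scripts_with_html_fallback(source):
    """Hypothetical fallback policy: strict XML first, HTML reparse on error.

    Returns (mode, scripts), where scripts is the list of script bodies
    that a browser following this policy would go on to execute.
    """
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        # The dangerous step: the same bytes are reinterpreted as HTML.
        collector = ScriptCollector()
        collector.feed(source)
        collector.close()
        return "html", collector.scripts
    return "xml", [el.text or "" for el in root.iter("script")]


# Well-formed document: the script-like text is only processing-instruction
# data, so the XML parse finds no script element at all.
COMPLETE = (
    "<html><body>"
    "<?note > <script>evil()</script> ?>"
    "<p>hi</p>"
    "</body></html>"
)

# An attacker who can cut the stream short removes the closing tags,
# which makes the document ill-formed and triggers the HTML reparse.
TRUNCATED = COMPLETE[: COMPLETE.index("</body>")]

if __name__ == "__main__":
    print(scripts_with_html_fallback(COMPLETE))
    print(scripts_with_html_fallback(TRUNCATED))
```

With these inputs, the complete document parses as XML and yields no
scripts, whereas the truncated one falls back to HTML and yields a script
whose body is "evil()". Real browsers differ in detail, but the shape of
the problem is the same: content that was inert under one parser becomes
executable under the other once a parse failure can be forced.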

I think non-Draconian non-backtracking parsing rules for XML (such as
XML5) would have been a better solution.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Friday, 16 December 2011 17:07:11 GMT
