
Re: E4H and constructing DOMs

From: Mike Samuel <mikesamuel@gmail.com>
Date: Fri, 8 Mar 2013 00:07:54 -0500
Message-ID: <CACod6GttU9jFLwV4NJn=iFRNYK7aqo0ZgegD1MTCmP5MdO4Kwg@mail.gmail.com>
To: Maciej Stachowiak <mjs@apple.com>
Cc: "Mark S. Miller" <erights@google.com>, Jonas Sicking <jonas@sicking.cc>, "public-script-coord@w3.org" <public-script-coord@w3.org>
2013/3/7 Maciej Stachowiak <mjs@apple.com>:
> https://code.google.com/p/google-caja/issues/detail?id=1670
> I strongly suspect there are more bugs than the one I found, as the regexp
> looks way too simple to capture the full behavior of the relevant HTML
> tokenizer states. Regrettably I do not have the time or expertise to hunt
> for more.

Here's the context


It's used in a function that strips out tags from a string before '<'
and '>' are escaped with '&lt;' and '&gt;'.

This is so that accidental inclusion of a string of "known-safe HTML"
(not an untrusted input) in the value of an HTML attribute doesn't
cause tags to appear in, e.g., title hover text.  This code is not part of
the trusted computing base (TCB).
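
As a rough illustration of the strip-then-escape order described above, here is a minimal sketch. It is not the Caja code: the regexp is a deliberately naive stand-in, and the names `NAIVE_TAG`, `stripTags`, `escapeAngleBrackets`, and `htmlToAttributeText` are hypothetical. It only handles '<' and '>', as in the description; escaping of '&' and quotes is assumed to happen elsewhere.

```typescript
// Hypothetical sketch, not the Caja implementation.
// A deliberately simple pattern: "<", optional "/", a letter, then
// everything up to the next ">". It does not model the HTML tokenizer.
const NAIVE_TAG = /<\/?[A-Za-z][^>]*>/g;

// Step 1: drop anything that looks like a tag.
function stripTags(html: string): string {
  return html.replace(NAIVE_TAG, "");
}

// Step 2: escape whatever angle brackets are left over.
function escapeAngleBrackets(text: string): string {
  return text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Convert a string of known-safe HTML into text suitable for an
// attribute value such as title, so tags do not show up as hover text.
function htmlToAttributeText(knownSafeHtml: string): string {
  return escapeAngleBrackets(stripTags(knownSafeHtml));
}

console.log(htmlToAttributeText("<b>Hello</b>, world")); // Hello, world
console.log(htmlToAttributeText('a <img alt=">"> b'));   // a "&gt; b
```

The second call shows the simple pattern ending a tag too early, at the '>' inside the quoted attribute value. Because escaping runs after stripping, the leftover '>' comes out as '&gt;', so the worst case in this sketch is stray text in the attribute, not a live tag.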

I suspect there are other bugs too, as there always are in software, and
as there will be in any AST solution as well.
Received on Friday, 8 March 2013 05:08:21 UTC
