- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 4 Jun 2012 22:47:43 +0000 (UTC)
- To: Rafael Weinstein <rafaelw@google.com>
- cc: Webapps WG <public-webapps@w3.org>
On Fri, 25 May 2012, Rafael Weinstein wrote:
>
> Now's the time to raise objections to UA's adding support for this
> feature.
For the record, I very much object to Document.parse(). I think it's a
terrible API. We should IMHO resolve the use case of "generate a DOM tree
from script" using a much more robust solution that has compile-time
syntax checking and so forth, rather than relying on the super-hacky
"concatenate a bunch of strings and then parse them" solution that authors
are forced to use today.
innerHTML and document.write() are abominations unto computer science, and
we are doing nobody any favours by continuing the platform down this road.
They lead to programming styles that are rife with injection bugs (XSS),
they are extremely difficult to debug and maintain, and they are terribly
complicated to implement compared to more structured alternatives. The
core reasons for these problems, IMHO, are two-fold:
1. Lack of compile-time syntax checking, which leads to typos not being
caught and thus programmer intent not being faithfully represented,
and
2. Putting markup syntax and data at the same level, instead of
separating them as with other features in JS.
For example, this kind of bug is easy to introduce and hard to spot or
debug:
var heading = '<h1>Hello</h1>';
// ...
div.innerHTML = '<h1>' + heading + '</h1>';
Even worse are things like typos:
tr.innerHTML = '<td>' + c1 + '</td><td>' + c2 + '</td><dt>' + c3 + '</td>';
Compile-time syntax checking makes this a non-issue. Making data variables
qualitatively different from the syntax also solves problems such as:
var title = "I hate </p> tags.";
// ...
div.innerHTML = '<p>Today\'s topic is: ' + title + '</p>'; // oops, not escaped
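To make this last case concrete without a browser, the sketch below builds the same string and shows the "</p>" from the data ending up as markup, followed by the manual escaping step that string concatenation forces authors to remember. escapeText() is a hypothetical helper written for this illustration, not a platform API.

```javascript
// Hypothetical helper (not a platform API): escape the characters that
// matter when text is placed inside HTML element content.
function escapeText(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

var title = "I hate </p> tags.";

// What the example above produces: the data closes the paragraph early.
var unsafe = "<p>Today's topic is: " + title + "</p>";

// What the author has to remember to write instead, at every call site.
var safe = "<p>Today's topic is: " + escapeText(title) + "</p>";

console.log(unsafe);
console.log(safe);
```

Nothing in the language flags the first form as wrong; both are just strings.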
There have been several alternative proposals; my personal favourite is
Anne's E4H solution, basically E4X but simplified just for HTML, which
I've written a strawman spec for here:
http://www.hixie.ch/specs/e4h/strawman
I'm happy to write a more serious spec for this if this is something
anyone is interested in implementing. The above examples become much
easier to debug. In the first one, the tags show up as very ugly literal text
in the output of the page, rather than merely as weird spacing:
var heading = '<h1>Hello</h1>';
// ...
div.appendChild(<h1>{heading}</h1>);
The second results in a compile-time syntax error so would be caught even
before the code is reviewed:
tr.appendChild(<><td>{c1}</td><td>{c2}</td><dt>{c3}</td></>);
The third becomes a non-issue because you don't need to escape text to
prevent it from being mistaken for markup [1]:
var title = "I hate </p> tags.";
// ...
div.appendChild(<p>Today's topic is: {title}</p>);
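The E4H snippets above rely on syntax from a strawman, so they are not expected to run in a current engine. The property they depend on, that a value in {braces} is inserted as text and never parsed as markup, can be approximated with plain function calls. A minimal sketch, assuming hypothetical helpers el() and serialize() and a toy object tree standing in for the real DOM:

```javascript
// Hypothetical helpers for illustration only; this is neither E4H nor a DOM API.
function el(name, ...children) {
  return {
    name: name,
    // Strings are wrapped as text nodes; they are never parsed as markup.
    children: children.map(function (c) {
      return typeof c === 'string' ? { text: c } : c;
    }),
  };
}

// Turn the toy tree back into markup, escaping text nodes on the way out.
function serialize(node) {
  if ('text' in node) {
    return node.text
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
  }
  return '<' + node.name + '>' +
    node.children.map(serialize).join('') +
    '</' + node.name + '>';
}

var heading = '<h1>Hello</h1>';
var title = 'I hate </p> tags.';

// First example: the stray tags become visible text, not a nested heading.
console.log(serialize(el('h1', heading)));

// Third example: the data cannot close the paragraph.
console.log(serialize(el('p', "Today's topic is: ", title)));
```

This only mirrors the data/markup separation. The compile-time check on the second example needs real literal syntax, which a function call cannot provide.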
Other proposed solutions include Element.create(), which is less verbose
than the DOM but still more verbose than innerHTML or E4H; and
quasistrings, which still suffer from lack of compile-time checking and
mix markup with data, but at least would be more structured than raw
strings and could offer better injection protection.
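For comparison, here is roughly what the quasistring approach looks like, using the tagged template literal syntax that ECMAScript 2015 later standardised. The tag name html and the helper escapeText() are assumptions made for this sketch, not part of any proposal quoted here.

```javascript
// Hypothetical helper: escape text for use in HTML element content.
function escapeText(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical tag function: `strings` holds the literal markup pieces,
// `values` holds the interpolated data, which is escaped before joining.
function html(strings, ...values) {
  return strings.reduce(function (out, piece, i) {
    return out + piece + (i < values.length ? escapeText(values[i]) : '');
  }, '');
}

var title = 'I hate </p> tags.';
var markup = html`<p>Today's topic is: ${title}</p>`;
console.log(markup);

// The markup pieces are still unchecked strings, so this typo gets through.
var c3 = 'x';
var row = html`<dt>${c3}</td>`;
console.log(row);
```

The data is protected, but the mismatched <dt>...</td> is accepted without complaint, which is the lack of compile-time checking mentioned above.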
[1] (This is not the same as auto-escaping strings in other contexts. For
example, E4H doesn't propose to have CSS literals, so a string embedded in
a style="" attribute wouldn't be automagically safe.)
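A small string-level sketch of why the style="" case is different: HTML escaping leaves a value like the one below untouched, because it contains no characters that are special in HTML, yet it still adds CSS declarations the author did not intend. escapeAttr() and the sample value are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: escape text for use in a double-quoted HTML attribute.
function escapeAttr(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Untrusted input meant to be just a colour name.
var color = 'red; position: fixed; top: 0; left: 0';

// HTML-safe, but the extra CSS declarations pass straight through.
var markup = '<p style="color: ' + escapeAttr(color) + '">hi</p>';
console.log(markup);
```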
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 4 June 2012 22:48:07 UTC