- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 4 Jun 2012 22:47:43 +0000 (UTC)
- To: Rafael Weinstein <rafaelw@google.com>
- cc: Webapps WG <public-webapps@w3.org>
On Fri, 25 May 2012, Rafael Weinstein wrote:
>
> Now's the time to raise objections to UA's adding support for this
> feature.

For the record, I very much object to Document.parse(). I think it's a
terrible API. We should IMHO resolve the use case of "generate a DOM tree
from script" using a much more robust solution that has compile-time
syntax checking and so forth, rather than relying on the super-hacky
"concatenate a bunch of strings and then parse them" solution that
authors are forced to use today.

innerHTML and document.write() are abominations unto computer science,
and we are doing nobody any favours by continuing the platform down this
road. They lead to programming styles that are rife with injection bugs
(XSS), they are extremely difficult to debug and maintain, and they are
terribly complicated to implement compared to more structured
alternatives.

The core reasons for these problems, IMHO, are two-fold:

1. Lack of compile-time syntax checking, which leads to typos not being
   caught and thus programmer intent not being faithfully represented,
   and

2. Putting markup syntax and data at the same level, instead of
   separating them as with other features in JS.

For example, this kind of bug is easy to introduce and hard to spot or
debug:

   var heading = '<h1>Hello</h1>';
   // ...
   div.innerHTML = '<h1>' + heading + '</h1>';

Even worse are things like typos:

   tr.innerHTML = '<td>' + c1 + '</td><td>' + c2 + '</td><dt>' + c3 + '</td>';

Compile-time syntax checking makes this a non-issue.

Making data variables be qualitatively different than the syntax also
solves problems, e.g.:

   var title = "I hate </p> tags.";
   // ...
   div.innerHTML = "<p>Today's topic is: " + title + "</p>"; // oops, not escaped

There have been several alternative proposals; my personal favourite is
Anne's E4H solution, basically E4X but simplified just for HTML, which
I've written a strawman spec for here:

   http://www.hixie.ch/specs/e4h/strawman

I'm happy to write a more serious spec for this if this is something
anyone is interested in implementing.

The above examples become much easier to debug. In the first, the mistake
shows up as very ugly literal markup in the output of the page rather
than as weird spacing:

   var heading = '<h1>Hello</h1>';
   // ...
   div.appendChild(<h1>{heading}</h1>);

The second results in a compile-time syntax error, so it would be caught
even before the code is reviewed:

   tr.appendChild(<><td>{c1}</td><td>{c2}</td><dt>{c3}</td></>);

The third becomes a non-issue because you don't need to escape text to
keep it from being mistaken for markup [1]:

   var title = "I hate </p> tags.";
   // ...
   div.innerHTML = <p>Today's topic is: {title}</p>;

Other proposed solutions include Element.create(), which is less verbose
than the DOM but still more verbose than innerHTML or E4H; and
quasistrings, which still suffer from lack of compile-time checking and
mix markup with data, but at least would be more structured than raw
strings and could offer better injection protection.

[1] (This is not the same as auto-escaping strings in other contexts. For
example, E4H doesn't propose to have CSS literals, so a string embedded
in a style="" attribute wouldn't be automagically safe.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
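
As a rough illustration of the "keep data separate from markup" point in the third example, here is a minimal sketch in plain JavaScript. It does not use E4H syntax or any real DOM API: el(), escapeText() and serialize() are hypothetical helpers invented for this sketch, and the node representation is just a plain object. It only shows that when strings enter the tree as data, a "</p>" inside the title cannot close the paragraph.

```javascript
// Hypothetical helpers for illustration only; not part of E4H or the DOM.

// Build a plain-object node. String children are stored as data and are
// never parsed as markup.
function el(tag, ...children) {
  return { tag: tag, children: children };
}

// Escape text so it cannot be mistaken for markup when serialized.
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Serialize a node tree to an HTML string. Only `tag` values become markup;
// every string child goes through escapeText().
function serialize(node) {
  if (typeof node === 'string') return escapeText(node);
  return '<' + node.tag + '>' +
         node.children.map(serialize).join('') +
         '</' + node.tag + '>';
}

var title = 'I hate </p> tags.';
var tree = el('p', "Today's topic is: ", title);

console.log(serialize(tree));
// prints: <p>Today's topic is: I hate &lt;/p&gt; tags.</p>
```

Unlike E4H, this sketch gives no compile-time checking of the markup itself; it only covers the second of the two problems listed in the message.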
Received on Monday, 4 June 2012 22:48:07 UTC