- From: Joshua Cranmer <Pidgeot18@verizon.net>
- Date: Mon, 10 Nov 2014 12:21:42 -0600
- To: public-htmail@w3.org
On 11/10/2014 6:45 AM, chaals@yandex-team.ru wrote:
> One of the things they want to do before finishing it is describe how HTML gets cleaned up for security before pasting into a random page. This may or may not be similar to the things that are removed from mail when it is e.g. presented in Webmail for security reasons.
>
> I don't expect to get a copy of everyone's security policies in detail, but I think it would be useful to at least list common things that are "removed" for security purposes, along with some explanation of the reason.

HTML sanitization I would presume is usually implemented on a whitelist basis, particularly in email (which tends to be far more conservative).

> For example I presume that more or less everyone takes out javascript "eval" statements, because there is no way to automatically check that they will do no harm.

The client I work on (Thunderbird) disabled even the ability to enable JavaScript several years ago when we stopped trusting the sandboxing of JS execution [1]. I am unaware of any other client that ever attempted to support JavaScript in email in the first place (which is why we dropped support instead of trying to fix sandboxing or even let the user shoot themselves in the foot).

In general, JavaScript cannot be statically sanitized with any degree of precision. I can think of at least three distinct ways to get something akin to eval, and the ability to access x.foo via x['foo'] renders precision equivalent to the halting problem. Not to mention the ways in which you can dynamically inject more JavaScript, which makes static sanitization without dynamic sandboxing treacherous.

> Would it be good to have a page to collect this in our wiki, or are people prepared to send at least some of the stuff to the mailing list (and a volunteer - I see one in the mirror - could start to gather them in a wiki)?

I will note that Thunderbird primarily relies on sandboxing rather than sanitization.
So features like SVG, MathML, even <audio> and <video> already work with no extra effort on our part! The sandbox unconditionally disables JavaScript and plugin execution; forms won't submit (but we erroneously render them); remote content loads (e.g., images, videos) are disabled by default, but the user can enable them on a per-message or per-sender basis.

We do have an option to sanitize HTML prior to display for paranoia purposes (and an option to enable that sanitization for spam messages, but I don't know if it's enabled by default).

[1] Coarse-grained details: all JS access to the DOM used to go through a single dispatch point, where generic sandboxing policies could be easily applied. Since the single dispatch was slow as crap, the DOM accesses were rerouted, and the rerouting opted to not support a generic sandbox policy.

--
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
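
Illustration of the whitelist approach to sanitization mentioned in the message: a minimal sketch in Python using only the standard library's html.parser. The tag, attribute, and URL-scheme lists, and the names AllowlistSanitizer and sanitize, are assumptions made up for this example; they are not Thunderbird's policy or any other client's. Anything not explicitly listed is dropped, and the contents of <script> and <style> are discarded along with the tags.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; not any real client's policy.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "a",
                "ul", "ol", "li", "blockquote"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = ("http:", "https:", "mailto:")
DROP_CONTENT = {"script", "style"}  # drop these tags and the text inside them
VOID_TAGS = {"br"}                  # allowed tags that take no end tag


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            # Unknown attributes (onclick, style, src, ...) are dropped.
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Only explicitly allowed URL schemes survive; javascript: does not.
            if name == "href" and not value.strip().lower().startswith(ALLOWED_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>'))
```

Because the output is rebuilt from parsed tokens rather than by deleting "bad" substrings from the input, unknown markup fails closed. The sketch does not re-balance mismatched tags or handle CSS, so it is not a substitute for a maintained sanitizer or for sandboxing.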
Received on Monday, 10 November 2014 18:22:26 UTC