- From: David Ross <drx@google.com>
- Date: Sat, 23 Jan 2016 00:43:42 -0800
- To: Jim Manico <jim.manico@owasp.org>
- Cc: Michal Zalewski <lcamtuf@coredump.cx>, Chris Palmer <palmer@google.com>, Crispin Cowan <crispin@microsoft.com>, Craig Francis <craig.francis@gmail.com>, Conrad Irwin <conrad.irwin@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Ack, I have to admit my first sentence kind of mis-characterizes Jim's position a bit. Just to clarify: I understand that Jim isn't advocating for the use of a big list of tags, etc. in the sanitizer or its configuration. Sorry about that!

Dave

On Sat, Jan 23, 2016 at 12:27 AM, David Ross <drx@google.com> wrote:
> Ok, I see that you're saying there would be less maintenance because the big list of hundreds of known-good tags, attributes, and CSS would be punted from sanitizer defaults to instead being specified in configuration. But this doesn't mean that there is no list to manage _somewhere_. It just changes the party responsible for managing the list. I'd argue that the sanitizer itself is in the best position to get this right, which is why jSanity maintains a list of known-good tags, attributes, and CSS properties.
>
> I also believe you're saying that this provides more strict validation because the consumer of the sanitizer would supply just a short list of tags to whitelist. I think in practice sanitizer consumers very often require a baseline configuration that allows a broad set of tags, attributes, and CSS properties that are incontrovertibly safe. That can be a big list, and again it's something that the sanitizer would be in the best position to manage properly.
>
> Can you imagine stumbling across one of these sanitizer configurations in a pentest? Every configuration would be different, and surely some would have gotten bloated with various tags and attributes over time. What a goldmine for bugs!
>
> I agree that if users required a basic sanitizer that only let a few things through, you could take this approach and avoid hardcoding big lists that require maintenance. But then I think that type of sanitizer would only have the relative advantages of low maintenance and strict validation in the use cases that don't require robust markup. In other cases it would tend to create more of a problem than it would solve. I also don't see that it would be advantageous to build this type of sanitizer into the browser -- a tiny JavaScript library should work fine.
>
> Dave
>
> On Fri, Jan 22, 2016 at 5:14 PM, Jim Manico <jim.manico@owasp.org> wrote:
>>> Can you get a little more specific about what you're suggesting?
>>
>> Something along the lines of....
>>
>> sanitize(rawHTML, policy);
>>
>> Which would be called like the following, but with a better policy mechanism:
>>
>> coolwidget.innerHTML = sanitize(rawHTML, "<b>, <i>, <a>");
>>
>> "One of your own" created an HTML Sanitizer that has a much more fully featured policy rule mechanism that you can check out here: https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project#tab=Creating_a_HTML_Policy
>>
>> My conjecture is that once this is working properly (which is rough) it will require a lot less maintenance when new markup features are added.
>>
>> But make no mistake, one of the reasons I support doing both is because the simplicity of what you are doing for developers is compelling. But I think stricter validation like I am suggesting is valuable as well.
>>
>> Aloha,
>> Jim
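For illustration, a minimal JavaScript sketch of the builder-style policy mechanism being described here, loosely analogous to the policy builder in the OWASP Java HTML Sanitizer linked above; the SanitizerPolicy builder and its method names are hypothetical, not an existing browser or library API:

    // Hypothetical builder-style policy: the application supplies only
    // the short whitelist it needs, rather than relying on large defaults.
    const policy = new SanitizerPolicy()
      .allowElements('b', 'i', 'a')
      .allowAttributes('href', { onElements: ['a'] })
      .allowUrlProtocols('https');

    // Anything not explicitly whitelisted by the policy is stripped.
    coolwidget.innerHTML = sanitize(rawHTML, policy);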
>> On 1/22/16 8:03 PM, David Ross wrote:
>>>> Is my concern that your policy-sandbox would need constant updating as new browser features were added a fair concern?
>>>
>>> Any sanitizer needs some ongoing level of maintenance already today. A lot of that is just to add support for (whitelist) new browser features, and then to backtrack a bit if that turns out not to have been such a good idea. =) When you've got a sanitizer written in C++ and baked into a browser, updating that sanitizer in this way might be even more burdensome.
>>>
>>> In the case of Safe Node, we would _not_ generally make one-off changes to tweak the code to add or remove support for new elements, attributes, etc. For any new feature, the question would be this: walking down the list of Safe Node enforced policies, would the new feature subvert any of them? If so, _and_ the new feature doesn't leverage existing building blocks that are already regulated by policy, _then_ there needs to be additional policy enforcement put in place. So I think that an implementation of Safe Node would require less ongoing maintenance than a sanitizer baked into the browser.
>>>
>>>> Do you think supporting some kind of HTML policy engine like I'm suggesting is valid at all?
>>>
>>> Can you get a little more specific about what you're suggesting?
>>>
>>> Dave
>>>
>>> On Fri, Jan 22, 2016 at 4:43 PM, Jim Manico <jim.manico@owasp.org> wrote:
>>>>> and certainly it's no more blacklist-based than a sanitizer
>>>>
>>>> Hmmm. My thinking was "David's proposal is going to disable certain features. HTML sanitizers only try to enforce good tags without needing any knowledge of the bad stuff". That is why I think of your work as "blacklist" and HTML sanitizers as "whitelist".
>>>>
>>>> Anyhow, it sure was an Edge-ie case! Thank you for catching my lame pun. I know this is going to hurt you to hear it, but IE and Edge matter. I'm glad to know your proposal would have caught this.
>>>>
>>>> Is my concern that your policy-sandbox would need constant updating as new browser features were added a fair concern?
>>>>
>>>> Do you think supporting some kind of HTML policy engine like I'm suggesting is valid at all?
>>>>
>>>> Aloha,
>>>> Jim
>>>>
>>>> On 1/22/16 6:35 PM, David Ross wrote:
>>>>
>>>> I would not characterize it as blacklist-based, and certainly it's no more blacklist-based than a sanitizer.
>>>>
>>>>> What about CSS expressions and other edge cases not described in http://lcamtuf.coredump.cx/postxss/ ?
>>>>
>>>> It's covered by this policy:
>>>> * Disablement of script / active content
>>>>
>>>> Also, was that a pun? Because CSS expressions are an Edge case. =)
>>>>
>>>> On Fri, Jan 22, 2016 at 3:28 PM, Jim Manico <jim.manico@owasp.org> wrote:
>>>>>
>>>>> Again, I am reading your proposal right now, but this looks a little blacklist-ish to me. What about CSS expressions and other edge cases not described in http://lcamtuf.coredump.cx/postxss/ ? There are more out there, per my understanding....
>>>>>
>>>>> The reason I prefer more programmatic sanitization is that it's a whitelist, which tends to be a stronger control. Once a good sanitization API is built, it will stand the test of time as new browser features are added.
>>>>>
>>>>> An approach that just bans bad things will be way more fragile as new browser features get added over time.
>>>>>
>>>>> - Jim
>>>>>
>>>>> On 1/22/16 6:17 PM, David Ross wrote:
>>>>>>> There is a handful of examples where the rigidity basically ruled out adoption (e.g., MSIE's old <iframe> sandbox).
>>>>>>
>>>>>> This: https://msdn.microsoft.com/en-us/library/ms534622(v=vs.85).aspx
>>>>>> It came in for Hotmail, but it was never put to use AFAIK, exactly for the reason you describe.
>>>>>>
>>>>>> There is a finite list of "unsafe" things that markup / CSS can do when rendered on a page. (Essential reference, of course: http://lcamtuf.coredump.cx/postxss/) It is possible there are a couple of things missing from the initial list of Safe Node policies requiring enforcement. (E.g.: link targeting is covered, but we probably also need a way to regulate navigation more generally.) But the problem is tractable. And I don't think that sanitization baked into the browser provides a better approach in this regard.
>>>>>>
>>>>>> Another key thing here is that with either a sanitizer or Safe Node, it's important to pick a good set of secure defaults. That way the policy problems Michal described are less likely to occur, as custom configuration tends to be minimal. With the sandbox attribute for frames, I think the use cases vary to such an extent that it would have been hard to set secure defaults. E.g.: allow-scripts and allow-same-origin are OK independently, but not when combined. There's no safe default there because there are many use cases for either approach. I don't see that Safe Node policies interfere with each other in this way, and so we probably dodged this bullet.
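To illustrate the allow-scripts / allow-same-origin interaction mentioned above, a minimal JavaScript sketch (the framed URL is hypothetical):

    // allow-scripts alone: the framed page can run script, but is treated
    // as a unique origin and cannot reach back into the embedding page.
    const frame = document.createElement('iframe');
    frame.setAttribute('sandbox', 'allow-scripts');
    frame.src = '/untrusted-content.html'; // hypothetical path
    document.body.appendChild(frame);

    // Combining the two flags on same-origin content defeats the sandbox:
    // the framed script can reach the parent document and simply remove
    // the sandbox attribute.
    // frame.setAttribute('sandbox', 'allow-scripts allow-same-origin');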
>>>>>> Jim said:
>>>>>>> I have an aversion to different policy packages not being flexible enough to be useful.
>>>>>>
>>>>>> FWIW, as per earlier in the thread, the Safe Node approach addresses scenarios around CSS where _sanitization_ is inflexible. (Caveat: if a sanitizer is baked into the browser, all of a sudden it can pursue the same approach.)
>>>>>>
>>>>>>> Perhaps support both of these approaches? HTML programmatic sanitization and several pre-built policies? That would provide both ease of use for some, and deep flexibility for others. Win win win, and win?
>>>>>>
>>>>>> My argument is that Safe Node has advantages relative to sanitization baked into the browser. If you can identify a legit use case that Safe Node can't support cleanly, but browser-based sanitization does, I'd probably jump right back on the sanitization bandwagon. I wrote a client-side sanitizer not that long ago and I enjoy working on them. =)
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> On Fri, Jan 22, 2016 at 2:40 PM, Jim Manico <jim.manico@owasp.org> wrote:
>>>>>>>
>>>>>>> Thank you Michal. I'll give David's proposal a closer read and comment shortly.
>>>>>>>
>>>>>>> I remember Microsoft and their AntiXSS library providing an HTML Sanitizer API for untrusted HTML input. It was one of the first in any major language or framework. The first version was very permissive and useful, but unfortunately was vulnerable to HTML hacking and of course XSS. The latest incarnation was fixed to be very secure, but unfortunately was not at all useful because it was so restrictive. And MS is now deprecating it with no commitment to maintain it.
>>>>>>>
>>>>>>> I have an aversion to different policy packages not being flexible enough to be useful.
>>>>>>> But I will give David's proposal a deeper read and provide comments more specific to his proposal.
>>>>>>>
>>>>>>> Perhaps support both of these approaches? HTML programmatic sanitization and several pre-built policies? That would provide both ease of use for some, and deep flexibility for others. Win win win, and win?
>>>>>>>
>>>>>>> Aloha,
>>>>>>> Jim
>>>>>>>
>>>>>>> On 1/22/16 5:29 PM, Michal Zalewski wrote:
>>>>>>>>> The need to inject untrusted markup into the DOM comes up all the time and is critical (WYSIWYG editors, etc.). But any "safe node" that limits what can render and execute will limit innovation. Each developer needs to support a different markup subset for their app, which is why policy-based sanitization is so critical to this use case.
>>>>>>>>>
>>>>>>>>> Take a look at Caja JS's sanitizer, Angular's $sanitize, and other JS-centric HTML sanitizers. They all allow the developer to set a policy of what tags and attributes should be supported, and all other markup gets stripped out.
>>>>>>>>>
>>>>>>>>> This is the kind of native defensive pattern we need in JavaScript, IMO!
>>>>>>>>
>>>>>>>> I think there are interesting trade-offs, and I wouldn't be too quick to praise one approach over the other. If you design use-centric "policy packages" (akin to what's captured in David's proposal), you offer safe and consistent choices to developers. The big unknown is whether the policies will be sufficiently flexible and future-proof - for example, will there be some next-gen communication app that requires a paradigm completely different from discussion forums or e-mail?
>>>>>>>>
>>>>>>>> There is a handful of examples where the rigidity basically ruled out adoption (e.g., MSIE's old <iframe> sandbox).
>>>>>>>>
>>>>>>>> The other alternative is the Lego-style policy-building approach taken with CSP. Out of the countless number of CSP policies you can create, most will have inconsistent or self-defeating security properties, and building watertight ones requires a fair amount of expertise. Indeed, most CSP deployments we see today probably don't provide much in terms of security. But CSP is certainly a lot more flexible and future-proof than the prepackaged approach.
>>>>>>>>
>>>>>>>> At the same time, treating flexibility as a goal in itself can lead to absurd outcomes, too: a logical conclusion is to just provide programmatic hooks for flexible, dynamic filtering of markup, instead of any static, declarative policies. One frequently-cited approach here was Microsoft's Mutation-Event Transforms [1], and I don't think it was a step in the right direction (perhaps except as a finicky building block for more developer-friendly sanitizers).
>>>>>>>>
>>>>>>>> [1] http://research.microsoft.com/en-us/um/people/livshits/papers/pdf/hotos07.pdf
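To illustrate the point above about self-defeating CSP policies, a hedged generic example (not drawn from any real deployment): a policy such as

    Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' https:

looks restrictive, but 'unsafe-inline' re-enables injected inline script and the bare https: source allows script from any HTTPS origin, so it offers little protection against XSS. A watertight variant would look more like

    Content-Security-Policy: default-src 'none'; script-src 'self'; style-src 'self'; img-src 'self'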
Received on Saturday, 23 January 2016 08:44:34 UTC