- From: Brad Hill <hillbrad@gmail.com>
- Date: Thu, 10 Oct 2013 14:48:18 -0500
- To: "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAEeYn8i+BBs4bpXC1Fj5RD7xcJfGB2vvKBCHyemPuzMdRaS6TQ@mail.gmail.com>
<hat = editor>

One of the open issues on UISecurity raised by Adam Barth is that the input-protection heuristic is not well-suited to browsers that use compositing to accelerate page rendering with GPUs. While this heuristic is non-normative, Adam suggested that we should supply a heuristic for this model.

I've attempted to grok these browser internals this week, at least for Blink and WebKit, and below is a first attempt at such a heuristic. I would definitely appreciate review from anyone who feels qualified, or is willing to forward it to someone who is. I'm confident that despite my best efforts I have somewhere confused what happens in layout vs. draw vs. paint vs. composite vs. render.

Thanks,

Brad

*Alternate Input Protection Heuristic for Multi-Layer Compositing*

Some user agents, in order to improve performance by taking advantage of specialized graphics hardware, use a strategy for hit testing and delivering UI events to hardware-composited layers to which the basic heuristic does not apply well. This alternative *non-normative* heuristic describes one possible implementation strategy for the input-protection directive in this architecture.

GPU-optimized user agents typically separate the browser UI process from the process that handles building and displaying the visual representation of the resource. (In this context the term "process" refers to any encapsulated subunit of user-agent functionality that communicates with other similar subunits through message passing, without implying any particular implementation details such as locality to a thread, OS-level "process" or the like.)

It is typical for the browser UI process to receive user events such as mouse clicks and then marshal these to the render process, where the event is hit tested through the page's DOM, checking for event handlers along the way. As an optimization, the render process may communicate hit test rectangles back to the UI process in advance so that the UI process can, e.g.,
immediately respond to a touch event by scrolling if the event target falls within coordinates for which there are no other registered handlers in the DOM.

A similar strategy can be used to create an implementation of the input-protection heuristic in a manner that is consistent with this multi-process, compositing architecture.

If a resource is being loaded in a frame, iframe, object, embed or applet context and specifies an input-protection directive, apply the following steps:

1. *Tracking hit test rects:* Hook the creation of event handlers for protected events and elements and add the DOM nodes with any such handler to a collection. After a layout occurs, or when an event handler is added or removed, iterate across all DOM nodes in the collection to generate a vector of rectangles to which such events must be marshaled. If input protection applies to the DOMWindow or Document node, avoid this expensive process of walking the renderers and simply use the view's bounds, as they're guaranteed to be inclusive.

2. *(Optionally) Put the protected areas into a backing store / composited layer:* To avoid the expense of having to re-layout and re-paint protected regions during the *obstruction check*, it may make sense to designate and place protected regions into their own backing store or composited layer, which can serve as a cached *control image*. Such a backing store should paint the entire content of the protected region for this purpose, even if it is clipped by the viewport.

3. *Hit testing in the compositor:* When an event is received, check whether it is on any layer and then walk the layer hierarchy, checking the protected regions on every layer. If there is a hit, continue the heuristic. Otherwise, exit this heuristic; event processing proceeds as normal.

4.
*Cursor sanity check:* By querying computed-style with the ":hover" pseudo-class on the element (if the target is plugin content) or on the host frame element and its ancestors (if the target is a nested document), check whether the cursor has been hidden or changed to a possibly attacker-provided bitmap. If it has, proceed to *Violation management*. This provides protection against "Phantom cursor" attacks, also known as "Cursorjacking".

5. *Timing check:* [I need some help here] Conceptually, we would like to track whenever a protected region must be *redrawn* (--show-property-changed-rects, I think, is the Blink concept) AND when the cause of that redraw originated from a different document context. We want to trigger the heuristic if an enclosing frame overlapped the protected region within the specified time interval, but we don't want to trigger it if redraws originate from within the same document context as the protected area (e.g. if the button itself has a mouseover animation). I'm really not sure how this part works or how to describe it in generic terms here. Can we propagate the source and timing of redraw triggers to the protected hit test rects or our collection of DOM nodes?

6. *Obstruction check:* Compare two sets of pixels. The *control image* is the protected area as it would appear if rendered alone, unobstructed by pixels originating from any other document context. (If the step 2 optimizations were performed, this should be readily available in its own composited layer.) The *user image* represents the same area as the *control image* in the outermost document's coordinate system and contains the final set of common pixels for the fully rendered page. These images are compared; if the differences are below the *tolerance* threshold associated with the input-protection directive, proceed to deliver the event normally. Otherwise, proceed to *Violation management*.
If portions of the *control image* are clipped by the root viewport in the outermost document's coordinate system, all such pixels must be considered not to match. *[I don't know enough to say whether the comparison can be done on the GPU without marshaling the pixels of the control image backing store back to software, or if this is even worth mentioning here…]*

Notes: applying protections to the entire document (if it would itself consist of multiple composited layers) or using the input-protection-clip property may make many of these optimizations impossible, and may impose performance penalties on the page, perhaps forcing it to fall back to all-software rendering. Should we have text warning authors of this, or should we simply remove those options from the spec and require input-protection-selectors only, to better match the internal strategies of modern browser rendering?
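
To make the obstruction check in step 6 a little more concrete, here is a minimal Python sketch. It is only an illustration, not how any browser does it: the function names, the representation of images as rows of RGBA tuples, the use of None for pixels clipped by the root viewport, the "fraction of mismatched pixels" metric, and the inclusive comparison against the tolerance are all assumptions of mine rather than anything the draft text above defines.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: an image is a list of rows of RGBA pixels.
Pixel = Tuple[int, int, int, int]
ControlImage = List[List[Pixel]]
# In the user image, None marks a pixel clipped by the root viewport.
UserImage = List[List[Optional[Pixel]]]


def obstruction_fraction(control: ControlImage, user: UserImage) -> float:
    """Return the fraction of control-image pixels that do not match.

    A pixel counts as a mismatch if the user image differs at that
    position, or if the pixel is clipped (None) in the user image.
    """
    if len(control) != len(user):
        raise ValueError("images must have the same number of rows")

    total = 0
    mismatched = 0
    for control_row, user_row in zip(control, user):
        if len(control_row) != len(user_row):
            raise ValueError("image rows must have the same width")
        for control_px, user_px in zip(control_row, user_row):
            total += 1
            # Clipped pixels are always treated as not matching.
            if user_px is None or user_px != control_px:
                mismatched += 1

    # An empty protected region has nothing that can be obstructed.
    return mismatched / total if total else 0.0


def should_deliver_event(control: ControlImage, user: UserImage,
                         tolerance: float) -> bool:
    """True if the event may be delivered, False for violation management.

    Assumption: a difference at or below the tolerance passes.
    """
    return obstruction_fraction(control, user) <= tolerance
```

With a 2x2 control image in which one pixel is overdrawn and one is clipped, obstruction_fraction reports half the pixels as mismatched, so the event would be delivered at a tolerance of 0.5 but handed to violation management at 0.25.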
Received on Thursday, 10 October 2013 19:48:46 UTC