- From: Brad Hill <hillbrad@gmail.com>
- Date: Mon, 14 Oct 2013 16:45:03 -0700
- To: David Lin-Shung Huang <linshung.huang@sv.cmu.edu>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAEeYn8iePYRbtinQGNW24LptvUNNNVff5nTuRkKvsBF3maWS_A@mail.gmail.com>
On Mon, Oct 14, 2013 at 3:38 PM, Brad Hill <hillbrad@gmail.com> wrote:

> So, there is no way to get the final rendering, even for the compositor
> thread managing the outermost document? :/ You can't read the pixels
> back from the GPU when you know you have a hit to a protected region?

Continuing to explore my own question... does the implementation of the
screen capture facility of getUserMedia() provide a re-usable primitive
that we can point to?

> , 2013 at 6:54 PM, David Lin-Shung Huang <linshung.huang@sv.cmu.edu> wrote:
>
>> Thanks, Brad! For what it's worth, here's my attempt to clarify which
>> parts of the original description in Section 6.2 are affected by
>> compositing:
>>
>> - For "timing attacks countermeasure" (the "Display Change List"),
>> compositing should have no impact. Essentially, we're keeping track of
>> the "damage rects" that already exist in the main thread of WebKit
>> (presumably the browser we're concerned with here).
>>
>> - For "cursor sanity check", compositing has no impact. (Cursors are
>> independent of screenshots.)
>>
>> - For "obstruction check", there are two distinct cases:
>> (1) When the "user image" is taken using OS-native APIs, compositing
>> should have no impact. (The OS can grab the final rendering.)
>> (2) When the "user image" is taken by the browser, obstruction checks
>> will fail (as Adam pointed out) since each browsing context has no
>> knowledge of the final rendering. [FIXME!]
>>
>> As a side benefit, I suspect that compositing might actually make
>> obstruction checks faster, because the "control image" could be
>> available as a cached layer (or tile), thus eliminating the need to
>> render an off-screen HTML5 canvas element.
>>
>> Thanks,
>> David
>>
>> On Thu, Oct 10, 2013 at 12:48 PM, Brad Hill <hillbrad@gmail.com> wrote:
>>
>>> <hat = editor>
>>>
>>> One of the open issues on UISecurity raised by Adam Barth is that the
>>> input-protection heuristic is not well-suited to browsers that use
>>> compositing to accelerate page rendering with GPUs. While this
>>> heuristic is non-normative, Adam suggested that we should supply a
>>> heuristic for this model.
>>>
>>> I've attempted to grok these browser internals this week, at least for
>>> Blink and WebKit, and below is a first attempt at such a heuristic. I
>>> would definitely appreciate review from anyone who feels qualified, or
>>> is willing to forward it to someone who is. I'm confident that despite
>>> my best efforts I have somewhere confused what happens in layout vs.
>>> draw vs. paint vs. composite vs. render.
>>>
>>> thanks,
>>>
>>> Brad
>>>
>>> *Alternate Input Protection Heuristic for Multi-Layer Compositing*
>>>
>>> Some user agents, in order to improve performance by taking advantage
>>> of specialized graphics hardware, use a strategy for hit testing and
>>> delivering UI events to hardware-composited layers to which the basic
>>> heuristic does not apply well. This alternative *non-normative*
>>> heuristic describes one possible implementation strategy for the
>>> input-protection directive in this architecture.
>>>
>>> GPU-optimized user agents typically separate the browser UI process
>>> from the process that handles building and displaying the visual
>>> representation of the resource. (In this context the term "process"
>>> refers to any encapsulated subunit of user-agent functionality that
>>> communicates with other similar subunits through message passing,
>>> without implying any particular implementation details such as
>>> locality to a thread, OS-level "process" or the like.) It is typical
>>> for the browser UI process to receive user events such as mouse clicks
>>> and then marshal these to the render process, where the event is hit
>>> tested through the page's DOM, checking for event handlers along the
>>> way. As an optimization the render process may communicate hit test
>>> rectangles back to the UI process in advance so that the UI process
>>> can, e.g., immediately respond to a Touch event by scrolling if the
>>> event target falls within coordinates for which there are no other
>>> registered handlers in the DOM. A similar strategy can be used to
>>> create an implementation of the input protection heuristic in a manner
>>> that is consistent with this multi-process, compositing architecture.
>>>
>>> If a resource is being loaded in a frame, iframe, object, embed or
>>> applet context and specifies an input-protection directive, apply the
>>> following steps:
>>>
>>> 1. *Tracking hit test rects:* Hook the creation of event handlers
>>> for protected events and elements and add the DOM nodes with any such
>>> handler to a collection. After a layout occurs, or when an event
>>> handler is added or removed, iterate across all DOM nodes to generate
>>> a vector of rectangles to which such events need to be marshaled. If
>>> the input-protection applies to the DOMWindow or Document node, avoid
>>> this expensive process of walking the renderers and simply use the
>>> view's bounds, as they're guaranteed to be inclusive.
>>>
>>> 2. *(Optionally) Put the protected areas into a backing store /
>>> composited layer:* To avoid the expense of having to re-layout and
>>> re-paint protected regions during the *obstruction check*, it may make
>>> sense to designate and place protected regions into their own backing
>>> store or composited layer which can serve as a cached *control image*.
>>> Such a backing store should paint the entire content of the protected
>>> region for this purpose, even if it is clipped by the viewport.
>>>
>>> 3. *Hit testing in the compositor:* When an event is received,
>>> check whether it is on any layer and then walk the layer hierarchy
>>> checking the protected regions on every layer. If there is a hit,
>>> continue the heuristic. Otherwise, exit this heuristic and event
>>> processing proceeds as normal.
>>>
>>> 4. *Cursor sanity check:* By querying computed-style with the
>>> ":hover" pseudo-class on the element (if the target is plugin content)
>>> or on the host frame element and its ancestors (if the target is a
>>> nested document), check whether the cursor has been hidden or changed
>>> to a possibly attacker-provided bitmap: if it has, proceed to
>>> *Violation management*. This provides protection against "Phantom
>>> cursor" attacks, also known as "Cursorjacking".
>>>
>>> 5. *Timing check:* [I need some help here] Conceptually, we would
>>> like to track whenever a protected region must be *redrawn*
>>> (--show-property-changed-rects, I think, is the Blink concept) AND
>>> when the cause of that redraw originated from a different document
>>> context. We want to trigger the heuristic if an enclosing frame
>>> overlapped the protected region within the specified time interval,
>>> but we don't want to trigger if redraws originate from within the same
>>> document context as the protected area (e.g. if the button itself has
>>> a mouseover animation). I'm really not sure how this part works or how
>>> to describe it in generic terms here. Can we propagate the source and
>>> timing of redraw triggers to the protected hit test rects or our
>>> collection of DOM nodes?
>>>
>>> 6. *Obstruction check:* Compare two sets of pixels: the *control
>>> image* is the protected area as if it was rendered alone, unobstructed
>>> by pixels originating from any other document context. (If step 2
>>> optimizations were performed, this should be readily available in its
>>> own composited layer.) The *user image* represents the same area as
>>> the *control image* in the outermost document's coordinate system and
>>> contains the final set of common pixels for the fully rendered page.
>>> These images are compared, and if the differences are below the
>>> *tolerance* threshold associated with the input-protection directive,
>>> proceed to deliver the event normally, otherwise proceed to *Violation
>>> management*. If portions of the *control image* are clipped by the
>>> root view port in the outermost document's coordinate system, all such
>>> pixels must be considered not to match. [*I don't know enough to say
>>> whether the comparison can be done on the GPU without marshaling the
>>> pixels of the control image backing store back to software, or if this
>>> is even worth mentioning here…*]
>>>
>>> Notes: applying protections to the entire document (if it itself
>>> would consist of multiple composited layers) or using the
>>> input-protection-clip property may make many of these optimizations
>>> impossible, and may impose performance penalties on the page, perhaps
>>> forcing it to fall back to all-software rendering. Should we have text
>>> warning authors of this, or should we simply remove those options from
>>> the spec and require input-protection-selectors only, to better match
>>> the internal strategies of modern browser rendering?
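
To make steps 3 and 6 of the quoted heuristic a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any browser implements this: the names (hits_protected_region, mismatch_fraction, obstruction_check), the representation of images as nested lists of RGB tuples, the use of None to mark pixels clipped by the root viewport, and the reading of *tolerance* as an allowed fraction of mismatching pixels are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical representations, chosen only for this sketch.
Pixel = Tuple[int, int, int]        # (r, g, b)
Rect = Tuple[int, int, int, int]    # (x, y, width, height), outermost-document coordinates
ControlImage = List[List[Pixel]]
UserImage = List[List[Optional[Pixel]]]  # None marks a pixel clipped by the root viewport


def hits_protected_region(x: int, y: int, rects: Sequence[Rect]) -> bool:
    """Step 3: True if the event point falls inside any protected hit test rect."""
    return any(rx <= x < rx + rw and ry <= y < ry + rh for rx, ry, rw, rh in rects)


def mismatch_fraction(control: ControlImage, user: UserImage) -> float:
    """Step 6: fraction of pixels where the user image differs from the control image.

    Clipped pixels (None) always count as mismatches, as the heuristic requires.
    """
    if len(control) != len(user) or any(len(c) != len(u) for c, u in zip(control, user)):
        raise ValueError("control and user images must cover the same area")
    total = 0
    mismatched = 0
    for control_row, user_row in zip(control, user):
        for expected, actual in zip(control_row, user_row):
            total += 1
            if actual is None or actual != expected:
                mismatched += 1
    # Assumption: an empty region is treated conservatively as fully mismatched.
    return mismatched / total if total else 1.0


def obstruction_check(control: ControlImage, user: UserImage, tolerance: float) -> bool:
    """True if the event may be delivered, False if Violation management should run.

    Assumption: tolerance is a fraction in [0, 1] and the boundary case passes.
    """
    return mismatch_fraction(control, user) <= tolerance
```

For example, with a 2x2 all-white control image and a user image in which one pixel is covered by another document and one is clipped, mismatch_fraction returns 0.5, so obstruction_check passes at a tolerance of 0.5 and fails at 0.25. Whether this comparison can stay on the GPU is exactly the open question raised in step 6; the sketch assumes both images are already available in software.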
Received on Monday, 14 October 2013 23:45:32 UTC