Re: [webappsec] ISSUE-53: UISecurity input-protection heuristic for composited rendering from Brad Hill on 2013-10-14 (public-webappsec@w3.org from October 2013)

From: Brad Hill <hillbrad@gmail.com>
Date: Mon, 14 Oct 2013 15:38:37 -0700
To: David Lin-Shung Huang <linshung.huang@sv.cmu.edu>
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAEeYn8gR4+P0tj+MxBiAJxVxmSatpbvXE0J8T1Ugn+8az7H1FQ@mail.gmail.com>
So, there is no way to get the final rendering, even for the compositor
thread managing the outermost document?  :/   You can't read the pixels
back from the GPU when you know you have a hit to a protected region?

Also:  thoughts on whether we should keep the clipping rectangle around the
hit, or just allow element selectors only?


On Thu, Oct 10, 2013 at 6:54 PM, David Lin-Shung Huang <
linshung.huang@sv.cmu.edu> wrote:

> Thanks, Brad! For what it's worth, here's my attempt to clarify which
> parts of the original description in Section 6.2 are affected by
> compositing:
>
> - For "timing attacks countermeasure" (the "Display Change List"),
> compositing should have no impact. Essentially, we're keeping track of the
> "damage rects" that already exist in the main thread of WebKit (presumably
> the browser we're concerned with here).
>
> - For "cursor sanity check", compositing has no impact. (Cursors are
> independent of screenshots.)
>
> - For "obstruction check", there are two distinct cases:
>   (1) When the "user image" is taken using OS-native APIs, compositing
> should have no impact. (OS can grab the final rendering.)
>   (2) When the "user image" is taken by the browser, obstruction checks
> will fail (as Adam pointed out) since each browsing context has no
> knowledge of the final rendering. [FIXME!]
>
> As a side benefit, I suspect that compositing might actually make
> obstruction checks faster, because the "control image" could be available
> as a cached layer (or tile), thus eliminating the need to render an
> off-screen HTML5 canvas element.
>
>
> Thanks,
> David
>
>
> On Thu, Oct 10, 2013 at 12:48 PM, Brad Hill <hillbrad@gmail.com> wrote:
>
>> <hat = editor>
>>
>> One of the open issues on UISecurity raised by Adam Barth is that the
>> input-protection heuristic is not well-suited to browsers that use
>> compositing to accelerate page rendering with GPUs.  While this heuristic
>> is non-normative, Adam suggested that we should supply a heuristic for this
>> model.
>>
>> I've attempted to grok these browser internals this week, at least for
>> Blink and WebKit, and below is a first attempt at such a heuristic.  I
>> would definitely appreciate review from anyone who feels qualified, or is
>> willing to forward it to someone who is.  I'm confident that despite my
>> best efforts I have somewhere confused what happens in layout vs. draw vs.
>> paint vs. composite vs. render.
>>
>> thanks,
>>
>> Brad
>>
>> *Alternate Input Protection Heuristic for Multi-Layer Compositing*
>>
>> Some user agents, in order to improve performance by taking advantage of
>> specialized graphics hardware, use a strategy for hit testing and
>> delivering UI events to hardware-composited layers to which the basic
>> heuristic does not apply well.  This alternative *non-normative* heuristic
>> describes one possible implementation strategy for the input-protection
>> directive in this architecture.
>>
>> GPU-optimized user agents typically separate the browser UI process from
>> the process that handles building and displaying the visual representation
>> of the resource.  (In this context the term "process" refers to any
>> encapsulated subunit of user-agent functionality that communicates to other
>> similar subunits through message passing, without implying any particular
>> implementation details such as locality to a thread, an OS-level "process", or
>> the like.)  It is typical for the browser UI process to receive user events
>> such as mouse clicks and then marshal these to the render process, where
>> the event is hit tested through the page's DOM, checking for event handlers
>> along the way.  As an optimization the render process may communicate hit
>> test rectangles back to the UI process in advance so that the UI process
>> can, e.g., immediately respond to a Touch event by scrolling if the event
>> target falls within coordinates for which there are no other registered
>> handlers in the DOM.   A similar strategy can be used to create an
>> implementation of the input protection heuristic in a manner that is
>> consistent with this multi-process, compositing architecture.
>>
>> If a resource is being loaded in a frame, iframe, object, embed or applet
>> context and specifies an input-protection directive, apply the following
>> steps:
>>
>> 1.       *Tracking hit test rects: * Hook the creation of event handlers
>> for protected events and elements and add the DOM nodes with any such
>> handler to a collection. After a layout occurs, or when an event handler is
>> added or removed, iterate across all DOM nodes to generate a vector of
>> rectangles to which such events need to be marshaled.  If the
>> input-protection applies to the DOMWindow or Document node, avoid this
>> expensive process of walking the renderers and simply use the view's
>> bounds, as they're guaranteed to be inclusive.
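For illustration only, here is a minimal Python sketch of the rect-tracking idea in step 1. It is not Blink or WebKit code: `Rect`, `Node`, `protected_handlers` and `collect_hit_test_rects` are hypothetical names, and the "DOM" is just a tree of boxes that already have absolute bounds.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


@dataclass
class Node:
    name: str
    bounds: Rect                      # assumed to be absolute, post-layout bounds
    protected_handlers: int = 0       # number of handlers for protected events
    children: List["Node"] = field(default_factory=list)


def collect_hit_test_rects(root: Node, view_bounds: Rect,
                           document_level: bool) -> List[Rect]:
    """Return the rects to which protected events must be marshaled."""
    if document_level:
        # Protection on the window/document: skip the walk, use the view bounds.
        return [view_bounds]
    rects: List[Rect] = []
    stack = [root]
    while stack:                      # iterative pre-order walk of the tree
        node = stack.pop()
        if node.protected_handlers > 0:
            rects.append(node.bounds)
        stack.extend(reversed(node.children))
    return rects
```

In a real engine this would be re-run after layout or whenever a protected handler is added or removed, as the step describes.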
>>
>> 2.       *(Optionally) Put the protected areas into a backing store /
>> composited layer: *To avoid the expense of having to re-layout and
>> re-paint protected regions during the *obstruction check*, it may make
>> sense to designate and place protected regions into their own backing store
>> or composited layer which can serve as a cached *control image*. Such a
>> backing store should paint the entire content of the protected region for
>> this purpose, even if it is clipped by the viewport.
>>
>> 3.       *Hit testing in the compositor: *When an event is received,
>> check whether it is on any layer and then walk the layer hierarchy checking
>> the protected regions on every layer.  If there is a hit, continue the
>> heuristic.  Otherwise, exit this heuristic and event processing proceeds as
>> normal.
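A rough Python sketch of the layer walk in step 3, under simplifying assumptions: layers are axis-aligned, untransformed rectangles positioned in their parent's coordinates, and z-order and scrolling are ignored. `Layer`, `protected` and `hit_protected` are hypothetical names, not compositor APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass
class Layer:
    name: str
    bounds: Rect                                          # in the parent's coordinates
    protected: List[Rect] = field(default_factory=list)   # in this layer's coordinates
    children: List["Layer"] = field(default_factory=list)


def hit_protected(layer: Layer, px: int, py: int) -> Optional[str]:
    """Return the name of the first layer whose protected region is hit, else None."""
    if not layer.bounds.contains(px, py):
        return None                   # event is not on this layer at all
    # Translate the point into this layer's own coordinate space.
    lx, ly = px - layer.bounds.x, py - layer.bounds.y
    if any(r.contains(lx, ly) for r in layer.protected):
        return layer.name
    for child in layer.children:
        hit = hit_protected(child, lx, ly)
        if hit is not None:
            return hit
    return None
```

A `None` result corresponds to "exit this heuristic"; any other result means the remaining checks should run.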
>>
>> 4.       *Cursor sanity check:* By querying computed-style with the
>> ":hover" pseudo-class on the element (if the target is plugin content) or
>> on the host frame element and its ancestors (if the target is a nested
>> document), check whether the cursor has been hidden or changed to a
>> possibly attacker-provided bitmap: if it has, proceed to *Violation
>> management*. This provides protection against "Phantom cursor" attacks,
>> also known as "Cursorjacking".
>>
>> 5.       *Timing check: [ I need some help here ] *Conceptually, we
>> would like to track whenever a protected region must be *redrawn
>> (--show-property-changed-rects, I think, is the Blink concept) *AND when
>> the cause of that redraw originated from a different document context.  We
>> want to trigger the heuristic if an enclosing frame overlapped the
>> protected region within the specified time interval, but we don't want to
>> trigger if redraws originate from within the same document context as the
>> protected area (e.g., if the button itself has a mouseover animation).  I'm
>> really not sure how this part works or how to describe it in generic terms
>> here.  Can we propagate the source and timing of redraw triggers to the
>> protected hit test rects or our collection of DOM nodes?
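One way the timing check in step 5 could be modeled, sketched in Python. This assumes the engine can tag each redraw with the document context that caused it and a timestamp, which is exactly the open question above; `Redraw`, `source_doc` and `timing_violation` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


@dataclass(frozen=True)
class Redraw:
    rect: Rect          # damaged area, in the outermost document's coordinates
    source_doc: str     # assumed: identifier of the document that caused the redraw
    time_ms: int        # assumed: timestamp of the redraw


def overlaps(a: Rect, b: Rect) -> bool:
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def timing_violation(redraws: Iterable[Redraw], protected: Rect,
                     protected_doc: str, event_time_ms: int,
                     interval_ms: int) -> bool:
    """True if a foreign redraw touched the protected rect within the interval."""
    return any(
        r.source_doc != protected_doc                     # ignore same-document redraws
        and 0 <= event_time_ms - r.time_ms <= interval_ms
        and overlaps(r.rect, protected)
        for r in redraws
    )
```

Redraws from the protected document itself (such as a button's own mouseover animation) are filtered out by the `source_doc` comparison.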
>>
>> 6.       *Obstruction check: *Compare two sets of pixels: the *control
>> image *is the protected area as if it were rendered alone, unobstructed
>> by pixels originating from any other document context.  (If step 2
>> optimizations were performed, this should be readily available in its own
>> composited layer.)  The *user image *represents the same area as the *control
>> image* in the outermost document's coordinate system and contains the
>> final set of common pixels for the fully rendered page. These images are
>> compared, and if the differences are below the *tolerance* threshold
>> associated with the input-protection directive, proceed to deliver the
>> event normally, otherwise proceed to *Violation management*. If portions
>> of the *control image *are clipped by the root view port in the
>> outermost document's coordinate system, all such pixels must be considered
>> not to match.  [*I don't know enough to say whether the comparison can
>> be done on the GPU without marshaling the pixels of the control image
>> backing store back to software, or if this is even worth mentioning here…]
>> *
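The comparison in step 6 can be sketched in Python as follows. Several things here are assumptions rather than anything the draft specifies: images are plain 2-D lists of integer pixel values, clipped pixels in the user image are represented as `None`, the tolerance is read as the maximum allowed fraction of mismatched pixels, and a difference exactly equal to the tolerance passes.

```python
from typing import List, Optional

Pixel = int


def mismatch_fraction(control: List[List[Pixel]],
                      user: List[List[Optional[Pixel]]]) -> float:
    """Fraction of control-image pixels not matched by the user image."""
    total = 0
    mismatched = 0
    for control_row, user_row in zip(control, user):
        for c, u in zip(control_row, user_row):
            total += 1
            # A clipped pixel (None) always counts as a mismatch.
            if u is None or u != c:
                mismatched += 1
    return mismatched / total if total else 1.0


def obstruction_check(control: List[List[Pixel]],
                      user: List[List[Optional[Pixel]]],
                      tolerance: float) -> bool:
    """True means deliver the event; False means go to violation management."""
    return mismatch_fraction(control, user) <= tolerance
```

This runs entirely in software; whether the same comparison could stay on the GPU is left open, as in the bracketed note above.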
>>
>>
>>
>> Notes: applying protections to the entire document (if it itself would
>> consist of multiple composited layers) or using the input-protection-clip
>> property may make many of these optimizations impossible, and may impose
>> performance penalties on the page, perhaps forcing it to fall back to
>> all-software rendering.  Should we have text warning authors of this, or
>> should we simply remove those options from the spec and require
>> input-protection-selectors only, to better match the internal strategies of
>> modern browser rendering?
>>
>>
>>
>
>
Received on Monday, 14 October 2013 22:39:05 UTC