Re: [webappsec] ISSUE-53: UISecurity input-protection heuristic for composited rendering from Brad Hill on 2013-10-15 (public-webappsec@w3.org from October 2013)

From: Brad Hill <hillbrad@gmail.com>
Date: Tue, 15 Oct 2013 15:16:27 -0700
To: David Lin-Shung Huang <linshung.huang@sv.cmu.edu>
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAEeYn8jDtdv=MjYQqvy_0T6MU25KckSvhzUSKmH0xxAL=DbRUQ@mail.gmail.com>
Thanks, David.  Looks like a lot of functionality there
(https://code.google.com/p/chromium/issues/detail?id=180360) that could be
relevant to UISecurity.


On Tue, Oct 15, 2013 at 3:11 PM, David Lin-Shung Huang <
linshung.huang@sv.cmu.edu> wrote:

> Regarding WebRTC desktop/screen sharing implementations, I see that screen
> capturing in Chromium is done with OS-native APIs (BitBlt on Windows).
>
> https://code.google.com/p/webrtc/source/browse/trunk/webrtc/modules/desktop_capture/screen_capturer_win.cc
>
> For Chromium, it looks like their ScreenCapturer interface will be reusable:
> https://code.google.com/p/chromium/issues/detail?id=180360
>
>
>
> On Mon, Oct 14, 2013 at 4:45 PM, Brad Hill <hillbrad@gmail.com> wrote:
>
>>
>>
>> On Mon, Oct 14, 2013 at 3:38 PM, Brad Hill <hillbrad@gmail.com> wrote:
>>
>>> So, there is no way to get the final rendering, even for the compositor
>>> thread managing the outermost document?  :/   You can't read the pixels
>>> back from the GPU when you know you have a hit to a protected region?
>>>
>>
>> Continuing to explore my own question... does the implementation of the
>> screen capture facility of getUserMedia() provide a re-usable primitive
>> that we can point to?
>>
>>
>>
>>> , 2013 at 6:54 PM, David Lin-Shung Huang <linshung.huang@sv.cmu.edu> wrote:
>>>
>>>> Thanks, Brad! For what it's worth, here's my attempt to clarify which
>>>> parts of the original description in Section 6.2 are affected by
>>>> compositing:
>>>>
>>>> - For "timing attacks countermeasure" (the "Display Change List"),
>>>> compositing should have no impact. Essentially, we're keeping track of the
>>>> "damage rects" that already exist in the main thread of WebKit (presumably
>>>> the browser we're concerned with here).
>>>>
>>>> - For "cursor sanity check", compositing has no impact. (Cursors are
>>>> independent of screenshots.)
>>>>
>>>> - For "obstruction check", there are two distinct cases:
>>>>   (1) When the "user image" is taken using OS-native APIs, compositing
>>>> should have no impact. (OS can grab the final rendering.)
>>>>   (2) When the "user image" is taken by the browser, obstruction checks
>>>> will fail (as Adam pointed out) since each browsing context has no
>>>> knowledge of the final rendering. [FIXME!]
>>>>
>>>> As a side benefit, I suspect that compositing might actually make
>>>> obstruction checks faster, because the "control image" could be available
>>>> as a cached layer (or tile), thus eliminating the need to render an
>>>> off-screen HTML5 canvas element.
>>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>> On Thu, Oct 10, 2013 at 12:48 PM, Brad Hill <hillbrad@gmail.com> wrote:
>>>>
>>>>> <hat = editor>
>>>>>
>>>>> One of the open issues on UISecurity raised by Adam Barth is that the
>>>>> input-protection heuristic is not well-suited to browsers that use
>>>>> compositing to accelerate page rendering with GPUs.  While this heuristic
>>>>> is non-normative, Adam suggested that we should supply a heuristic for this
>>>>> model.
>>>>>
>>>>> I've attempted to grok these browser internals this week, at least for
>>>>> Blink and WebKit, and below is a first attempt at such a heuristic.  I
>>>>> would definitely appreciate review from anyone who feels qualified, or is
>>>>> willing to forward it to someone who is.  I'm confident that despite my
>>>>> best efforts I have somewhere confused what happens in layout vs. draw vs.
>>>>> paint vs. composite vs. render.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Brad
>>>>>
>>>>> *Alternate Input Protection Heuristic for Multi-Layer Compositing*
>>>>>
>>>>> Some user agents, in order to improve performance by taking advantage
>>>>> of specialized graphics hardware, use a strategy for hit testing and
>>>>> delivering UI events to hardware-composited layers to which the basic
>>>>> heuristic does not apply well.  This alternative *non-normative* heuristic
>>>>> describes one possible implementation strategy for the input-protection
>>>>> directive in this architecture.
>>>>>
>>>>> GPU-optimized user agents typically separate the browser UI process
>>>>> from the process that handles building and displaying the visual
>>>>> representation of the resource.  (In this context the term "process" refers
>>>>> to any encapsulated subunit of user-agent functionality that communicates
>>>>> with other similar subunits through message passing, without implying any
>>>>> particular implementation details such as locality to a thread, OS-level
>>>>> "process" or the like.)  It is typical for the browser UI process to
>>>>> receive user events such as mouse clicks and then marshal these to the
>>>>> render process, where the event is hit tested through the page's DOM,
>>>>> checking for event handlers along the way.  As an optimization the render
>>>>> process may communicate hit test rectangles back to the UI process in
>>>>> advance so that the UI process can, e.g. immediately respond to a Touch
>>>>> event by scrolling if the event target falls within coordinates for which
>>>>> there are no other registered handlers in the DOM.   A similar strategy can
>>>>> be used to create an implementation of the input protection heuristic in a
>>>>> manner that is consistent with this multi-process, compositing architecture.
>>>>>
>>>>> If a resource is being loaded in a frame, iframe, object, embed or
>>>>> applet context and specifies an input-protection directive, apply the
>>>>> following steps:
>>>>>
>>>>> 1.       *Tracking hit test rects: * Hook the creation of event
>>>>> handlers for protected events and elements and add the DOM nodes with any
>>>>> such handler to a collection. After a layout occurs, or when an event
>>>>> handler is added or removed, iterate across all DOM nodes to generate a
>>>>> vector of rectangles to which such events need to be marshaled.  If the
>>>>> input-protection applies to the DOMWindow or Document node, avoid this
>>>>> expensive process of walking the renderers and simply use the view's
>>>>> bounds, as they're guaranteed to be inclusive.
>>>>>
>>>>> 2.       *(Optionally) Put the protected areas into a backing store /
>>>>> composited layer: *To avoid the expense of having to re-layout and
>>>>> re-paint protected regions during the *obstruction check*, it may
>>>>> make sense to designate and place protected regions into their own backing
>>>>> store or composited layer which can serve as a cached *control image*.
>>>>> Such a backing store should paint the entire content of the protected
>>>>> region for this purpose, even if it is clipped by the viewport.
>>>>>
>>>>> 3.       *Hit testing in the compositor: *When an event is received,
>>>>> check whether it is on any layer and then walk the layer hierarchy checking
>>>>> the protected regions on every layer.  If there is a hit, continue the
>>>>> heuristic.  Otherwise, exit this heuristic and event processing proceeds as
>>>>> normal.
>>>>>
>>>>> 4.       *Cursor sanity check:* By querying computed-style with the
>>>>> ":hover" pseudo-class on the element (if the target is plugin content) or
>>>>> on the host frame element and its ancestors (if the target is a nested
>>>>> document), check whether the cursor has been hidden or changed to a
>>>>> possibly attacker-provided bitmap: if it has, proceed to *Violation
>>>>> management*. This provides protection against "Phantom cursor"
>>>>> attacks, also known as "Cursorjacking".
>>>>>
>>>>> 5.       *Timing check: [ I need some help here ] *Conceptually, we
>>>>> would like to track whenever a protected region must be *redrawn
>>>>> (--show-property-changed-rects, I think, is the Blink concept) *AND
>>>>> when the cause of that redraw originated from a different document
>>>>> context.  We want to trigger the heuristic if an enclosing frame overlapped
>>>>> the protected region within the specified time interval, but we don't want
>>>>> to trigger if redraws originate from within the same document context as
>>>>> the protected area. (e.g. if the button itself has a mouseover animation)
>>>>> I'm really not sure how this part works or how to describe it in generic
>>>>> terms here.  Can we propagate the source and timing of redraw triggers to
>>>>> the protected hit test rects or our collection of DOM nodes?
>>>>>
>>>>> 6.       *Obstruction check: *Compare two sets of pixels: the *control
>>>>> image *is the protected area as if it were rendered alone,
>>>>> unobstructed by pixels originating from any other document context.  (If
>>>>> step 2 optimizations were performed, this should be readily available in
>>>>> its own composited layer.)  The *user image *represents the same area
>>>>> as the *control image* in the outermost document's coordinate system
>>>>> and contains the final set of common pixels for the fully rendered page.
>>>>> These images are compared, and if the differences are below the
>>>>> *tolerance* threshold associated with the input-protection directive,
>>>>> proceed to deliver the event normally, otherwise proceed to *Violation
>>>>> management*. If portions of the *control image *are clipped by the
>>>>> root viewport in the outermost document's coordinate system, all such
>>>>> pixels must be considered not to match.  [*I don't know enough to say
>>>>> whether the comparison can be done on the GPU without marshaling the pixels
>>>>> of the control image backing store back to software, or if this is even
>>>>> worth mentioning here…]*
>>>>>
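For reviewers who prefer code to prose, steps 1 and 3 above might be sketched roughly as follows. This is only an illustration under stated assumptions: Rect, Layer and find_protected_hit are hypothetical names, not Blink or WebKit types, and all rectangles are assumed to already be expressed in the outermost document's coordinate system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; right and bottom edges are exclusive."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass
class Layer:
    """Hypothetical composited layer carrying its protected hit test rects."""
    name: str
    protected_rects: List[Rect] = field(default_factory=list)
    children: List["Layer"] = field(default_factory=list)


def find_protected_hit(layer: Layer, px: int, py: int) -> Optional[Tuple[str, Rect]]:
    """Walk the layer hierarchy and return the first protected rect hit, if any."""
    for rect in layer.protected_rects:
        if rect.contains(px, py):
            return layer.name, rect
    for child in layer.children:
        hit = find_protected_hit(child, px, py)
        if hit is not None:
            return hit
    # No protected region was hit: event processing proceeds as normal.
    return None
```

A hit would continue into the cursor, timing and obstruction checks; a miss exits the heuristic.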
>>>>>
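One possible reading of the timing check in step 5, written as a sketch only. It assumes each redraw can be recorded with the document context that caused it and a timestamp, which is exactly the open question in that step; Redraw, overlaps and timing_violation are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

RectT = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass(frozen=True)
class Redraw:
    """Hypothetical record of one redraw: where, caused by whom, and when."""
    rect: RectT
    source_context: str
    timestamp_ms: float


def overlaps(a: RectT, b: RectT) -> bool:
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def timing_violation(redraws: Iterable[Redraw], protected_rect: RectT,
                     protected_context: str, event_time_ms: float,
                     interval_ms: float) -> bool:
    """True if a redraw from another document context overlapped the
    protected rect within interval_ms before the event."""
    for r in redraws:
        recent = 0 <= event_time_ms - r.timestamp_ms <= interval_ms
        foreign = r.source_context != protected_context
        if recent and foreign and overlaps(r.rect, protected_rect):
            return True
    return False
```

Redraws that originate from the protected document itself (for example a mouseover animation on the button) are ignored by the `foreign` test.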
>>>>>
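The obstruction check in step 6 could look something like the sketch below. Several things here are assumptions rather than anything the draft specifies: the difference is measured as the fraction of mismatching pixels, a clipped user-image pixel is represented as None, the boundary case (difference exactly equal to the tolerance) is allowed, and the function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b)


def mismatch_fraction(control: Sequence[Pixel],
                      user: Sequence[Optional[Pixel]]) -> float:
    """Fraction of control-image pixels that differ in the user image.

    A user pixel of None stands for a pixel clipped by the root viewport,
    which must be considered not to match.
    """
    if len(control) != len(user):
        raise ValueError("control and user images must cover the same area")
    if not control:
        return 0.0
    mismatched = sum(1 for c, u in zip(control, user) if u is None or c != u)
    return mismatched / len(control)


def may_deliver_event(control: Sequence[Pixel],
                      user: Sequence[Optional[Pixel]],
                      tolerance: float) -> bool:
    """True: deliver the event normally. False: go to violation management."""
    # Assumption: a difference equal to the tolerance is still acceptable.
    return mismatch_fraction(control, user) <= tolerance
```

Whether this comparison can stay on the GPU is left open, as in the bracketed note in step 6; the sketch assumes both images have already been read back.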
>>>>> Notes: applying protections to the entire document (if it itself would
>>>>> consist of multiple composited layers) or using the input-protection-clip
>>>>> property may make many of these optimizations impossible, and may impose
>>>>> performance penalties on the page, perhaps forcing it to fall back to
>>>>> all-software rendering.  Should we have text warning authors of this, or
>>>>> should we simply remove those options from the spec and require
>>>>> input-protection-selectors only, to better match the internal strategies of
>>>>> modern browser rendering?
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
Received on Tuesday, 15 October 2013 22:16:56 UTC