Re: [webappsec] ISSUE-53: UISecurity input-protection heuristic for composited rendering from Giorgio Maone on 2013-10-16 (public-webappsec@w3.org from October 2013)

From: Giorgio Maone <g.maone@informaction.com>
Date: Wed, 16 Oct 2013 06:59:14 +0200
To: Brad Hill <hillbrad@gmail.com>, David Lin-Shung Huang <linshung.huang@sv.cmu.edu>
CC: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <525E1D22.7070506@informaction.com>
On 16/10/2013 00:16, Brad Hill wrote:
> Thanks, David.  Looks like a lot of functionality there
> (https://code.google.com/p/chromium/issues/detail?id=180360) that
> could be relevant to UISecurity.
Thank you both for your research. Looks like the work above, when/if
completed, would be very helpful to implement the UISecurity heuristic
as currently specified.
I'm looking with special interest at the invalidation and cursor
tracking callbacks.
-- G

>
>
> On Tue, Oct 15, 2013 at 3:11 PM, David Lin-Shung Huang
> <linshung.huang@sv.cmu.edu <mailto:linshung.huang@sv.cmu.edu>> wrote:
>
>     Regarding WebRTC desktop/screen sharing implementations, I see
>     that screen capturing in chromium is done with OS-native APIs
>     (BitBlt for Windows).
>     https://code.google.com/p/webrtc/source/browse/trunk/webrtc/modules/desktop_capture/screen_capturer_win.cc
>
>     For chromium, looks like their ScreenCapturer interface will be
>     reusable
>     https://code.google.com/p/chromium/issues/detail?id=180360
>
>
>
>     On Mon, Oct 14, 2013 at 4:45 PM, Brad Hill <hillbrad@gmail.com
>     <mailto:hillbrad@gmail.com>> wrote:
>
>
>
>         On Mon, Oct 14, 2013 at 3:38 PM, Brad Hill <hillbrad@gmail.com
>         <mailto:hillbrad@gmail.com>> wrote:
>
>             So, there is no way to get the final rendering, even for
>             the compositor thread managing the outermost document?  :/
>               You can't read the pixels back from the GPU when you
>             know you have a hit to a protected region?
>
>
>         Continuing to explore my own question... does the
>         implementation of the screen capture facility of
>         getUserMedia() provide a re-usable primitive that we can point to?
>
>          
>
>             , 2013 at 6:54 PM, David Lin-Shung Huang
>             <linshung.huang@sv.cmu.edu
>             <mailto:linshung.huang@sv.cmu.edu>> wrote:
>
>                 Thanks, Brad! For what it's worth, here's my attempt
>                 to clarify which parts of the original description in
>                 Section 6.2 are affected by compositing:
>
>                 - For "timing attacks countermeasure" (the "Display
>                 Change List"), compositing should have no impact.
>                 Essentially, we're keeping track of the "damage rects"
>                 that already exist in the main thread of WebKit
>                 (presumably the browser we're concerned with here).
>
>                 - For "cursor sanity check", compositing has no
>                 impact. (Cursors are independent of screenshots.)
>
>                 - For "obstruction check", there are two distinct cases:
>                   (1) When the "user image" is taken using OS-native
>                 APIs, compositing should have no impact. (OS can grab
>                 the final rendering.)
>                   (2) When the "user image" is taken by the browser,
>                 obstruction checks will fail (as Adam pointed out)
>                 since each browsing context has no knowledge of the
>                 final rendering. [FIXME!]
>
>                 As a side benefit, I suspect that compositing might
>                 actually make obstruction checks faster, because the
>                 "control image" could be available as a cached layer
>                 (or tile), thus eliminating the need to render an
>                 off-screen HTML5 canvas element.
>
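
A minimal sketch of the "Display Change List" idea described above, assuming damage rects arrive as (x, y, width, height) tuples together with a timestamp in milliseconds. The names DisplayChangeList, record and changed_within are hypothetical and do not correspond to any WebKit or Blink API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height); layout is an assumption


def intersects(a: Rect, b: Rect) -> bool:
    """Return True if the two rectangles overlap with non-zero area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


@dataclass
class DisplayChangeList:
    """Remembers recent damage rects so a later event can be checked against them."""
    max_age_ms: float
    entries: List[Tuple[float, Rect]] = field(default_factory=list)

    def record(self, now_ms: float, damage: Rect) -> None:
        # Drop entries that are too old to matter, then remember the new one.
        self.entries = [
            (t, r) for t, r in self.entries if now_ms - t <= self.max_age_ms
        ]
        self.entries.append((now_ms, damage))

    def changed_within(self, now_ms: float, protected: Rect) -> bool:
        # True if any sufficiently recent damage rect overlaps the protected area.
        return any(
            now_ms - t <= self.max_age_ms and intersects(r, protected)
            for t, r in self.entries
        )
```

On a click, a caller would ask changed_within(now, protected_rect) and treat a True result as a reason to go to violation management. Telling redraws caused by another document context apart from same-context redraws is not modeled here.
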
>
>                 Thanks,
>                 David
>
>
>                 On Thu, Oct 10, 2013 at 12:48 PM, Brad Hill
>                 <hillbrad@gmail.com <mailto:hillbrad@gmail.com>> wrote:
>
>                     <hat = editor>
>
>                     One of the open issues on UISecurity raised by
>                     Adam Barth is that the input-protection heuristic
>                     is not well-suited to browsers that use
>                     compositing to accelerate page rendering with
>                     GPUs.  While this heuristic is non-normative, Adam
>                     suggested that we should supply a heuristic for
>                     this model.  
>
>                     I've attempted to grok these browser internals
>                     this week, at least for Blink and WebKit, and
>                     below is a first attempt at such a heuristic.  I
>                     would definitely appreciate review from anyone who
>                     feels qualified, or is willing to forward it to
>                     someone who is.  I'm confident that despite my
>                     best efforts I have somewhere confused what happens
>                     in layout vs. draw vs. paint vs. composite vs. render.
>
>                     thanks,
>
>                     Brad
>
>                     *Alternate Input Protection Heuristic for
>                     Multi-Layer Compositing*
>
>                     Some user agents, in order to improve performance
>                     by taking advantage of specialized graphics
>                     hardware, use a strategy for hit testing and
>                     delivering UI events to hardware-composited layers
>                     that the basic heuristic does not apply well to. 
>                     This alternative /non-normative/ heuristic
>                     describes one possible implementation strategy for
>                     the input-protection directive in this architecture.
>
>                     GPU optimized user agents typically separate the
>                     browser UI process from the process that handles
>                     building and displaying the visual representation
>                     of the resource.  (In this context the term
>                     "process" refers to any encapsulated subunit of
>                     user-agent functionality that communicates to
>                     other similar subunits through message passing,
>                     without implying any particular implementation
>                     details such as locality to a thread, OS-level
>                     "process" or the like.)  It is typical for the
>                     browser UI process to receive user events such as
>                     mouse clicks and then marshal these to the render
>                     process, where the event is hit tested through the
>                     page's DOM, checking for event handlers along the
>                     way.  As an optimization the render process may
>                     communicate hit test rectangles back to the UI
>                     process in advance so that the UI process can,
>                     e.g. immediately respond to a Touch event by
>                     scrolling if the event target falls within
>                     coordinates for which there are no other
>                     registered handlers in the DOM.   A similar
>                     strategy can be used to create an implementation
>                     of the input protection heuristic in a manner that
>                     is consistent with this multi-process, compositing
>                     architecture.
>
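
The hit-test-rectangle optimization described above can be sketched roughly as follows. This assumes the render process has already sent the UI process a flat list of rectangles in the outermost document's coordinates; HitRect, node_id and find_hit are hypothetical names for illustration, not browser APIs.

```python
from typing import Iterable, NamedTuple, Optional


class HitRect(NamedTuple):
    """A rectangle reported by the render process (hypothetical message shape)."""
    x: int
    y: int
    width: int
    height: int
    node_id: str  # identifier of the DOM node that registered a handler

    def contains(self, px: int, py: int) -> bool:
        # Half-open bounds: the right and bottom edges are outside the rect.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def find_hit(rects: Iterable[HitRect], px: int, py: int) -> Optional[HitRect]:
    """Return the first rect containing the point, or None.

    None means no registered handler covers the point, so the UI process
    may handle the event itself (for example by scrolling).
    """
    for rect in rects:
        if rect.contains(px, py):
            return rect
    return None
```

The same lookup could serve the input-protection case: a hit on a protected rect continues the heuristic, and a miss lets event processing proceed as normal.
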
>                     If a resource is being loaded in a frame, iframe,
>                     object, embed or applet context and specifies an
>                     input-protection directive, apply the following steps:
>
>                     1.       *Tracking hit test rects: * Hook the
>                     creation of event handlers for protected events
>                     and elements and add the DOM nodes with any such
>                     handler to a collection. After a layout occurs, or
>                     when an event handler is added or removed, iterate
>                     across all DOM nodes to generate a vector of
>                     rectangles where such events need to be marshaled
>                     to.  If the input-protection applies to the
>                     DOMWindow or Document node, avoid this expensive
>                     process of walking the renderers and simply use
>                     the view's bounds, as they're guaranteed to be
>                     inclusive.  
>
>                     2.       *(Optionally) Put the protected areas
>                     into a backing store / composited layer: *To avoid
>                     the expense of having to re-layout and re-paint
>                     protected regions during the */obstruction
>                     check/*, it may make sense to designate and place
>                     protected regions into their own backing store or
>                     composited layer which can serve as a cached
>                     */control image/*. Such a backing store should
>                     paint the entire content of the protected region
>                     for this purpose, even if it is clipped by the
>                     viewport. 
>
>                     3.       *Hit testing in the compositor: *When an
>                     event is received, check whether it is on any
>                     layer and then walk the layer hierarchy checking
>                     the protected regions on every layer.  If there is
>                     a hit, continue the heuristic.  Otherwise, exit
>                     this heuristic and event processing proceeds as
>                     normal.
>
>                     4.       *Cursor sanity check:* By querying
>                     computed-style with the ":hover" pseudo-class on
>                     the element (if the target is plugin content) or
>                     on the host frame element and its ancestors (if
>                     the target is a nested document), check whether
>                     the cursor has been hidden or changed to a
>                     possibly attacker-provided bitmap: if it has,
>                     proceed to *Violation management*. This provides
>                     protection against "Phantom cursor" attacks, also
>                     known as "Cursorjacking".
>
>                     5.       *Timing check: [ I need some help here ]
>                     *Conceptually, we would like to track whenever a
>                     protected region must be /redrawn
>                     (--show-property-changed-rects, I think, is the
>                     Blink concept) /AND when the cause of that redraw
>                     originated from a different document context.  We
>                     want to trigger the heuristic if an enclosing
>                     frame overlapped the protected region within the
>                     specified time interval, but we don't want to
>                     trigger if redraws originate from within the same
>                     document context as the protected area. (e.g. if
>                     the button itself has a mouseover animation)  I'm
>                     really not sure how this part works or how to
>                     describe it in generic terms here.  Can we
>                     propagate the source and timing of redraw triggers
>                     to the protected hit test rects or our collection
>                     of DOM nodes?
>
>                     6.       *Obstruction check: *Compare two sets of
>                     pixels: the */control image/ *is the protected
>                     area as if it was rendered alone, unobstructed by
>                     pixels originating from any other document
>                     context.  (If step 2 optimizations were performed,
>                     this should be readily available in its own
>                     composited layer.)  The */user image/ *represents
>                     the same area as the */control image/* in the
>                     outermost document's coordinate system and
>                     contains the final set of common pixels for the
>                     fully rendered page. These images are compared,
>                     and if the differences are below the *tolerance*
>                     threshold associated with the input-protection
>                     directive, proceed to deliver the event normally,
>                     otherwise proceed to *Violation management*. If
>                     portions of the */control image/ *are clipped by
>                     the root view port in the outermost document's
>                     coordinate system, all such pixels must be
>                     considered not to match.  [/I don't know enough to
>                     say whether the comparison can be done on the GPU
>                     without marshaling the pixels of the control image
>                     backing store back to software, or if this is even
>                     worth mentioning here…]/
>
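
A minimal sketch of the comparison in step 6, assuming both images are equally sized grids of RGB tuples and that the tolerance is the maximum allowed fraction of differing pixels. The thread does not define the unit of the tolerance, so that reading is an assumption, as is treating a difference exactly equal to the tolerance as acceptable. Pixels clipped by the root viewport are represented as None in the user image and always count as mismatches. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
ControlImage = List[List[Pixel]]
UserImage = List[List[Optional[Pixel]]]  # None marks a clipped pixel


def mismatch_fraction(control: ControlImage, user: UserImage) -> float:
    """Fraction of control-image pixels that differ in the user image."""
    if len(control) != len(user) or any(
        len(c_row) != len(u_row) for c_row, u_row in zip(control, user)
    ):
        raise ValueError("control and user images must have the same size")
    total = 0
    mismatched = 0
    for c_row, u_row in zip(control, user):
        for c, u in zip(c_row, u_row):
            total += 1
            # A clipped pixel (None) can never match the control image.
            if u is None or c != u:
                mismatched += 1
    return mismatched / total if total else 0.0


def obstruction_check(control: ControlImage, user: UserImage,
                      tolerance: float) -> bool:
    """Return True if the event may be delivered normally."""
    return mismatch_fraction(control, user) <= tolerance
```

A False result corresponds to proceeding to violation management. Whether the comparison could run on the GPU instead of on read-back pixels is left open here, as it is in the text above.
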
>                      
>
>                     Notes: applying protections to the entire document
>                     (if it itself would consist of multiple composited
>                     layers) or using the input-protection-clip
>                     property may make many of these optimizations
>                     impossible, and may impose performance penalties
>                     on the page, perhaps forcing it to fall back to
>                     all-software rendering.  We should have text
>                     warning authors of this, or should we simply
>                     remove those options from the spec and require
>                     input-protection-selectors only, to better match
>                     the internal strategies of modern browser rendering?
>
>                      
>
>
>
>
>
>
Received on Wednesday, 16 October 2013 04:59:37 UTC