- From: David Lin-Shung Huang <linshung.huang@sv.cmu.edu>
- Date: Tue, 15 Oct 2013 15:11:53 -0700
- To: Brad Hill <hillbrad@gmail.com>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAGiwpwjibJ43+zm_G02bULnfN0FCZ_6y=GU2Suav16iDdUR8hw@mail.gmail.com>
Regarding WebRTC desktop/screen sharing implementations, I see that screen
capturing in Chromium is done with OS-native APIs (BitBlt for Windows).
https://code.google.com/p/webrtc/source/browse/trunk/webrtc/modules/desktop_capture/screen_capturer_win.cc

For Chromium, it looks like their ScreenCapturer interface will be reusable:
https://code.google.com/p/chromium/issues/detail?id=180360

On Mon, Oct 14, 2013 at 4:45 PM, Brad Hill <hillbrad@gmail.com> wrote:

>
>
> On Mon, Oct 14, 2013 at 3:38 PM, Brad Hill <hillbrad@gmail.com> wrote:
>
>> So, there is no way to get the final rendering, even for the compositor
>> thread managing the outermost document? :/ You can't read the pixels
>> back from the GPU when you know you have a hit to a protected region?
>>
>
> Continuing to explore my own question... does the implementation of the
> screen capture facility of getUserMedia() provide a re-usable primitive
> that we can point to?
>
>
>
>> , 2013 at 6:54 PM, David Lin-Shung Huang <linshung.huang@sv.cmu.edu> wrote:
>>
>>> Thanks, Brad! For what it's worth, here's my attempt to clarify which
>>> parts of the original description in Section 6.2 are affected by
>>> compositing:
>>>
>>> - For "timing attacks countermeasure" (the "Display Change List"),
>>> compositing should have no impact. Essentially, we're keeping track of the
>>> "damage rects" that already exist in the main thread of WebKit (presumably
>>> the browser we're concerned with here).
>>>
>>> - For "cursor sanity check", compositing has no impact. (Cursors are
>>> independent of screenshots.)
>>>
>>> - For "obstruction check", there are two distinct cases:
>>> (1) When the "user image" is taken using OS-native APIs, compositing
>>> should have no impact. (The OS can grab the final rendering.)
>>> (2) When the "user image" is taken by the browser, obstruction checks
>>> will fail (as Adam pointed out) since each browsing context has no
>>> knowledge of the final rendering. [FIXME!]
>>>
>>> As a side benefit, I suspect that compositing might actually make
>>> obstruction checks faster, because the "control image" could be available
>>> as a cached layer (or tile), thus eliminating the need to render an
>>> off-screen HTML5 canvas element.
>>>
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On Thu, Oct 10, 2013 at 12:48 PM, Brad Hill <hillbrad@gmail.com> wrote:
>>>
>>>> <hat = editor>
>>>>
>>>> One of the open issues on UISecurity raised by Adam Barth is that the
>>>> input-protection heuristic is not well-suited to browsers that use
>>>> compositing to accelerate page rendering with GPUs. While this heuristic
>>>> is non-normative, Adam suggested that we should supply a heuristic for this
>>>> model.
>>>>
>>>> I've attempted to grok these browser internals this week, at least for
>>>> Blink and WebKit, and below is a first attempt at such a heuristic. I
>>>> would definitely appreciate review from anyone who feels qualified, or is
>>>> willing to forward it to someone who is. I'm confident that despite my
>>>> best efforts I have somewhere confused what happens in layout vs. draw vs.
>>>> paint vs. composite vs. render.
>>>>
>>>> thanks,
>>>>
>>>> Brad
>>>>
>>>> *Alternate Input Protection Heuristic for Multi-Layer Compositing*
>>>>
>>>> Some user agents, in order to improve performance by taking advantage
>>>> of specialized graphics hardware, use a strategy for hit testing and
>>>> delivering UI events to hardware-composited layers that the basic heuristic
>>>> does not apply well to. This alternative *non-normative* heuristic
>>>> describes one possible implementation strategy for the input-protection
>>>> directive in this architecture.
>>>>
>>>> GPU optimized user agents typically separate the browser UI process
>>>> from the process that handles building and displaying the visual
>>>> representation of the resource.
>>>> (In this context the term "process" refers
>>>> to any encapsulated subunit of user-agent functionality that communicates
>>>> with other similar subunits through message passing, without implying any
>>>> particular implementation details such as locality to a thread, OS-level
>>>> "process" or the like.) It is typical for the browser UI process to
>>>> receive user events such as mouse clicks and then marshal these to the
>>>> render process, where the event is hit tested through the page's DOM,
>>>> checking for event handlers along the way. As an optimization the render
>>>> process may communicate hit test rectangles back to the UI process in
>>>> advance so that the UI process can, e.g., immediately respond to a Touch
>>>> event by scrolling if the event target falls within coordinates for which
>>>> there are no other registered handlers in the DOM. A similar strategy can
>>>> be used to create an implementation of the input protection heuristic in a
>>>> manner that is consistent with this multi-process, compositing architecture.
>>>>
>>>> If a resource is being loaded in a frame, iframe, object, embed or
>>>> applet context and specifies an input-protection directive, apply the
>>>> following steps:
>>>>
>>>> 1. *Tracking hit test rects:* Hook the creation of event
>>>> handlers for protected events and elements and add the DOM nodes with any
>>>> such handler to a collection. After a layout occurs, or when an event
>>>> handler is added or removed, iterate across all DOM nodes to generate a
>>>> vector of rectangles where such events need to be marshaled to. If the
>>>> input-protection applies to the DOMWindow or Document node, avoid this
>>>> expensive process of walking the renderers and simply use the view's
>>>> bounds, as they're guaranteed to be inclusive.
>>>>
>>>> 2. *(Optionally) Put the protected areas into a backing store /
>>>> composited layer:* To avoid the expense of having to re-layout and
>>>> re-paint protected regions during the *obstruction check*, it may make
>>>> sense to designate and place protected regions into their own backing store
>>>> or composited layer which can serve as a cached *control image*. Such
>>>> a backing store should paint the entire content of the protected region for
>>>> this purpose, even if it is clipped by the viewport.
>>>>
>>>> 3. *Hit testing in the compositor:* When an event is received,
>>>> check whether it is on any layer and then walk the layer hierarchy checking
>>>> the protected regions on every layer. If there is a hit, continue the
>>>> heuristic. Otherwise, exit this heuristic and event processing proceeds as
>>>> normal.
>>>>
>>>> 4. *Cursor sanity check:* By querying computed-style with the
>>>> ":hover" pseudo-class on the element (if the target is plugin content) or
>>>> on the host frame element and its ancestors (if the target is a nested
>>>> document), check whether the cursor has been hidden or changed to a
>>>> possibly attacker-provided bitmap: if it has, proceed to *Violation
>>>> management*. This provides protection against "Phantom cursor"
>>>> attacks, also known as "Cursorjacking".
>>>>
>>>> 5. *Timing check: [ I need some help here ]* Conceptually, we
>>>> would like to track whenever a protected region must be *redrawn
>>>> (--show-property-changed-rects, I think, is the Blink concept)* AND
>>>> when the cause of that redraw originated from a different document
>>>> context. We want to trigger the heuristic if an enclosing frame overlapped
>>>> the protected region within the specified time interval, but we don't want
>>>> to trigger if redraws originate from within the same document context as
>>>> the protected area. (e.g. if the button itself has a mouseover animation)
>>>> I'm really not sure how this part works or how to describe it in generic
>>>> terms here. Can we propagate the source and timing of redraw triggers to
>>>> the protected hit test rects or our collection of DOM nodes?
>>>>
>>>> 6. *Obstruction check:* Compare two sets of pixels: the *control
>>>> image* is the protected area as if it were rendered alone, unobstructed
>>>> by pixels originating from any other document context. (If step 2
>>>> optimizations were performed, this should be readily available in its own
>>>> composited layer.) The *user image* represents the same area as the *control
>>>> image* in the outermost document's coordinate system and contains the
>>>> final set of common pixels for the fully rendered page. These images are
>>>> compared, and if the differences are below the *tolerance* threshold
>>>> associated with the input-protection directive, proceed to deliver the
>>>> event normally, otherwise proceed to *Violation management*. If
>>>> portions of the *control image* are clipped by the root viewport in
>>>> the outermost document's coordinate system, all such pixels must be
>>>> considered not to match. [*I don't know enough to say whether the
>>>> comparison can be done on the GPU without marshaling the pixels of the
>>>> control image backing store back to software, or if this is even worth
>>>> mentioning here…]*
>>>>
>>>>
>>>>
>>>> Notes: applying protections to the entire document (if it itself would
>>>> consist of multiple composited layers) or using the input-protection-clip
>>>> property may make many of these optimizations impossible, and may impose
>>>> performance penalties on the page, perhaps forcing it to fall back to
>>>> all-software rendering.
>>>> We should have text warning authors of this, or
>>>> should we simply remove those options from the spec and require
>>>> input-protection-selectors only, to better match the internal strategies of
>>>> modern browser rendering?
>>>>
>>>>
>>>>
>>>
>>>
>>
>
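For illustration, the obstruction check in step 6 of the quoted heuristic can be sketched roughly as below. This is only a sketch under stated assumptions, not what any browser does: the function name `obstruction_check`, the `clipped` mask, and the representation of images as equally sized grids of RGB tuples are all hypothetical. The draft also does not say how *tolerance* is measured, so the sketch assumes it is the maximum allowed fraction of mismatched pixels and treats a difference equal to the threshold as acceptable.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]          # hypothetical RGB pixel representation
Image = List[List[Pixel]]             # rows of pixels, all rows the same width


def obstruction_check(
    control: Image,
    user: Image,
    tolerance: float,
    clipped: Optional[List[List[bool]]] = None,
) -> bool:
    """Return True if the event may be delivered, False on a violation.

    Assumption: tolerance is the maximum allowed fraction of mismatched
    pixels (0.0 means the two images must be identical).
    """
    height = len(control)
    width = len(control[0]) if height else 0
    if height != len(user) or any(len(row) != width for row in control + user):
        raise ValueError("control and user images must have the same size")
    total = width * height
    if total == 0:
        raise ValueError("images must contain at least one pixel")

    mismatched = 0
    for y in range(height):
        for x in range(width):
            # Pixels clipped by the root viewport are always counted as
            # mismatches, as step 6 requires.
            if clipped is not None and clipped[y][x]:
                mismatched += 1
            elif control[y][x] != user[y][x]:
                mismatched += 1
    return mismatched / total <= tolerance
```

A caller would deliver the event when the function returns True and proceed to violation management otherwise. Whether the comparison could run on the GPU instead, as the open question in step 6 asks, is not addressed by this sketch.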
Received on Tuesday, 15 October 2013 22:12:23 UTC