[mediacapture-screen-share] API for Screenshot (#160)

eladalon1983 has just created a new issue for https://github.com/w3c/mediacapture-screen-share:

== API for Screenshot ==
## Problem Statement
Some apps require a mechanism for taking a screenshot of the page. The canonical use-case is a mechanism that allows an end-user to provide feedback about defects to the app. (Note: Feedback to the app, not the browser, but potentially about behavior exhibited only on a given browser in a given edge case.)

Desirable attributes of this mechanism include:
* **Accurate**: Capture what the user is seeing, not an approximation thereof.
* **Secure**: The user should be able to give informed consent to the capture, probably with a preview of what was captured.
* **Ergonomic**: The user should not be able to choose the wrong thing. Extensions should not be required (discussed below).
* **Efficient**: Long processing times by the app are undesirable.
* **Economic**: When it comes to app size, less is more. Not requiring a JS library is nice.

## State of the Art
There is no current API for grabbing screenshots.

**Existing workarounds include:**
* Redrawing the DOM onto a canvas, then grabbing an image off of the canvas.
  * Inaccurate, especially for cross-origin content (which usually cannot be captured at all). Even for same-origin content, there is no guarantee that the user's issue will still be observable once the content is redrawn.
  * Inefficient and uneconomic.
* Using `getDisplayMedia`.
  * Insecure - encourages the user to overshare; sharing can exceed the single intended frame by quite a lot of time, and the user might not realize (due to a lack of understanding of app/browser separation).
  * Unergonomic - for both the user and the app. The user has to choose between multiple sources rather than just approve a single frame; the app has to worry that the user might share the wrong thing. (This also creates additional legal liability for the app if the user shares the wrong thing and it gets stored by the app's back-end.)
* Extensions.
  * Insecure - could require more permissions than intended. (E.g. could grab a video at any time, rather than a screenshot at user-approved times.)
  * Unergonomic - installing an extension just to take a single screenshot is problematic.
* Requiring the user to manually grab a screenshot and upload it.
  * Unergonomic - grabbing screenshots is beyond some users' technical abilities, and even savvy users find it a needless hassle.
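
For concreteness, the `getDisplayMedia` workaround has roughly the following shape. This is only a sketch: `grabSingleFrame`, `StreamLike` and `TrackLike` are hypothetical names, and the stream source and frame reader are injected so that the example does not depend on a browser. In a page, `openStream` would presumably wrap `navigator.mediaDevices.getDisplayMedia({ video: true })`, and `readFrame` would draw one video frame onto a canvas.

```typescript
// Minimal structural stand-ins for MediaStreamTrack / MediaStream.
// Assumption: only stop() and getTracks() are needed for this sketch.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}

// Hypothetical helper: open a capture, read exactly one frame, then stop
// every track so that capture does not continue in the background.
async function grabSingleFrame<Frame>(
  openStream: () => Promise<StreamLike>,
  readFrame: (stream: StreamLike) => Promise<Frame>,
): Promise<Frame> {
  const stream = await openStream();
  try {
    return await readFrame(stream);
  } finally {
    // Nothing forces the app to do this; an app that forgets (or chooses
    // not to) keeps receiving frames after the intended single one.
    for (const track of stream.getTracks()) {
      track.stop();
    }
  }
}
```

Note that ending the capture is entirely up to the app here, which is the oversharing risk described above.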

**Interest in this feature is showcased by:**
1. Existence and adoption of JS libraries that implement the redraw-onto-canvas workaround (e.g. [html2canvas](https://github.com/niklasvh/html2canvas)).
2. Known popular apps that provide this functionality via a work-around. (Most consumer-facing Google products; e.g. Meet, Docs, Slides. Search for "help X improve" or "send feedback" on any of these applications.)
3. Previous discussions this has spawned in this WG (e.g. issue #107, issue #145).

## Suggested Solution
* Introduce a new API for capturing a screenshot. The API will return a promise that either resolves with the captured image or is rejected.
* The captured image is shown to the user before it's handed over to the application.
  * The user can reject sharing the captured image with the application.
  * The user can manually crop the image (in the browser, prior to handing it to the application).
  * The user can black-out certain parts of the image (in the browser, prior to handing it to the application).
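
To make the flow above more concrete, here is a rough model of it. None of these names are proposed API surface: `captureScreenshot`, `ReviewDecision`, `applyEdits` and the `review` callback (standing in for the browser's preview UI) are all hypothetical, and the image is modelled as a plain grid of grayscale pixels rather than real image data. The sketch also assumes that black-out and crop rectangles are both expressed in coordinates of the original capture.

```typescript
// Stand-in for real image data: rows of grayscale pixel values.
type Image = number[][];

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// What the user decided in the (hypothetical) browser preview UI.
type ReviewDecision =
  | { accept: false }
  | { accept: true; crop?: Rect; blackOut?: Rect[] };

function contains(r: Rect, x: number, y: number): boolean {
  return x >= r.x && x < r.x + r.width && y >= r.y && y < r.y + r.height;
}

// Black out the chosen regions first, then crop.
function applyEdits(image: Image, crop?: Rect, blackOut: Rect[] = []): Image {
  const redacted = image.map((row, y) =>
    row.map((pixel, x) => (blackOut.some((r) => contains(r, x, y)) ? 0 : pixel)),
  );
  if (!crop) {
    return redacted;
  }
  return redacted
    .slice(crop.y, crop.y + crop.height)
    .map((row) => row.slice(crop.x, crop.x + crop.width));
}

// The app only ever sees the promise: it resolves with the edited image,
// or is rejected if the user declines to share.
async function captureScreenshot(
  grabFrame: () => Image,
  review: (preview: Image) => Promise<ReviewDecision>,
): Promise<Image> {
  const frame = grabFrame();
  const decision = await review(frame);
  if (decision.accept === false) {
    throw new Error("User declined to share the screenshot");
  }
  return applyEdits(frame, decision.crop, decision.blackOut);
}
```

The point of the model is that cropping and blacking out happen before the promise settles, so the application never observes the unedited capture.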

## Security Concerns
* This allows bypassing origin-isolation. However, we already allow this with `getDisplayMedia` (gDM), and soon with `getViewportMedia` (gVM); this new API will provide a less dangerous version of gVM, because (a) the captured image is presented to the user for approval, and (b) capture does not proceed in the background. Requiring mitigations similar to those of gVM could be discussed. (I hope we'll be able to use fewer, though, but we can start out conservative.)
* There is some concern that the user might not be able to visually inspect everything they are approving: a malicious page could carefully manipulate the opacity of cross-origin content, rendering it imperceptible to the user but clearly readable by a machine. I believe this to be a minor concern, as the highly user-driven nature of this API makes it unlikely that the user would approve capture by a malicious site. Also, this is a lesser concern than with gDM/gVM, where multiple frames can be captured, even long after the user forgets that they had approved a capture.

## TBD Details
Everything is open to discussion, but I would like to especially invite discussion over some points:
1. Maybe the user-agent MAY/SHOULD/MUST allow the user to manually re-grab a new snapshot. This makes the process even more user-driven.
2. How do we prevent spamming the user with requests on the one hand, while not blocking the user from rejecting a snapshot only to grab a new one after interacting with the page again? Imagine the user rejects a snapshot in order to adjust something on the page, then requests a new one. I have two ideas: throttle if the user clicks "don't ask again for this page," or throttle automatically, but show the user where in the browser they could re-enable prompting. It might be out of scope for the WG to mandate these, but knowing that UAs have options beyond the spec would give us confidence in our decision to ignore this issue in the spec.
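
As an illustration of the two throttling ideas in point 2, a user-agent could keep per-page state along these lines. This is purely a sketch of one possible UA policy, not something I am proposing for the spec; `PromptThrottle`, its methods and the `maxRejections` threshold are all made up for the example.

```typescript
// Hypothetical per-page throttling state kept by the user-agent.
class PromptThrottle {
  private readonly maxRejections: number;
  private consecutiveRejections = 0;
  private muted = false;

  // maxRejections: how many rejections in a row trigger automatic muting
  // (an assumed tuning knob, not a value from the proposal).
  constructor(maxRejections: number) {
    this.maxRejections = maxRejections;
  }

  // Whether the page may currently show the screenshot prompt.
  canPrompt(): boolean {
    return !this.muted;
  }

  // The user approved a capture; the page is evidently not spamming.
  recordAccepted(): void {
    this.consecutiveRejections = 0;
  }

  // The user rejected a capture. Mute either because they asked for it
  // explicitly, or automatically after too many rejections in a row.
  recordRejected(dontAskAgain: boolean): void {
    this.consecutiveRejections += 1;
    if (dontAskAgain || this.consecutiveRejections >= this.maxRejections) {
      this.muted = true;
    }
  }

  // The user re-enabled prompting from browser UI.
  reenableFromBrowserUi(): void {
    this.muted = false;
    this.consecutiveRejections = 0;
  }
}
```

With a threshold above one, a user who rejects a snapshot once to adjust the page can still be prompted again, while a page that keeps getting rejected is eventually muted.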

## Potential Compromises
Should we find ourselves struggling to reach consensus, I think we could consider providing a partial solution, then iterating. Namely:
* Can initially limit the API to single-origin applications. (Open to MAY/SHOULD/MUST.)
* Can initially mandate blacking out cross-origin iframes and/or resources. (Open to MAY/SHOULD/MUST.)
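
The two compromises could look something like the following inside a user-agent. Again, this is only a sketch under simplifying assumptions: the image is a grid of grayscale pixels, embedded frames are axis-aligned rectangles tagged with an origin string, and `applyCrossOriginPolicy`, `FrameRegion` and `CrossOriginPolicy` are hypothetical names.

```typescript
// An embedded frame, as a rectangle in capture coordinates plus its origin.
interface FrameRegion {
  origin: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// "refuse": only single-origin pages may be captured.
// "black-out": cross-origin regions are blanked before capture is returned.
type CrossOriginPolicy = "refuse" | "black-out";

// Returns the image to hand over, or null if capture must be refused.
function applyCrossOriginPolicy(
  image: number[][],
  pageOrigin: string,
  frames: FrameRegion[],
  policy: CrossOriginPolicy,
): number[][] | null {
  const foreign = frames.filter((f) => f.origin !== pageOrigin);
  if (foreign.length === 0) {
    // Single-origin page: nothing to hide, return a copy unchanged.
    return image.map((row) => row.slice());
  }
  if (policy === "refuse") {
    return null;
  }
  return image.map((row, y) =>
    row.map((pixel, x) =>
      foreign.some(
        (f) => x >= f.x && x < f.x + f.width && y >= f.y && y < f.y + f.height,
      )
        ? 0
        : pixel,
    ),
  );
}
```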

Please view or discuss this issue at https://github.com/w3c/mediacapture-screen-share/issues/160 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Tuesday, 30 March 2021 12:45:10 UTC