Re: [mediacapture-screen-share-extensions] Consider dropping permission for captured surface control APIs (#14) from Elad Alon via GitHub on 2024-10-25 (public-webrtc-logs@w3.org from October 2024)

From: Elad Alon via GitHub <sysbot+gh@w3.org>
Date: Fri, 25 Oct 2024 13:56:34 +0000
To: public-webrtc-logs@w3.org
Message-ID: <issue_comment.created-2437850738-1729864591-sysbot+gh@w3.org>

> Permissions are necessary when undesirable behaviors are indistinguishable from desirable ones

The established TAG design principle [is](https://w3ctag.github.io/design-principles/#consent): "If a useful feature has the potential to cause harm to users, make sure that the user can give meaningful consent for that feature to be used, and that they can **refuse consent effectively**."

A call to `getDisplayMedia()` invokes a prompt asking the user whether they want to share the currently-visible pixels of another surface. That is an established prompt, and accepting it does not indicate to the browser any other intention on the user's part. The way to understand the **user's intention** is to **prompt** them. If screen-sharing implicitly allows scrolling/zooming without an additional prompt, then the aforementioned principle is violated.

Prompts might not be perfect, but they are **better than guessing**. It is perfectly spec-compliant for the user agent to (i) augment the permission policy with heuristics, (ii) skip the additional prompt, or (iii) modify the `getDisplayMedia()` prompt. The current spec leaves user agents considerable latitude in how they implement it. This is desirable, and usually **aids consensus formation**.

> Safari usually requires synchronous prompts

Chrome usually uses async prompts.
It is quite useful that the API shape proposed already accommodates both these design philosophies, as well as others.
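
To illustrate why a promise-based shape can accommodate both philosophies, here is a minimal sketch. It is a hypothetical model, not the real API: `UserAgentModel`, `requestSurfaceControl`, `upFrontAgent`, `deferredPromptAgent` and `enableRemoteScroll` are all names invented for illustration. The assumption is that the page only ever awaits a decision, so it does not matter whether the user agent already knows the answer or asks the user later.

```typescript
// Hypothetical model, not the real API: every name below is illustrative.
type Decision = "granted" | "denied";

interface UserAgentModel {
  // Resolves once the user agent knows whether the page may control the captured surface.
  requestSurfaceControl(): Promise<Decision>;
}

// A user agent that settled the question up front (for example, as part of the
// capture prompt), so the answer is already known when the page asks.
function upFrontAgent(answer: Decision): UserAgentModel {
  return { requestSurfaceControl: () => Promise.resolve(answer) };
}

// A user agent that shows a separate prompt and answers whenever the user responds.
// `respond` stands in for the user clicking a button in that prompt.
function deferredPromptAgent(): { ua: UserAgentModel; respond: (d: Decision) => void } {
  let settle: (d: Decision) => void = () => {};
  const pending = new Promise<Decision>((resolve) => {
    settle = resolve; // the executor runs synchronously, so `settle` is set before we return
  });
  return { ua: { requestSurfaceControl: () => pending }, respond: (d) => settle(d) };
}

// Page code is identical in both cases: await the decision, then act on it.
async function enableRemoteScroll(ua: UserAgentModel): Promise<string> {
  const decision = await ua.requestSurfaceControl();
  return decision === "granted" ? "scroll forwarding on" : "scroll forwarding off";
}
```

In this model the page-side function never needs to know which kind of user agent it is talking to.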

> > * Emojis on top
>
> is it even needed for the initial MVP?

Yes, it is. The MVP is informed by the real-world requirements of Web developers, and by the scenarios they tell us they need to solve.

> > The MVP to me is to forward user gestures for a video element that is fully displayed and without any other element above it.
>
> It seems implementable without relying on users trusting the website

This claim is unsubstantiated; the counterarguments were given in earlier comments, which explained that the representation in the video element might not be faithful. ([[1]](https://github.com/w3c/mediacapture-screen-share-extensions/issues/14#issuecomment-2431303180), [[2]](https://github.com/w3c/mediacapture-screen-share-extensions/issues/13#issuecomment-2422712549))

To name another counterargument - a malicious site can get the user to scroll somewhere, then either:
- Pop a video element where the user was already scrolling.
- Have the video already there, but obscured by another element, then remove the obscuring element.

Reminder - I am NOT saying that the prompt is intended to stop clickjacking (recall [here](https://github.com/w3c/mediacapture-screen-share-extensions/issues/14#issuecomment-2434741283)). Rather, I am saying that the **proposed alternative** of limiting-scrolling-to-video-element is susceptible to clickjacking, and it therefore fails to add any value, let alone obviate any security measures.
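
The second attack in the list above can be sketched with a toy hit-testing model. This is an illustration under stated assumptions, not any browser's algorithm: `Rect`, `PageElement`, `topmostAt` and `wouldForwardWheel` are hypothetical names, and the "forward only when the video is the topmost element under the pointer" rule is my simplified reading of the proposed alternative.

```typescript
// Illustrative model only: these types and functions are hypothetical, not part of any spec.
interface Rect { x: number; y: number; w: number; h: number; }
interface PageElement { id: string; rect: Rect; z: number; visible: boolean; }

function contains(r: Rect, px: number, py: number): boolean {
  return px >= r.x && px < r.x + r.w && py >= r.y && py < r.y + r.h;
}

// Returns the id of the topmost visible element under the point, or null if there is none.
function topmostAt(page: PageElement[], px: number, py: number): string | null {
  let best: PageElement | null = null;
  for (const el of page) {
    if (el.visible && contains(el.rect, px, py) && (best === null || el.z > best.z)) {
      best = el;
    }
  }
  return best ? best.id : null;
}

// Simplified rule: forward a wheel gesture to the captured surface only if the
// video preview is the topmost element under the pointer at that instant.
function wouldForwardWheel(page: PageElement[], px: number, py: number): boolean {
  return topmostAt(page, px, py) === "video";
}

// A page where the video preview starts out hidden behind an opaque cover.
const page: PageElement[] = [
  { id: "article", rect: { x: 0, y: 0, w: 800, h: 600 }, z: 0, visible: true },
  { id: "video", rect: { x: 0, y: 0, w: 400, h: 300 }, z: 1, visible: true },
  { id: "cover", rect: { x: 0, y: 0, w: 400, h: 300 }, z: 2, visible: true },
];

// The user is scrolling at (50, 50), believing they are scrolling the page itself.
const before = wouldForwardWheel(page, 50, 50); // false: the cover is on top

// The page removes the cover mid-gesture; the pointer has not moved.
page[2].visible = false;
const after = wouldForwardWheel(page, 50, 50); // true: the rule is now satisfied
```

The rule is satisfied at the moment of the second check, yet nothing in the model captures whether the user meant to scroll the captured surface.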

> I'm open to extending MVP to solving emojis.

The MVP is informed by Web developers' stated needs. As explained, this includes a video completely obscured by a canvas, div or other element, on top of which developers place whatever other elements they need.

-- 
GitHub Notification of comment by eladalon1983
Please view or discuss this issue at https://github.com/w3c/mediacapture-screen-share-extensions/issues/14#issuecomment-2437850738 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Friday, 25 October 2024 13:56:35 UTC