Re: [mediacapture-surface-control] Is gesture forwarding tied to capture controller or to MediaStreamTrack or to DOM objects? (#45) from Elad Alon via GitHub on 2025-01-10 (public-webrtc-logs@w3.org from January 2025)

From: Elad Alon via GitHub <sysbot+gh@w3.org>
Date: Fri, 10 Jan 2025 13:48:33 +0000
To: public-webrtc-logs@w3.org
Message-ID: <issue_comment.created-2582752254-1736516910-sysbot+gh@w3.org>

> I also got feedback that use cases might need this overlay to be clickable, so this might require some new CSS feature like maybe .overlay { wheel-events: none }. I'll try to reach out to some CSS folks for comment.

Assume a sample Web app that has a mostly-transparent canvas element overlaid on top of a video element. Scroll events should lead to the captured surface being scrolled, and click events should manipulate something on the canvas (such as annotations).

There exists at least one such application - Google Meet - which proves that this is an interesting pattern that Web developers are actually likely to employ. So this is **not** a purely academic discussion.

Let's examine whether limiting wheel-forwarding to video elements is at odds with this use case. Theoretically speaking, Web applications can make use of `pointer-events: none` to forward **both** scrolls and clicks from the canvas to the video element, then:
* Scroll events are consumed by `forwardWheel()`, which makes the user agent forward these events from the video element to the captured tab.
* Click events trigger an event handler on the video, which then "manually" computes the offset and **reproduces** a synthetic click event at the relevant offset back on the canvas. That is, if the canvas's click handler previously invoked `foo(x, y)`, the video element's handler can now invoke it instead.
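
To make the click half of this workaround concrete, here is a minimal sketch of the offset computation, assuming the canvas has `pointer-events: none` so that the click lands on the video underneath. All names (`mapClickToCanvas`, `relayClicks`, the structural interfaces, and `foo`) are hypothetical and only for illustration; the interfaces are stand-ins for the DOM objects so that the arithmetic can be read and run in isolation. The wheel half is not shown; it would remain the existing `forwardWheel()` call on the controller.

```typescript
// Hypothetical stand-ins for DOM types; not part of any spec.
interface Rect { left: number; top: number; width: number; height: number; }
interface Point { x: number; y: number; }
interface ClickLike { clientX: number; clientY: number; }
interface ClickSource {
  addEventListener(type: "click", listener: (e: ClickLike) => void): void;
}
interface CanvasLike {
  width: number;   // drawing-buffer width
  height: number;  // drawing-buffer height
  getBoundingClientRect(): Rect;
}

// Map viewport coordinates of a click (as received by the video's handler)
// to coordinates in the canvas's drawing buffer. Returns null when the click
// falls outside the canvas's on-screen box or the box has no area.
function mapClickToCanvas(
  clientX: number, clientY: number,
  canvasRect: Rect, bufferWidth: number, bufferHeight: number,
): Point | null {
  if (canvasRect.width <= 0 || canvasRect.height <= 0) return null;
  const relX = clientX - canvasRect.left;
  const relY = clientY - canvasRect.top;
  if (relX < 0 || relY < 0 ||
      relX >= canvasRect.width || relY >= canvasRect.height) {
    return null;
  }
  // Scale from CSS pixels to drawing-buffer pixels.
  return {
    x: relX * (bufferWidth / canvasRect.width),
    y: relY * (bufferHeight / canvasRect.height),
  };
}

// Relay clicks that reach the video to whatever the canvas used to do
// with them, here the hypothetical foo(x, y).
function relayClicks(
  video: ClickSource, canvas: CanvasLike,
  foo: (x: number, y: number) => void,
): void {
  video.addEventListener("click", (e) => {
    const p = mapClickToCanvas(
      e.clientX, e.clientY,
      canvas.getBoundingClientRect(), canvas.width, canvas.height);
    if (p !== null) foo(p.x, p.y);
  });
}
```

Even in this reduced form, the application has to own coordinate mapping and bounds checks that the browser's hit testing would otherwise handle, which is the ergonomic cost in question.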

**Possible?** Seems like it. (Modulo limitations we might hear from Web developers.)
**Ergonomic?** No.

We should weigh the hardship this places on Web developers against the security benefits conferred by limiting the API to video elements. Those benefits have not yet been articulated.

> The exact restrictions can remain vague to allow UAs to experiment.

Agreed on this point - **if** we find that Web developers can still use the API when it is limited to video elements, and **if** we make this pivot, then we should still leave it to UAs to experiment with heuristics, and revisit specifying additional limitations at a later time.

> await videoElement1.forwardGestures(true);

(The following is feedback on specific issues with the [above proposal](https://github.com/w3c/mediacapture-surface-control/issues/45#issuecomment-2546927355), and should not be misunderstood as endorsement of the general thrust of the proposal.)

There is nothing in this shape to tie the video element to the captured surface, whereas the current API shape does. Recall:
```webidl
partial interface CaptureController {
  Promise<undefined> forwardWheel(HTMLElement element);
};
```

It'd be better to just s/`HTMLElement`/`HTMLVideoElement`/ in the original shape than to make this change to `HTMLVideoElement.forwardGestures()`.

> it's not possible to forward gestures from two different elements

No Web developer has articulated this requirement, and I don't imagine it to be necessary. **If** we were ever to determine it as necessary, then `HTMLVideoElement.forwardWheel(controller, isOn)` might be more reasonable. (The same caveat of not misreading this message as endorsement still applies.)

-- 
GitHub Notification of comment by eladalon1983
Please view or discuss this issue at https://github.com/w3c/mediacapture-surface-control/issues/45#issuecomment-2582752254 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Friday, 10 January 2025 13:48:33 UTC