
Re: Screensharing: Bootstrapping Collaboration between Capturer and Capturee

From: T H Panton <tim@pi.pe>
Date: Tue, 15 Jun 2021 10:42:52 +0100
Message-Id: <FC64AC32-296C-437E-A2CB-746BA4A55157@pi.pe>
Cc: Youenn Fablet <youenn@apple.com>, Jan-Ivar Bruaroey <jib@mozilla.com>, Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>, Elad Alon <eladalon@google.com>, WebRTC WG <public-webrtc@w3.org>
To: Harald Alvestrand <harald@alvestrand.no>
I think there is a strong solution that _is_ a WebRTC matter, and that we might want to move to in the future.

- We should use a WebRTC data channel as the path between the capturer and captured apps.
This has the advantage of being cross-browser (in every sense), since it would work if one were using Chrome to capture a Firefox tab, etc.
It is also possible for standalone presentation apps to offer the same feature, if they included one of the free-standing data channel stacks.

The way I envisage it working is that the captured tab offers an opaque token (as in Elad’s proposal) to the capturer when the user selects
a window to capture (this would not apply to region or whole-screen captures). The token does not include any clues about origin (the target may be a native app),
nor does it offer the ability to filter by provenance. This token could perhaps be an mDNS SRV record for a data channel.
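
A minimal sketch of that token flow, for illustration only. It assumes a hypothetical `CaptureBroker` standing in for whatever hands the token to the capturer, and it models the data channel as a plain callback rather than a real WebRTC connection. None of these names come from Elad's proposal or from any browser API.

```typescript
// Hypothetical model: CaptureBroker, Capturee and randomToken are invented
// names for this sketch, not part of any proposal or browser API.

type Message = string;

interface Capturee {
  // Called when the capturer sends a message over the bootstrapped channel.
  onMessage(msg: Message): void;
}

function randomToken(): string {
  // Math.random is for illustration only; real code would use a CSPRNG.
  let out = "";
  for (let i = 0; i < 32; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

class CaptureBroker {
  private readonly captureesByToken = new Map<string, Capturee>();

  // The captured window (browser tab or native app) registers itself and
  // receives an opaque token. The token is random, so it carries no clue
  // about origin or about what kind of app is behind it.
  register(capturee: Capturee): string {
    const token = randomToken();
    this.captureesByToken.set(token, capturee);
    return token;
  }

  // The capturer receives the token once the user has picked that window,
  // and exchanges it for a send function (standing in for a data channel).
  connect(token: string): ((msg: Message) => void) | undefined {
    const capturee = this.captureesByToken.get(token);
    return capturee ? (msg) => capturee.onMessage(msg) : undefined;
  }
}

// Usage: the capturee registers, the capturer connects with the token.
const broker = new CaptureBroker();
const received: Message[] = [];
const token = broker.register({ onMessage: (m) => { received.push(m); } });
const send = broker.connect(token);
send?.("next-slide");
```

Because the capturer only ever sees the random token, nothing in this model lets it tell a browser tab from a native window, which is the property the paragraph above asks for.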

My feeling is that we should move forward with Elad’s proposal, but remove any requirements that preclude this path in future (e.g. the origin tests), and explicitly
include the concept that this feature might be available from captured native windows too.


> On 15 Jun 2021, at 07:07, Harald Alvestrand <harald@alvestrand.no> wrote:
> Embedding the specific UA controls in the browser (proposal A and B) is a layering violation.
> The WebRTC API and the platform do not offer a presentation feature; they offer a platform on which applications can be built, some of which embed the concept of "forward slide" and "back slide".
> If you want to propose creating a complete framework for presentations and presentation management, and propose embedding the whole thing in the browser, you're free to do so, but that's not a WEBRTC matter (it might be a webapps matter), and I do not think it is at all appropriate to pursue such a course.
> Letting a controller component and a controlled component for an application find each other and decide to communicate seems like an appropriate thing for the platform to offer.
> Trying to dictate what they talk about is not appropriate for WEBRTC.
> On 6/14/21 5:16 PM, Youenn Fablet wrote:
>>>> I really like APIs that allow untrusted parties to collaborate securely, allowing more freedom to the end user, and I think it is a goal worth pursuing.
>>> A communication channel between two untrusted parties is always something to look closely at, from a security and privacy standpoint.
>>> A safe model is to have the UA in the middle: the UA triggers these actions on behalf of the user, not on behalf of the capturer.
>>> For instance, by presenting UA UI in the capturer page to control capturee through actions, similar to picture-in-picture for VC.
>>> For instance if capturee page is playing a video, it might be convenient from capturer page to pause the capturee video using the play/pause actions.
>>> A further step would be to allow capturer page to blend well with this UA UI.
>>> So I hear 3 directions for presentation control actions:
>>> A: UA triggers these standard actions on behalf of the capturer.
>>> B: UA triggers these standard actions on behalf of the user, not on behalf of the capturer.
>>> C: Out of band between mutually participating properties only, based on id.
>>> I think both A and B sound promising. Youenn, it seems to me A might have slightly better ergonomics, so I'm curious about what risks B might mitigate.
>> A and B are not exclusive, B could be implemented and deployed faster than A.
>> I believe B is the current MediaSession approach and has the same security context.
>> A is an extension to this approach and needs additional scrutiny since now capturer, which is not as trusted as UA, is the one deciding to trigger these actions.
>> For instance, capturer might iterate through all slides while not displaying the captured live stream to the user.
>> Or multiple capturers can interact with the same capturee.
>> This is not to say it is impossible but it probably requires more work.

Received on Tuesday, 15 June 2021 09:43:28 UTC
