Re: Screensharing: Bootstrapping Collaboration between Capturer and Capturee

From: T H Panton <tim@pi.pe>
Date: Wed, 16 Jun 2021 11:30:43 +0100
Message-Id: <879887EB-0177-4AE7-B937-8D02EBF19A96@pi.pe>
Cc: Harald Alvestrand <harald@alvestrand.no>, Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>, Youenn Fablet <youenn@apple.com>, Jan-Ivar Bruaroey <jib@mozilla.com>, WebRTC WG <public-webrtc@w3.org>
To: Elad Alon <eladalon@google.com>
Ha, I see the problem now: I’ve got comfortable in a flat, unique token namespace and forgotten how ugly the outside world is!

I’m afraid this tells me that the idea of just having a token doesn’t work - we need to define a namespace to assign meaning to it.
I have a couple of possibilities in mind, the simplest being mDNS: you look up the token in mDNS and retrieve an SRV record, which tells you how to connect.
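
A minimal sketch of that lookup, assuming a hypothetical service type `_capture-control._tcp` and an in-memory table standing in for a real mDNS query (neither appears in this thread). The SRV fields follow RFC 2782; choosing the lowest priority and breaking ties by highest weight is a simplification of the weighted selection the RFC describes.

```typescript
// SRV record fields as defined in RFC 2782.
interface SrvRecord {
  priority: number;
  weight: number;
  port: number;
  target: string;
}

// Hypothetical service type; nothing in the thread defines one.
const SERVICE_TYPE = "_capture-control._tcp";

// Build the DNS-SD style instance name the token would be looked up under.
function serviceNameForToken(token: string): string {
  return `${token}.${SERVICE_TYPE}.local`;
}

// Stand-in for a real mDNS query: a function from name to advertised records.
type Resolver = (name: string) => SrvRecord[];

function makeTableResolver(table: Record<string, SrvRecord[]>): Resolver {
  return (name) => table[name] || [];
}

// Returns "host:port" for the preferred record, or null if the token is unknown.
function resolveToken(token: string, resolve: Resolver): string | null {
  const sorted = resolve(serviceNameForToken(token))
    .slice()
    // Lower priority value is preferred; ties go to the higher weight.
    .sort((a, b) => a.priority - b.priority || b.weight - a.weight);
  const best = sorted[0];
  if (best === undefined) return null;
  return `${best.target}:${best.port}`;
}
```

The point of the sketch is that the service type, not the bare token, supplies the namespace: the same token string under a different service type resolves to a different name.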

Adding an origin only helps if you have a very specific solution in mind, i.e. server-mediated remote control of a well-known web app from the same vendor that happens to be on different origins.
It does not allow you to control a local native app (e.g. Keynote), which seems to me very necessary. It also doesn’t allow a native capture app to control a web page (e.g. SlideShare), which is also desirable.
Adding an origin also provides a hook for anti-competitive practice.

If we are going to limit communication to “same browser instance” then we would be better off with a port we can invoke postMessage() on.
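
A minimal sketch of what the port approach could look like from the capturer's side. The `ControlMessage` shapes and the `sendControl` helper are hypothetical; the thread does not define a message format or say how the port would be handed over.

```typescript
// Minimal structural type: a real MessagePort satisfies it, and so does a test double.
interface PortLike {
  postMessage(message: unknown): void;
}

// Hypothetical control messages, for illustration only.
type ControlMessage =
  | { kind: "next-slide" }
  | { kind: "previous-slide" }
  | { kind: "goto-slide"; index: number };

// Post one typed control message to whatever port the capturer was given.
function sendControl(port: PortLike, message: ControlMessage): void {
  port.postMessage(message);
}

// In a browser the two ends might come from a MessageChannel:
//   const { port1, port2 } = new MessageChannel();
//   port2.onmessage = (event) => console.log(event.data);
//   sendControl(port1, { kind: "next-slide" });
```

With a port there is no token to interpret and no namespace to agree on; the two sides only need to agree on the message format.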

Generally I’m in favour of “as simple as possible” - however Einstein did add “but no simpler.” 
It seems to me the token/origin solution is too simple - it solves a small part of the problem and leaves a slightly smaller new problem.


> On 15 Jun 2021, at 22:23, Elad Alon <eladalon@google.com> wrote:
> Consider the alternative.
> Different apps have different ID spaces. 0x1234567890abcdef might be a valid ID on multiple services. Sometimes even for the same overarching company. When a captured app claims to be 0x1234567890abcdef, is that a Vimeo video, a Google Doc, a Google Slides deck, a Microsoft Word session, a CodePen...? And the list goes on.
> What's left for the app to do? I see two mutually-non-exclusive options:
> 1. Prefix the session ID with an identifier. E.g. MsWord:0x1234567890abcdef. This is vulnerable to either spoofing or unintended clashes, though. "Works sometimes." Better marry that with #2.
> 2. Verify 0x1234567890abcdef on some shared cloud infrastructure. "0x1234567890abcdef, you say? Let's have some remote challenge to see if you're who you really claim you are." This will take at least an RTT, though...
> You escape this conundrum if you allow UA-mediated origin-exposure. (Optional origin exposure, btw, which means that you can send out an opaque token if you need to.)
> On Tue, Jun 15, 2021 at 9:49 PM Tim Panton <tim@pi.pe> wrote:
> > On 15 Jun 2021, at 20:37, Harald Alvestrand <harald@alvestrand.no> wrote:
> > 
> > The point of the origin is that the UA vouches for its authenticity.
> > 
> > The token is just a string. As long as you can pass a string, the app can choose to pass anything: numbers, tokens, or jsonified objects. App's business, not UA business.
> I still don’t understand why the origin is relevant - apart from enabling a page to engage in anti-competitive behaviour.
> The token is only exchanged when a user has authorised a capture and both sides agree that the captured page is suitable for remote control.
> So why would either side care what the origin is?
> T.

Received on Wednesday, 16 June 2021 10:31:25 UTC