- From: Ian Hickson <ian@hixie.ch>
- Date: Tue, 13 Jul 2010 23:49:52 +0000 (UTC)
Several people have asked me to elaborate on the vision I have for the <device> element and the ConnectionPeer interface, which are currently specified in draft form here:

   http://www.whatwg.org/specs/web-apps/current-work/complete/commands.html#devices
   http://www.whatwg.org/specs/web-apps/current-work/complete/commands.html#connectionpeer

There are several use cases that are informing this work:

 - Video conferencing
 - Real-time games
 - Peer-to-peer file transfer

For video conferencing, we need a mechanism through which a local camera and microphone setup can be enabled and access to which can be granted to scripts in the page, but only with the user's consent and without being vulnerable to the user being tricked (e.g. through click-jacking). There then needs to be a way to transmit this stream of video to another peer's browser, for display. There is also a need to display the stream locally.

For real-time games, we need a mechanism that can send data to a peer or server, where the latency is more important than reliability. That is, it's not critical that packets arrive in order (or at all); once a packet is late, it's useless and there is no reason to resend it. For example, transient game state information becomes stale quickly. (Video is similar; only data for the latest frame is interesting: data for older frames is not useful.)

For peer-to-peer file transfer, we need a mechanism to send data, either in the form of text or binary, to another peer.

In all three cases, we need a mechanism to establish a connection to another peer. This is a non-trivial problem, because of NAT and firewalls. Therefore, we may need help from a third-party server to establish the connection.

Ideally, I'd like the solution to involve no special code at the JS level, and I'd like the server-side connection helper to be something that can be implemented by just setting up a server with no special code and minimal configuration.
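To make the real-time games requirement above a bit more concrete, here is a minimal sketch of "latency over reliability" on the receiving side: late packets are simply dropped rather than re-requested. The StatePacket shape, the sequence-number scheme, and the LatestOnlyReceiver name are assumptions made for illustration only; nothing here is taken from the draft.

```typescript
// Hypothetical packet shape: a sender-assigned sequence number plus a payload.
interface StatePacket<T> {
  seq: number;
  payload: T;
}

// Keeps only the newest state. Late or duplicate packets are discarded
// instead of being re-requested, since stale state is useless.
class LatestOnlyReceiver<T> {
  private latestSeq = -1;
  private latest: T | undefined = undefined;
  private dropped = 0;

  // Returns true if the packet was accepted as the newest state.
  receive(packet: StatePacket<T>): boolean {
    if (packet.seq <= this.latestSeq) {
      this.dropped += 1; // a newer state has already been applied
      return false;
    }
    this.latestSeq = packet.seq;
    this.latest = packet.payload;
    return true;
  }

  current(): T | undefined {
    return this.latest;
  }

  droppedCount(): number {
    return this.dropped;
  }
}

// Packets 1 and 3 arrive, then the delayed packet 2 shows up and is ignored.
// Packet 4 never arrives at all, and nobody asks for it again.
const receiver = new LatestOnlyReceiver<{ x: number; y: number }>();
receiver.receive({ seq: 1, payload: { x: 0, y: 0 } });
receiver.receive({ seq: 3, payload: { x: 2, y: 1 } });
receiver.receive({ seq: 2, payload: { x: 1, y: 0 } });
receiver.receive({ seq: 5, payload: { x: 4, y: 2 } });
```

A reliable, ordered channel would instead stall on the missing packet 4, which is exactly the behaviour that hurts in this use case.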
Thus all the complexity would be in the browsers and in these reusable servers. The only work the author needs to do in this vision is have scripts running in the browsers of both peers decide that they want to connect to each other, and have them exchange some opaque information and information about the helper server, e.g. by sending a message through XMLHttpRequest to the server to bounce to the other user.

To do this, there have to be three configuration strings:

 1. A description of the helper server to give to the browsers, so they know who to contact to get things started.
 2. A description of the first peer that contains all the information the second peer needs to connect to the first peer.
 3. A description of the second peer that contains all the information the first peer needs to connect to the second peer.

The server can generate the data for 1. The browser for the first peer can generate the data for 2. The browser for the second peer can generate the data for 3. The script then just needs to get the data for 1 and give it to the two peers, the data for 2 and give it to the second peer, and the data for 3 and give it to the first peer.

Once the browsers have the information, they can set up a connection to each other using the helper server.

For example, in the simplest case, the information for 2 is just the IP address and listening port of the first peer, the information for 3 is just the IP address and listening port of the second peer, and the browsers just have to each open a TCP connection to each other to set up a control channel over which everything else can be done.

Now of course things aren't that simple -- in practice the information will have to include details such as keys to make sure the browsers can prove to each other that they are who they expect to be, there will have to be information obtained from the helper server (e.g. the IP address of the NAT router), and there will likely be information about what protocols are supported, and so on. None of this has to be exposed to the script, though, which is the great thing about this mechanism.

So what is missing here? Well, the main thing missing in this vision is the format of the configuration strings, and rules for how they are to be interpreted. We need a specification that defines how the helper servers and browsers are to interact, and how, once they have a control channel set up, they are to set up further channels, such as a way to be ready to receive UDP packets for a video stream, how to get ready to receive a file, and so on.

I believe this is a self-contained problem, which can be defined as a black box that exposes only the following:

Browser side:

   Methods to:
    - add a configuration string from the other peer
    - add a configuration string from the helper server
    - open a connection
    - close a connection
    - send plain text reliably
    - send plain text unreliably, with low latency
    - send binary data reliably
    - send binary data unreliably, with low latency
    - start sending a stream
    - stop sending a stream

   Callbacks for:
    - getting a configuration string for the other side
    - the connection being established
    - a permanent failure to establish a connection
    - the connection being shut down
    - receiving text
    - receiving binary data
    - being notified of a new stream being received
    - being notified that a stream has been dropped

Server side:

   Callbacks for:
    - getting a configuration string for the browsers

Everything else (how the connections work, what the configuration strings look like, etc) would just be something internal to that specification, and separate from HTML.

The black box could also include details like how to negotiate different bitrates for video streams; these again don't have to be exposed to the script, at least not in the first version.
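As a rough illustration of how little the page script would have to do, the exchange of the three configuration strings can be sketched against a toy in-memory stand-in for the black box. Everything named here (PeerCallbacks, FakePeer, fakeNetwork, the method names, and the string formats) is hypothetical and is not the ConnectionPeer draft API; only a subset of the methods and callbacks listed above is modelled, and the "network" is just a lookup table.

```typescript
// Hypothetical callback set, covering a few of the callbacks listed above.
interface PeerCallbacks {
  onLocalConfiguration(config: string): void; // string to hand to the other side
  onConnect(): void;
  onConnectionFailed(): void;
  onText(text: string): void;
}

// Stand-in for the real network: maps an opaque peer description to a peer.
const fakeNetwork: Record<string, FakePeer> = {};

class FakePeer {
  private helper: string | null = null;
  private remote: FakePeer | null = null;
  private readonly localConfig: string;

  constructor(label: string, private readonly callbacks: PeerCallbacks) {
    this.localConfig = `opaque-description-of-${label}`;
    fakeNetwork[this.localConfig] = this;
  }

  // "add a configuration string from the helper server" (data 1)
  addServerConfiguration(config: string): void {
    this.helper = config;
    // A real browser would talk to the helper server before producing this.
    this.callbacks.onLocalConfiguration(this.localConfig);
  }

  // "add a configuration string from the other peer" (data 2 or 3)
  addPeerConfiguration(config: string): void {
    const peer: FakePeer | undefined = fakeNetwork[config];
    if (this.helper === null || peer === undefined) {
      this.callbacks.onConnectionFailed();
      return;
    }
    this.remote = peer;
    // The toy connection is "established" once both sides know each other.
    if (peer.remote === this) {
      this.callbacks.onConnect();
      peer.callbacks.onConnect();
    }
  }

  // "send plain text reliably"
  sendText(text: string): void {
    if (this.remote === null || this.remote.remote !== this) {
      throw new Error("not connected");
    }
    this.remote.callbacks.onText(text);
  }
}

// The page script: it only shuttles opaque strings around.
const log: string[] = [];
const callbacksFor = (who: string, keep: (config: string) => void): PeerCallbacks => ({
  onLocalConfiguration: keep,
  onConnect: () => { log.push(`${who} connected`); },
  onConnectionFailed: () => { log.push(`${who} failed`); },
  onText: (text) => { log.push(`${who} got: ${text}`); },
});

const helperConfig = "helper.example:3478"; // data 1, made-up format
let configFromFirst = "";
let configFromSecond = "";
const first = new FakePeer("first", callbacksFor("first", (c) => { configFromFirst = c; }));
const second = new FakePeer("second", callbacksFor("second", (c) => { configFromSecond = c; }));

first.addServerConfiguration(helperConfig);   // data 1 goes to both peers
second.addServerConfiguration(helperConfig);
second.addPeerConfiguration(configFromFirst); // data 2 goes to the second peer
first.addPeerConfiguration(configFromSecond); // data 3 goes to the first peer
first.sendText("hello");
```

In a real page the two peers live in different browsers, so each assignment to configFromFirst or configFromSecond would be replaced by a round trip through the site's own server, for example via XMLHttpRequest. After the script above runs, log holds "first connected", "second connected" and "second got: hello", in that order.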
Since the browser is the one generating the stream and the one doing the connection work, the black box can directly control the stream generation as well.

If this is something that resonates with people, then what we really need is for a group of people to get together and design this black box. If people would like to do it through the WHATWG I'd be happy to create a mailing list and whatever other infrastructure is needed for this purpose; alternatively it could be done at the IETF or in some other group.

If there are any questions about this, please feel free to ask them; I'll try to explain what I envision here in more detail. If this isn't a direction people are interested in, then that's fine too. :-)

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 13 July 2010 16:49:52 UTC