Re: [Bug 15206] Add API for sending and receiving p2p application data from Randell Jesup on 2012-01-27 (public-webrtc@w3.org from January 2012)

From: Randell Jesup <randell-ietf@jesup.org>
Date: Fri, 27 Jan 2012 03:30:34 -0800
To: public-webrtc@w3.org
Message-ID: <4F228ADA.5080602@jesup.org>
My apologies up front for the length...  I just worry this is a rabbit 
hole that the entire working group could go into and never return... :-)
Perhaps I'm wrong, of course, or perhaps there's something very tightly 
defined and limited we should do - I'm open to convincing on this.

On 1/26/2012 5:45 PM, Justin Uberti wrote:
>
>
> On Thu, Jan 26, 2012 at 7:32 PM, Randell Jesup <randell-ietf@jesup.org
> <mailto:randell-ietf@jesup.org>> wrote:

>     I'll note that creating a datastream here requires a full
>     renegotiation of the entire peerconnection (though one assumes the
>     other items don't change state and thus would end up being no-ops
>     after lots of processing occurs).
>
>
> Not true if JSEP is used; with JSEP you can handle this however you see
> fit, including having a set of pre-allocated channels that you use
> whenever, or having your own application protocol (like what you mention
> above) that allocates channels.

To be honest, you could do the same thing in ROAP, I believe.  The point 
was that, since data channels are application-specific and multiplexed 
with the other SCTP data streams, they may not need any explicit 
handling by the negotiation code.

As for my assertion about data channels being application-specific:

The contents of a data channel are not defined in these protocols 
currently.  Even if they're included in the offers and answers, without 
additional context describing them they're inherently application-specific.

This is, of course, much like how UDP & TCP ports work - without some 
other way to bind a port to a purpose (portmapper, etc) you have to use 
a fixed "well-known" port or a fixed application port, or you have to 
use an application protocol (like SDP) to define what data in what 
format you're going to put on that port.

We would need an entire framework for applications to tag and describe 
their data channels (for those cases where the data format is public and 
stable), or a JSON- or RPC-like self-descriptive data format.  And the 
description would have to cover not just format, but probably also 
semantics - what the object represents, why, and what interactivity 
with the data channel is expected.

We could describe the data channel with a MIME type, which might suffice 
for cases where the data is static (a file type of some sort); that 
doesn't handle all the other cases of dynamic data channels, and in many 
cases wouldn't give you the needed context to tell the receiving party 
what to do with it.  If an app gets a data channel from a different app 
with text/html in it, what does it do with that?

I don't think all this work would in the end actually add anything 
useful; I think cross-application data-channels are not a problem we 
should try to solve - at best leave that for the applications or another 
spec to layer on top of WebRTC.

So for me, the one remaining argument is multi-party via a central hub, 
using the same app.  ("Same app" to me also means different apps that 
explicitly know about each other and how to use the data protocols with 
that app.)

In some cases, the application protocol used would handle data channels 
in a star configuration by having the central node process them and 
distribute data over one or more data channels with each user.  A 
typical example would be a game server that processes incoming events 
and distributes state changes and events to each player.
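
A minimal sketch of that star configuration, under stated assumptions: 
GameHub, join and handle_event are hypothetical names, not part of any 
WebRTC API, and the per-player lists merely stand in for real data 
channels.

```python
from dataclasses import dataclass, field


@dataclass
class GameHub:
    """Hypothetical central node: owns the game state, one outbound queue per player."""
    positions: dict = field(default_factory=dict)
    outboxes: dict = field(default_factory=dict)  # player -> messages queued for that player

    def join(self, player):
        self.outboxes[player] = []
        self.positions[player] = (0, 0)

    def handle_event(self, player, dx, dy):
        # The hub interprets the event itself instead of forwarding it verbatim.
        x, y = self.positions[player]
        self.positions[player] = (x + dx, y + dy)
        update = {"player": player, "pos": self.positions[player]}
        # Every participant, including the sender, receives the resulting state change.
        for queue in self.outboxes.values():
            queue.append(update)


hub = GameHub()
hub.join("alice")
hub.join("bob")
hub.handle_event("alice", 2, 3)
```

Here the hub has to understand the message format, which is reasonable 
because the hub and the clients belong to the same application.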

In other cases, the hub might proxy the data channels from each 
participant through to the others.  An example of this might be a 
conference where a participant shared a document with the other 
participants.  In this case, the hub would need to take an incoming data 
channel and duplicate it to a data channel with each of the other 
participants.  This data channel opened with each of the other 
participants would likely be identical to the one the sender opened to 
it, except that the hub would likely want to also inform each receiver 
as to the origin of the data.  I'll blindly assert that this 
notification should be done through application-specific logic unless, 
again, we want to get deeply into describing the data.
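
As a rough illustration of the proxy case, here is a sketch in which the 
hub never inspects the payload and only attaches the origin.  ProxyHub 
and relay are hypothetical names, and the in-memory inboxes stand in for 
the data channels the hub would open to each receiver.

```python
class ProxyHub:
    """Hypothetical hub that duplicates an incoming channel's data to everyone else."""

    def __init__(self):
        self.inboxes = {}  # participant -> list of (origin, payload) delivered to them

    def join(self, participant):
        self.inboxes[participant] = []

    def relay(self, sender, payload: bytes):
        # The payload is treated as opaque bytes; only the origin is added.
        for participant, inbox in self.inboxes.items():
            if participant != sender:
                inbox.append((sender, payload))


conference = ProxyHub()
for name in ("alice", "bob", "carol"):
    conference.join(name)
conference.relay("alice", b"%PDF-1.4 ...")
```

How the origin is conveyed (a tuple here) is exactly the part left to 
application-specific logic.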

>
>
>     There are alternatives that would not require renegotiations of the
>     peerconnection, as mentioned in a previous message thread titled
>     "Data channel setup and signalling", starting 11/15/2011, or would
>     only *require* negotiating one channel and use an
>     application-specific protocol over that one to open additional ones.
>
>     http://www.ietf.org/mail-archive/web/rtcweb/current/msg02861.html
>
>     I'll repeat one bit from that which is relevant:
>
>           The other option is to have (some) data channels be separate
>         from media, in
>           particular app-specific anonymous data channels.  There's no
>         requirement for
>           describing the channels if they're private to the app, at
>         least to the first
>           approximation.  An app could pre-define data channel 3 as a
>         private message
>           structure for game map updates, so long as it knows its
>         talking to itself.
>
>
>
>     Unless there's some higher-level structure imposed on the data
>     somehow, data channels are inherently application-specific, so
>     generic stream specifications and "negotiation" of them actually
>     buys you very little (except *maybe* the first one).  This is
>     different from audio and video streams and their encodings.
>
>
> I thought about this a lot, but I think we probably want to impose
> higher-level structure, for two reasons:
> 1) Multiuser. Hasn't been talked about too much in the context of data
> channels, but without getting into too many details I think this is a
> critical use case. In multiuser, you need to be informed of what
> channels are available, changes as new channels become available, what
> their reliable/unreliable status is, what user they correspond to, etc.
> I admit that much of this could be done completely by the application,
> despite being inconsistent with the handling of audio and video, but
> there is an important exception:
> 2) 3rd party backend services. WebRTC is already spawning a number of
> startups creating AWS-style signaling, gatewaying, and conferencing
> services. For conferencing, the service will need to know the details of
> how multiple streams are configured within a WebRTC session, as well as
> the wire protocol, so that it can route the data properly without
> needing special cases for each application (just like audio and video).
> This implies we need to specify a way to negotiate the stream configuration.

This is basically saying (if I understand correctly) that people 
are/will be creating generic conferencing and interconnect 
hubs/gateways, and that these need to understand the data channels in 
order to deal with connecting them in a multi-party conference.  For 
example, the "share a file" example above.

My suggestion for this case is that the data channels provided by an app 
calling into these would need to somehow be tagged with an identifier 
known to the conference service, so that the service knows what to do 
with the datastream.

The best use case for this I can think of is a heterogeneous conference 
server with a mix of clients, where data channels offered by a user 
would be proxied (with added info as to the source) to each of the other 
users, who would only process it if they understood the data 
format/description/purpose.  Same-apps in the conference might easily do 
so, while different-apps might well ignore such data channels unless in 
a "well-known" format/tag - which means defining those formats/tags. 
Again, I think instead of pulling all of that into this effort, leave 
that effort to the applications to sort out and standardize separately 
if they wish.
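
On the receiving side, "only process it if they understood it" could be 
as simple as the sketch below.  The tag strings and make_receiver are 
made up for illustration; nothing here is a defined format or API.

```python
def make_receiver(handlers):
    """Build a receive function from a {tag: handler} table (hypothetical design)."""

    def receive(tag, origin, payload):
        handler = handlers.get(tag)
        if handler is None:
            return None  # unknown tag: this client ignores the channel
        return handler(origin, payload)

    return receive


# A client that only understands one application-defined tag.
whiteboard_client = make_receiver({
    "example.whiteboard/strokes": lambda origin, payload: f"{origin} sent {len(payload)} bytes",
})

understood = whiteboard_client("example.whiteboard/strokes", "alice", b"\x01\x02\x03")
ignored = whiteboard_client("example.game/map-update", "bob", b"\x00")
```

The table of known tags lives entirely in the application, which is why 
same-apps interoperate for free and different-apps need agreed tags.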

Maybe finding a standard way to define the MIME type of the stream when 
it's opened is the best minimal solution, with anything complex or 
application-defined or needing context being opaque binary.  (Even 
standardizing how to identify the source is a bit tricky in this context.)

>
> I'm certainly not dead-set on this approach, but as we've started
> building this stuff out, I'm finding that in many respects data has the
> same needs as audio and video.

Ok, but we have a well-established language and semantics for those (SDP 
+ RFCs defining what an audio/PCMU or audio/Opus, etc. type means - data 
format, negotiation semantics, etc.) that we're leveraging.  We don't 
have that for "data", and I worry that trying to go down that path leads 
to a long, long period of defining data interop & description methods, 
with minimal payback.  I'm not saying it wouldn't be nice to have that, 
especially for the heterogeneous conference case above, but I think that 
could be defined or de facto standardized in the apps and servers.


-- 
Randell Jesup
randell-ietf@jesup.org
Received on Friday, 27 January 2012 11:31:26 UTC