Re: [Bug 15206] Add API for sending and receiving p2p application data from Justin Uberti on 2012-01-30 (public-webrtc@w3.org from January 2012)

From: Justin Uberti <juberti@google.com>
Date: Sun, 29 Jan 2012 20:05:11 -0500
To: Randell Jesup <randell-ietf@jesup.org>
Cc: public-webrtc@w3.org
Message-ID: <CAOJ7v-1e_CjvQJB-pQhLHJQtAKpBJVe6Po3MjxOT+hmmYUO8uQ@mail.gmail.com>
On Fri, Jan 27, 2012 at 6:30 AM, Randell Jesup <randell-ietf@jesup.org> wrote:

> My apologies up front for the length...  I just worry this is a rabbit
> hole that the entire working group could go into and never return... :-)
> Perhaps I'm wrong, of course, or perhaps there's something very tightly
> defined and limited we should do - I'm open to convincing on this.
>
>
> On 1/26/2012 5:45 PM, Justin Uberti wrote:
>
>>
>>
>> On Thu, Jan 26, 2012 at 7:32 PM, Randell Jesup <randell-ietf@jesup.org
>> <mailto:randell-ietf@jesup.org>> wrote:
>>
>>
>>    I'll note that creating a datastream here requires a full
>>    renegotiation of the entire peerconnection (though one assumes the
>>    other items don't change state and thus would end up being no-ops
>>    after lots of processing occurs).
>>
>>
>> Not true if JSEP is used; with JSEP you can handle this however you see
>> fit, including having a set of pre-allocated channels that you use
>> whenever, or having your own application protocol (like what you mention
>> above) that allocates channels.
>>
>
> To be honest, you could do the same thing in ROAP I believe.  The point
> was that being application-specific and multiplexed with the other SCTP
> data streams, they may not need any explicit handling by the negotiation
> code.
>
> As for my assertion about data channels being application-specific:
>
> The contents of a data channel are not defined in these protocols
> currently.  Even if they're included in the offers and answers, without
> additional context describing them they're inherently application-specific.
>
> This is, of course, much like how UDP & TCP ports work - without some
> other way to bind a port to a purpose (portmapper, etc) you have to use a
> fixed "well-known" port or a fixed application port, or you have to use an
> application protocol (like SDP) to define what data in what format you're
> going to put on that port.
>
> We would need an entire framework for applications to tag and describe
> their data channels (for those cases where the data format is public and
> stable), or a JSON or RPC-like self-descriptive data format.  And the
> description would have to not just be format, but probably also semantics -
> what the object represented and why and what interactivity with the data
> channel was expected.
>
> We could describe the data channel with a MIME type, which might suffice
> for cases where the data is static (a file type of some sort); that doesn't
> handle all the other cases of dynamic data channels, and in many cases
> wouldn't give you the needed context to tell the receiving party what to do
> with it.  If an app gets a data channel from a different app with text/html
> in it, what does it do with that?
>
> I don't think all this work would in the end actually add anything useful;
> I think cross-application data-channels are not a problem we should try to
> solve - at best leave that for the applications or another spec to layer on
> top of WebRTC.
>

This is not a rabbit hole I'm interested in going down, either; sorry for
not making this clear in my draft. I added data channels to signaling
because I think we need some mechanism to govern how a single app can
handle multiple channels; namely:
- when a new data channel is created, how are the remote app and user-agent
notified of this fact?
- how are data channels correlated between local and remote sides (i.e. how
is 'label' communicated?)
- how are other fundamental properties of data channels (e.g.
reliable/unreliable state) communicated to the remote app and user-agent?
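
To make the three questions above concrete, here is a minimal sketch of one possible shape for such a mechanism. It is not taken from any draft: the `ChannelAnnouncement` message, its `channel-open` type string, and the `RemoteChannelRegistry` class are all hypothetical names chosen for illustration. The idea is only that the creating side sends a small description carrying the label and the reliable/unreliable flag, and the remote side records it keyed by label.

```typescript
// Hypothetical message announcing a newly created data channel to the
// remote side. None of these names come from a spec or draft.
interface ChannelAnnouncement {
  type: "channel-open";
  label: string;     // correlates the local and remote ends of the channel
  reliable: boolean; // reliable vs. unreliable delivery
}

// Serialize an announcement for transport over whatever signaling path
// the application already uses.
function announce(label: string, reliable: boolean): string {
  const msg: ChannelAnnouncement = { type: "channel-open", label, reliable };
  return JSON.stringify(msg);
}

// Remote-side bookkeeping: remembers which channels the peer has announced.
class RemoteChannelRegistry {
  private channels = new Map<string, ChannelAnnouncement>();

  // Returns the parsed announcement, or null if the input is not a
  // well-formed announcement.
  handle(raw: string): ChannelAnnouncement | null {
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return null;
    }
    if (typeof parsed !== "object" || parsed === null) return null;
    const m = parsed as Partial<ChannelAnnouncement>;
    if (
      m.type !== "channel-open" ||
      typeof m.label !== "string" ||
      typeof m.reliable !== "boolean"
    ) {
      return null;
    }
    const ann: ChannelAnnouncement = {
      type: "channel-open",
      label: m.label,
      reliable: m.reliable,
    };
    this.channels.set(ann.label, ann);
    return ann;
  }

  // Look up a previously announced channel by its label.
  lookup(label: string): ChannelAnnouncement | undefined {
    return this.channels.get(label);
  }
}
```

Nothing here says anything about the content of the channel; the announcement only covers existence, label, and reliability.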

>
> So for me, the one remaining argument is multi-party via a central hub,
> using the same app.  ("Same app" to me also means different apps that
> explicitly know about each other and how to use the data protocols with
> that app.)

> In some cases, the application protocol used would handle data channels in
> a star configuration by having the central node process them and distribute
> data over a single/set of data channels with each user.  A typical example
> would be a game server that processes incoming events and distributes state
> changes and events to each player.
>

Right; we can assume that this case is necessarily application-specific.

>
> In other cases, the hub might proxy the data channels from each
> participant through to the others.  An example of this might be a
> conference where a participant shared a document with the other
> participants.  In this case, the hub would need to take an incoming data
> channel and duplicate it to a data channel with each of the other
> participants.  This data channel opened with each of the other participants
> would likely be identical to the one the sender opened to it, except that
> the hub would likely want to also inform each receiver as to the origin of
> the data.


Right; one way of doing this would be to allocate a distinct data channel
to represent each remote participant (like we would do with MediaStreams),
using label or a similar technique to associate them.
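
As a rough sketch of that idea, assuming a purely hypothetical convention where the hub prefixes the forwarded channel's label with the originating participant's id (the `fanOut` and `originOf` helpers and the `/` separator are illustrative, not proposed syntax):

```typescript
// Hypothetical description of a data channel: just a label and a
// reliability flag, with no knowledge of the channel's content.
interface ChannelSpec {
  label: string;
  reliable: boolean;
}

// A channel the hub should open toward one receiving participant.
interface ForwardedChannel {
  to: string;
  spec: ChannelSpec;
}

// For a channel opened by `sender`, compute the channels the hub opens to
// every other participant. The origin is encoded in the label, here as
// "<sender>/<original label>" (an assumed convention).
function fanOut(
  sender: string,
  incoming: ChannelSpec,
  participants: string[]
): ForwardedChannel[] {
  return participants
    .filter((p) => p !== sender)
    .map((p) => ({
      to: p,
      spec: {
        label: `${sender}/${incoming.label}`,
        reliable: incoming.reliable,
      },
    }));
}

// Receiver side: recover the originating participant from a forwarded
// label, or null if the label carries no origin prefix.
function originOf(label: string): string | null {
  const i = label.indexOf("/");
  return i > 0 ? label.slice(0, i) : null;
}
```

The hub never inspects the data itself; it only copies the reliability setting and rewrites the label so each receiver can tell who the channel came from.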


> I'll blindly assert that this notification should be done through
> application-specific logic unless, again, we want to get deeply into
> describing the data.
>

I think this could be done in a generic manner using the mechanisms I
propose to solve the basic multiple channel issues above.


>
>
>>
>>    There are alternatives that would not require renegotiations of the
>>    peerconnection, as mentioned in a previous message thread titled
>>    "Data channel setup and signalling", starting 11/15/2011, or would
>>    only *require* negotiating one channel and use an
>>    application-specific protocol over that one to open additional ones.
>>
>>    http://www.ietf.org/mail-archive/web/rtcweb/current/msg02861.html
>>
>>
>>    I'll repeat one bit from that which is relevant:
>>
>>        The other option is to have (some) data channels be separate
>>        from media, in particular app-specific anonymous data channels.
>>        There's no requirement for describing the channels if they're
>>        private to the app, at least to the first approximation.  An app
>>        could pre-define data channel 3 as a private message structure
>>        for game map updates, so long as it knows it's talking to itself.
>>
>>
>>
>>    Unless there's some higher-level structure imposed on the data
>>    somehow, data channels are inherently application-specific, so
>>    generic stream specifications and "negotiation" of them actually
>>    buys you very little (except *maybe* the first one).  This is
>>    different from audio and video streams and their encodings.
>>
>>
>> I thought about this a lot, but I think we probably want to impose
>> higher-level structure, for two reasons:
>> 1) Multiuser. Hasn't been talked about too much in the context of data
>> channels, but without getting into too many details I think this is a
>> critical use case. In multiuser, you need to be informed of what
>> channels are available, changes as new channels become available, what
>> their reliable/unreliable status is, what user they correspond to, etc.
>> I admit that much of this could be done completely by the application,
>> despite being inconsistent with the handling of audio and video, but
>> there is an important exception:
>> 2) 3rd party backend services. WebRTC is already spawning a number of
>> startups creating AWS-style signaling, gatewaying, and conferencing
>> services. For conferencing, the service will need to know the details of
>> how multiple streams are configured within a WebRTC session, as well as
>> the wire protocol, so that it can route the data properly without
>> needing special cases for each application (just like audio and video).
>> This implies we need to specify a way to negotiate the stream
>> configuration.
>>
>
> This is basically saying (if I understand correctly) that people are/will
> be creating generic conferencing and interconnect hubs/gateways, and that
> these need to understand the data channels in order to deal with connecting
> them in a multi-party conference.  For example, the "share a file" example
> above.
>
> My suggestion for this case is that the data channels provided by an app
> calling into these would need to somehow be tagged with a
> known-to-the-conference identifier, which would then know what to do with
> the datastream.
>
> The best usecase for this I can think of is a heterogeneous conference
> server with a mix of clients, where data channels offered by a user would
> be proxied (with added info as to the source) to each of the other users,
> who would only process it if they understood the data
> format/description/purpose.  Same-apps in the conference might easily do
> so, while different-apps might well ignore such data channels unless in a
> "well-known" format/tag - which means defining those formats/tags. Again, I
> think instead of pulling all of that into this effort, leave that effort to
> the applications to sort out and standardize separately if they wish.
>
> Maybe finding a standard way to define the MIME type of the stream when it's
> opened is the best, minimal solution, with anything complex or
> application-defined or needing context being opaque binary.  (Even
> standardizing identifying the source is a bit tricky in this context.)
>
>
>
>> I'm certainly not dead-set on this approach, but as we've started
>> building this stuff out, I'm finding that in many respects data has the
>> same needs as audio and video.
>>
>
> Ok, but we have a well-set language and semantics for those (SDP + RFCs
> defining what an audio/PCMU or audio/Opus, etc. type means - data format,
> negotiation semantics, etc, etc) that we're leveraging.  We don't have that
> for "data", and I worry trying to go down that path leads to a long, long
> period defining data interop & description methods, with minimal payback.
> I'm not saying it wouldn't be nice to have that, especially for the
> heterogeneous conf case above, but I think that could be defined or
> de facto standardized in the apps and servers.
>
>
I think there's a middle ground where we just inform the other side of the
'pipes' that we have, and the content of those pipes is irrelevant.
Received on Monday, 30 January 2012 01:06:02 UTC