Re: SDP is not suitable for WebRTC from Harald Alvestrand on 2013-07-30 (public-webrtc@w3.org from July 2013)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Tue, 30 Jul 2013 11:37:33 +0200
To: public-webrtc@w3.org
Message-ID: <51F7895D.2020202@alvestrand.no>
On 07/30/2013 11:16 AM, Iñaki Baz Castillo wrote:
> 2013/7/30 Harald Alvestrand <harald@alvestrand.no>:
>>> Sorry for the cross-posting but at this point I'm a bit lost and do
>>> not know which is the appropriate group for my concern.
>> For API issues, it's WebRTC.
>>
>> For SDP issues, it's MMUSIC.
>
> And for API issues that exist due to SDP nature/limitations?

WebRTC.
>
> Honestly, I don't think I should go to MMUSIC and tell them that
> Plan-Unified is not good since the WebRTC API does not let me
> generate an SDP offer with multiple m=audio lines with a=recvonly. They
> will reply "that is not an issue of SDP itself but an issue of
> the WebRTC API".
>
> Anyhow, my concern is about how to design WebRTC applications given
> the current SDP-blob based API and the SDP definition (Plan-Unified)
> itself.

Good. Then please, let's separate the discussion of that from the
discussion of whether or not we should have SDP in the API.
>
>
>
>
>
>>> So my concern is:
>>>
>>>
>>> - Web application with a SIP over WebSocket client running in the web.
>> Do you really mean SIP here (which means that you've already bought into
>> using SDP and only SDP for your session descriptions), or do you mean "a
>> signalling protocol"?
> It should not matter, but yes, I meant "SIP".
>
>
>
>> I'm assuming you mean SIP below.
> OK
>
>
>
>>> - The web user is provided with a conference SIP URI in which there
>>> are *already* 8 participants (5 of them emitting audio and video and 3
>>> just emitting audio).
>>>
>>> - The user calls, from his webphone, to the given URI to join the conference.
>>>
>>>
>>>
>>> Let's imagine that the JS app knows the number of participants in the conference.
>>> Let's imagine my browser has a mic and a webcam.
>>>
>>>
>>>
>>> QUESTION:
>>>
>>> How can my browser join the conference without requiring SDP
>>> renegotiation from the server and, at the same time, being able to
>>> send audio/video and receive audio/video from others (different tracks
>>> / m=lines)?
>>>
>>>
>>>
>>>
>>> "SOLUTIONS":
>>>
>>>
>>>
>>> 1)
>>>
>>> I tell my browser to generate a SDP offer with:
>>>
>>>   - 1 send/receive m=audio line.
>>>   - 7 recvonly m=audio lines.
>>>   - 1 sendonly m=video line.
>>>   - 4 recvonly m=video lines.
>>>
>>> (Obviously this is a joke)
>
>> Given your constraints above (SIP, previous knowledge of the number of
>> active participants), what's obvious about this being a joke?
> Please, let me know how to do that (without mangling the SDP).

One way is to suggest an API change where OfferToReceiveVideo is changed
from taking a boolean to taking an integer (or defining that it takes
either, where a boolean true is interpreted as integer 1).
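
A rough sketch of that "either boolean or integer" interpretation, modelled
in Python rather than in the browser API. The helper name
normalize_offer_to_receive is hypothetical and only illustrates the mapping:

```python
def normalize_offer_to_receive(value):
    """Map a boolean-or-integer OfferToReceiveVideo-style value to a count.

    Hypothetical helper for illustration: True -> 1, False -> 0,
    and a non-negative integer is passed through unchanged.
    """
    # bool must be tested first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return 1 if value else 0
    if isinstance(value, int) and value >= 0:
        return value
    raise ValueError(
        f"expected a boolean or a non-negative integer, got {value!r}"
    )


# Existing callers that pass a boolean keep working; new callers can ask
# for several receive slots at once.
legacy_count = normalize_offer_to_receive(True)
conference_count = normalize_offer_to_receive(4)
```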

>
>
> Anyhow, as I've said in another mail: why should my browser know (before
> calling) the number of participants in the conference?

You already said that it knows..... I can't answer the question that you
didn't ask.

>
> My browser should be able to tell the conference server:
>
> - These are my audio and video tracks (2 tracks).
>
> And the server should be able to accept the "call" and reply:
>
> - OK, and these are my multiple audio and video tracks (13 tracks).
>
> And that's all. But this is NOT possible with SDP due to SDP's nature and
> limitations.

As I pointed out in scenario 4): Only if you're fixated on having a
single offer/answer and having the first offer come from the browser.
You have not given a rationale for either of those constraints.

>
>
>
>
>
>>> 2)
>>>
>>> SDP seems to allow the offer and the answer to have different numbers
>>> of m-lines (I'm not sure about that, but I believe that SDP can do
>>> "everything").
>> If you believe that, you'll have a hard time dealing with the real world.
>>
>> As far as I understand it:
>> - An answer always has the same number of M-lines as the offer
> Right. It was my fault (I wrongly understood a mail from Christer in MMUSIC).
>
>
>> - A renegotiating offer always has at least as many M-lines as the last
>> offer/answer
> Yes.
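
The two m-line rules stated above can be sketched as simple checks. This is
a minimal illustration that only counts lines starting with "m=" and does no
real SDP parsing; the function names are hypothetical:

```python
def count_m_lines(sdp: str) -> int:
    """Count the media sections (lines starting with "m=") in an SDP text."""
    return sum(1 for line in sdp.splitlines() if line.startswith("m="))


def answer_matches_offer(offer: str, answer: str) -> bool:
    """Rule 1: an answer has the same number of m-lines as the offer."""
    return count_m_lines(answer) == count_m_lines(offer)


def reoffer_is_valid(previous_offer: str, new_offer: str) -> bool:
    """Rule 2: a renegotiating offer has at least as many m-lines as before."""
    return count_m_lines(new_offer) >= count_m_lines(previous_offer)
```

Under these rules an offer with one m=audio and one m=video line cannot be
answered with more m-lines; the extra lines have to arrive in a later offer.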
>
>
>
>
>>> 3)
>>>
>>> My browser generates an SDP offer with 1 m=audio line and 1 m=video
>>> line, and the server answers with the same. Later the server sends a
>>> re-INVITE with all the m-lines.
>>>
>>> Oops, SDP renegotiation...
>
>> And why is that a problem, exactly?
> Forcing renegotiation when there is no need for that in a well
> designed and modern media signaling protocol (i.e. any custom media
> signaling protocol a JS developer could create on top of a real JS
> Object based API for WebRTC).

Actually I suspect that the need for a (minimum) 3-way handshake is
fundamental to the problem you're posing, so double offer/answer is just
half a round-trip more expensive than the ideal case.
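
The arithmetic behind "half a round-trip", as a small sketch. It assumes
that every signalling message is a strictly sequential one-way trip and that
two one-way trips make one round trip; both are simplifying assumptions:

```python
def round_trips(one_way_messages: int) -> float:
    """Convert a count of sequential one-way messages into round trips."""
    return one_way_messages / 2


# Minimum 3-way handshake: three one-way messages.
three_way_handshake = round_trips(3)

# Double offer/answer: offer, answer, second offer, second answer.
double_offer_answer = round_trips(4)

# Extra cost of the double offer/answer over the 3-way handshake.
extra_cost = double_offer_answer - three_way_handshake
```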

>
>
>
>
>> And you did not include the scenario I'd prefer if I was operating
>> within the limitations you mention (SIP over WebSockets):
>>
>> 4)
>>
>> The *server* generates an SDP offer with 8 m=audio lines and 4 m=video
>> lines.
>> One of each is sendrecv. The browser answers. End of story.
> No, I want to call the conference SIP URI from my webphone
> application (and not the reverse).
>
> I hope WebRTC application design is not so constrained by the current
> SDP-blob based API, and that I can design my own application in which
> the client initiates the call without that meaning that I later need
> SDP O/A renegotiation.

Why?
You already said that it's your application, it's not interoperating
with anything existing. And the client already initiated the call (by
loading the page and setting up the WS). Why do you insist that it
should also be the one making the SIP OFFER?
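
The shape of scenario 4) quoted above can be sketched as follows. Media
sections are modelled as (kind, direction) pairs rather than real SDP.
Marking the remaining sections as sendonly from the server side is an
assumption, since the scenario only says that one of each is sendrecv. The
function names are hypothetical:

```python
# Direction the answerer uses for each direction in the offer.
MIRROR = {
    "sendrecv": "sendrecv",
    "sendonly": "recvonly",
    "recvonly": "sendonly",
    "inactive": "inactive",
}


def server_offer(audio: int, video: int):
    """Build the server's offer: first section of each kind is sendrecv."""
    sections = []
    for kind, count in (("audio", audio), ("video", video)):
        for index in range(count):
            # Assumption: every section after the first only carries media
            # from the server to the browser.
            direction = "sendrecv" if index == 0 else "sendonly"
            sections.append((kind, direction))
    return sections


def browser_answer(offer):
    """Answer with the same number of sections, directions mirrored."""
    return [(kind, MIRROR[direction]) for kind, direction in offer]


offer = server_offer(audio=8, video=4)  # 12 media sections in total
answer = browser_answer(offer)          # one offer, one answer
```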

>
>
>
>> I don't know how people get so hung up on the entity joining the call
>> being the one to send the SDP offer; there's nothing in the protocol
>> requiring that, and just the loading of the Web page and opening of the
>> WS connection has already caused multiple round trips between the
>> browser and the server.
> If I want to do arbitrary round trips I will do them, but let me
> choose where and how. Don't mandate me to do that with SDP just
> because SDP or Plan-Unified is not suitable for the common scenario I
> show above.

Then you need to stop talking about using SIP. SIP constrains you much
more than SDP alone does.

>
>
>>> SDP is bad for WebRTC. SDP is good for legacy symmetric communications
>>> in which there is a single-track audio communication and, of course,
>>> both endpoints emit audio. But SDP is bad for modern RTC protocols in
>>> which an endpoint can emit tons of tracks to a single endpoint.
>>>
>>>
>>> Do we really want this for WebRTC 1.0 ?
>> I see many issues with the use of SDP.
>>
>> One of the requirements is "we have to be able to produce and consume
>> SDP from the data available on the API".
> That's a very good argument in favour of SDP: "Based on a bad decision
> taken years ago, WebRTC must deal with SDP and thus we must deal with
> SDP now, and then it is good to have SDP since there is a requirement
> mandating it".
>
That's not what I said. Thanks for not listening.

>
Received on Tuesday, 30 July 2013 09:38:37 UTC