Re: Proposal: Remove stopped transceivers to prevent resource exhaustion from Henrik Boström on 2019-02-15 (public-webrtc@w3.org from February 2019)

From: Henrik Boström <hbos@google.com>
Date: Fri, 15 Feb 2019 23:03:15 +0100
To: Jan-Ivar Bruaroey <jib@mozilla.com>
Cc: public-webrtc@w3.org
Message-ID: <CAEbRw2xPaupubw4-4bLiX-HcaqagC_RZ+WpPdpJUDkr30R16pQ@mail.gmail.com>
As always, this is more complicated than I expected.

On Fri, Feb 15, 2019 at 6:17 PM youenn fablet <yfablet@apple.com> wrote:

> On Feb 15, 2019, at 3:02 AM, Henrik Boström <hbos@google.com> wrote:
> ... This prevents the transceivers, and thus the associated
> senders/receivers and tracks, from being garbage collected. A stress test
> running Chrome that did add/removeTrack a lot ran out of memory when we
> switched from Plan B to Unified Plan due to this resulting in a few
> thousand transceivers in less than a minute. (I don't know the true
> footprint, but even if each transceiver just has a single video frame in
> its buffer of 1280x720x4, that's several megabytes, multiplied by thousands
> of transceivers => gigabytes of data).
>
>
> I agree with the concern.
> That said, implementations could probably do some clean-up when a
> transceiver is stopped so that it becomes very lightweight.
> Is there anything preventing implementations to do so?
>

Good point. The transceiver is irreversibly stopped, and its track is
irreversibly ended. I think it may be lightweight, but only if we don't
have to store the a frame on the ended track.
What happens if you attach an ended track to a video element? Does it show
an the last received frame or no frame (black)? This may be a central
question to the importance or non-importance of the resource issue (and
whether the OOM was a Chrome bug or a spec concern).


On Fri, Feb 15, 2019 at 7:06 PM Jan-Ivar Bruaroey <jib@mozilla.com> wrote:

> Thanks Henrik for a good summary! I agree we should do something, but I'd
> caution against big changes. Comments inline.
>
> On 2/15/19 6:02 AM, Henrik Boström wrote:
>
> ... TL;DR: Keeping transceivers around prevents garbage collection of a
> possibly ever-growing list of tracks. Can we stop doing that?
>
> If an application does addTrack() and removeTrack() with renegotiations
> in-between, the list of m= sections and transceivers will grow over time.
>
> This will remain true. Developers who operate with the
> addTrack()/removeTrack() symmetric mental model, will continue to
> experience this side-effect. transceiver.stop() does not rescue that mental
> model, since it stops both directions.
>
> So the short answer to your "Can we stop doing that?" question is
> "probably not".
>

Good point. I don't think we'll fix the problem for people who have the
wrong mental model. Stopping a transceiver would have to be a conscious
act, where you don't stop a transceiver that may be used for receiving.

That said, I agree we should fix the main side-effect of transceiver.stop()
> you point out: that it holds on to objects long after the underlying
> machinery it's supposed to represent (the network negotiation slot) has
> been repurposed. Transceivers were meant to be true to the underlying
> behaviors, and this fails that.
>
> ... Related issues/questions:
>
>    1. If you care about mids of stopped transceivers you'd have to record
>    them yourself.
>    2. There is no "onstopped" event in case a transceiver is stopped due
>    to setRemoteDescription(). Should we add one, or is it enough that the
>    application checks whether the transceiver still exists by querying
>    getTransceivers() in track.onmute?
>
> `mute` only fires remotely AFAIK. However, transceiver.stop() [1] does
> reliably fire `ended` on transceiver.receiver.track on both ends. [2] But
> in neither case are you given a pointer to the transceiver.
>
Yes, "onended" is the one we want not "onmute". Do you need the transceiver
pointer though? When you add a listener to "onended" you know the
transceiver already (or you can look it up in the pc using the track).
You can do this:

function onstoppedPromise(pc, transceiver) {
  return new Promise(resolve => {
    transceiver.receiver.track.onended => {
      resolve(transceiver);
    }
  });
}
pc.ontrack = e =>
onstoppedPromise(e.transceiver).then(processEndedTransceiver);

> Please discuss.
>
> I think you hit on the most confusing situation: where the other side
> stopped your transceiver.
>
> The current spec API for this is to examine the boolean
> `transceiver.stopped`, once you return to "stable". I think that works, has
> a couple of years of this working group behind it, Firefox implements it,
> and it has neither of the issues you mention.
>
> With the change you propose, the transceiver would effectively disappear
> from all lists we maintain for developers, before developers get a chance
> to react. Finding out which transceiver(s) were stopped might be hard.
> Ironically, the `transceiver.stopped` API would still exist, but developers
> would have to cache the list ahead of time to use it.
>
How about instead:
>
>    - we remove the stopped transceiver on subsequent renegotiation, when
>    its m-line is reused?—If we're clever, we may even be able to insert the
>    new transceiver that inherits the m-line, in its place, preserving the
>    index order of the other transceivers in getTransceivers().
>
> Thoughts?
>
I see the benefit of keeping the transceiver around during negotiation; it
might be nice to have an easily accessible list of transceivers that is
complete and comparable to the SDP, including stopped ones.

This is where it gets hairy:
> *If we're clever, we may even be able to insert the new transceiver that
inherits the m-line, in its place, preserving the index order of the other
transceivers in getTransceivers()*
The "new transceiver" might not be newly created, it might be a transceiver
that was created (at index *n*) and is now tied to a previous m-line, say
index *i*. Do we move the transceiver?

A lot boils down to what we want to be true about transceiver indices: what
do we want to be true?

I believe the statement "the transceiver index is the SDP m-line index" is *not
true* when both endpoints add transceivers before negotiating, and possibly
other edge cases. What *is true* is "the transceivers are listed in
insertion order". If you add in-place insertion of recycled m-sections, or
move transceivers around, you would sacrifice what was always true
(insertion order) in order to maintain support for what was sometimes true
but not always (transceiver index matches SDP). I'm not sure maintaining
this illusion is worth the complexity, especially when mid is what you
*should* use when comparing SDP or the other endpoint's set of transceivers.

Here's a third option:

   - Stopping a transceiver does not immediately remove it from the set of
   transceivers, but when the signaling state goes from any non-stable state
   to stable, all stopped transceivers are removed.

It doesn't try to make the index match (if you can't guarantee, don't try)
but it maintains insertion order, and stopping a transceiver with stop() or
setRemoteDescription(offer) does not immediately make it disappear. But if
you are the offerer, transceivers stopped by the answer would disappear
when you do setRemoteDescription(answer).

Is that better or worse?

> .: Jan-Ivar :.
>
> [1] http://w3c.github.io/webrtc-pc/#dom-rtcrtptransceiver-stop
> [2] https://blog.mozilla.org/webrtc/rtcrtptransceiver-explored/
>
Received on Friday, 15 February 2019 22:03:51 UTC