Re: [mediacapture-image] Clarify grabFrame timing (#212)

@alvestrand

> The problem with grabbing the previous frame is that it's possible that the previous frame is no longer easily accessible (might have gone to display buffers), and may even not exist (in the case of before-first-frame).

That's not overly complicated, though. I rewrote the Chromium implementation for `grabFrame` as part of exploring #210, and really all you need to do is keep around a reference to the last frame until the next one comes in. This incurs a small memory usage penalty, but for 1080p it's just under 8 MB per frame, assuming RGBA, and only 3 MB if it's held as I420, like Chromium does it. 4K is a bit more, at 32 MB and 12 MB respectively. If that's too much to stomach, options could be added to the `ImageCapture` constructor to customize this behavior.

> Is there any implementation that grabs the previous frame? If not, I suggest we clarify the behavior to mean "next frame".

Isn't Chromium the only implementation, currently? Isn't one of the benefits of implementing early to learn this sort of stuff? I understand not wanting to change a widely implemented, widely used spec, but in this case I think it's reasonable to be more agile here since it's still new. This spec has only been published as a W3C Working Draft, it's not set in stone. Isn't that the point of the draft terminology, that it may change?

> Function name changes are breaking to the deployed base, so shouldn't be done lightly.

It wouldn't need to be a breaking change. Currently the timing of `grabFrame` is not defined in the spec, so a change to that is 'fair game'. So you could change the behavior of 'grabFrame' to be 'current frame' and add 'grabNextFrame' if that's desired functionality.

> may even not exist (in the case of before-first-frame).

This problem exists even when waiting for next frame, because there may not be a next frame, or the next frame may come after an arbitrary long amount of time. In fact, I'd consider it worse, since a stream can only have a 'before-first-frame' once (and it's easy to know this if you're tracking the last seen frame), where as a stream could have a 'no-next-frame' happen at any time. For no first frame you could easily reject until there is a first frame - you can't reject if you're waiting for a next frame which never comes, without making an arbitrary decision.

This is especially true with remote video being sent by WebRTC, or creating a stream [from a canvas](https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/captureStream), or [from a video element](https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/captureStream).

With canvas the stream only sends out a new frame when the content changes, which means if you 'miss the boat' on calling `grabFrame`, you end up with a promise which will never resolve, or could be a large amount of time later with a frame entirely different than what you expected to get.

With video, if paused, there will be no new frames on the stream, so again, you may end up with a promise that never resolves or takes a long time.

In both of those cases getting the next frame may not be the desired result, and would be confusing behavior to developers. A promise pending on grabbing a frame from a paused video will come out to be an unexpected frame if the video jumps ahead and resumes playing.

This isn't just speculation, there's actually [a developer bug for Chromium](https://bugs.chromium.org/p/chromium/issues/detail?id=972939) (which I just noticed today) which looks like it's running into this exact issue, and they're using `HTMLMediaElement.captureStream`. I've modified their reproduce case to have a 2 second timeout, which the resolved promise will cancel, and if the timeout is never canceled it will declare the promise unresolved. Doing this shows that several promises get left unresolved. Here is the code that developer was trying to use which led to a never resolved promise (trimmed for readability):

```javascript
new ReadableStream({
  async start(controller) {
    console.log("reader start");
  },
  async pull(controller) {
    if (e.track.enabled === false && e.track.readyState === "ended") {
      controller.close();
    } else {
      const frame = await imageCapture.grabFrame();
      controller.enqueue(frame);
    }
  }
})
```

There's two potential issues in that code since `grabFrame` may never resolve. First, according to the `ReadableStream` spec, it will wait on `pull` if it returns a promise, before calling pull again. Since they used `await`, the `grabFrame` promise never resolving means that `pull` will never resolve either. The second issue (which is breaking their code) is that when `grabFrame` is left unresolved there's never a call to `controller.enqueue` and so `controller.close` never gets called, which is why that developer opened the Chromium bug.

I consider the wait for next frame behavior dangerous for reasons like this. Promises that may never resolve or reject are bad form and dangerous for developers. Implementations *could* detect things like the stream muting, or ending, and reject any unresolved promise, but the spec doesn't mention this, obviously, and the Chromium implementation doesn't do it. That behavior also has its own can of worms, as promises that seem flaky and reject unexpectedly will lead to bad code to deal with that.

-- 
GitHub Notification of comment by dsanders11
Please view or discuss this issue at https://github.com/w3c/mediacapture-image/issues/212#issuecomment-512747322 using your GitHub account

Received on Thursday, 18 July 2019 09:50:37 UTC