[minutes] October 14 2021


The minutes of our call today are available at:
(can you spot the new experimental feature of the IRC-based minutes

and copied as text below.


                  WebRTC October 2021 Virtual Interim

14 October 2021

   [2]IRC log.

      [2] https://www.w3.org/2021/10/14-webrtc-irc


          BernardA, BrianBaldino, Carine, CullenJennings, Dom,
          EladAlon, Guido, Harald, Jan-Ivar, PatrickRockhill,
          TimP, Youenn


          bernard, harald, jan-ivar



         1. [3]https://lists.w3.org/Archives/Public/www-archive/
         2. [4]The Streams Pipeline Model (Youennf)
         3. [5]Alternative Mediacapture-transform API
         4. [6]Mediacapture Transform API
         5. [7]Wrap up and next steps

      [3] https://www.w3.org/2021/10/14-webrtc-minutes.html

Meeting minutes
       Slideset: [8]https://lists.w3.org/Archives/Public/
       [ [9]Slide 8 ]
       Bernard: [reviewing agenda]


The Streams Pipeline Model (Youennf)
       [ [10]Slide 9 ]
       [ [11]Slide 10 ]
       Youenn: this presentation is about topics and issues we
       discussed with Jan-Ivar when we explored using Streams for
       media pipelines
       … goal is to identify blocking issues when looking at
       adopting streams for media pipelines
       [ [12]Slide 11 ]
       Youenn: media pipelines connect sources with sinks
       … sources are readablestreams and sinks writablestreams
       … we would want to go from camera to network just using
       … I'll be focusing only on video pipelines
       … and we'll look at threads and intersection between frames
       and @@@
       [ [13]Slide 12 ]
       Youenn: dealing with realtime media is better done off the
       main thread
       … in the Web Audio API, the graph is done in the main
       thread but the processing is done in a dedicated audio
       … in our case, there is no dedicated thread
       … the safest assumption is to assume the video frames flow
       where they're set up
       [ [14]Slide 13 ]
       Youenn: example 1 is a funny hat example using pipeThrough
       and pipeTo
       … it's not clear where the video frames would flow in terms
       of thread
       … the assumption would be that it runs in the same thread
       where these operations are being called
       … example 2 uses a JS transform
       … example 3 uses a tee - it makes it very unclear where it
       would be run, whether the UA would optimize it or not
       … so the safest assumption, with streams being a generic
       mechanism, is to assume same-thread
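       [Illustrative sketch, not shown at the meeting: a minimal
       pipeThrough/pipeTo pipeline in the spirit of the "funny hat"
       example. FakeFrame, cameraLikeSource, addHat and runPipeline
       are hypothetical stand-ins for a real camera track and
       VideoFrame. Nothing here asks for a worker, so under the
       same-thread assumption every step runs in the context that
       built the pipeline.]

```typescript
// Stand-in for a video frame; real code would use VideoFrame.
interface FakeFrame {
  id: number;
  hat: boolean;
}

// Source: emits `count` fake frames, then closes.
function cameraLikeSource(count: number): ReadableStream<FakeFrame> {
  let next = 0;
  return new ReadableStream<FakeFrame>({
    pull(controller) {
      if (next < count) {
        controller.enqueue({ id: next++, hat: false });
      } else {
        controller.close();
      }
    },
  });
}

// Transform: the "funny hat" step, executed where the stream lives.
function addHat(): TransformStream<FakeFrame, FakeFrame> {
  return new TransformStream<FakeFrame, FakeFrame>({
    transform(frame, controller) {
      controller.enqueue({ ...frame, hat: true });
    },
  });
}

// Sink: collects whatever reaches the end of the pipeline.
async function runPipeline(count: number): Promise<FakeFrame[]> {
  const seen: FakeFrame[] = [];
  const sink = new WritableStream<FakeFrame>({
    write(frame) {
      seen.push(frame);
    },
  });
  await cameraLikeSource(count).pipeThrough(addHat()).pipeTo(sink);
  return seen;
}
```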
       [ [15]Slide 14 ]
       Youenn: one potential related idea is to transfer the
       stream to a worker
       … this requires optimizations that are not standard and
       hard to expose to Web developers
       … the current implementation in Chrome is also not
       … it's really hard to predict whether the optimization will
       kick in or not
       [ [16]Slide 15 ]
       Youenn: a few examples - example 1 is the typical example
       where chrome will optimize after a stream transfer
       … in example 2 - not clear whether optimization will happen
       … in example 3 - also unclear
       … and again in example 4, when using non-camera streams
       … let's say you transfer an MST to another frame, and then
       take a stream transferred to a worker - will it be
       optimized? as a developer, you can never know
       … as opposed to Web Audio that gives very clear spec'd
       [ [17]Slide 16 ]
       Youenn: streams are a generic tool designed for flexibility
       - we can't guarantee performance
       … we can give that guarantee with transferable
       … this avoids the issues associated with streams
       when dealing with realtime streams
       … additional optimizations can still happen as a bonus, but
       they're no longer a pre-requisite
       [ [18]Slide 17 ]
       Youenn: buffering with streams happens at each transform
       step in the media pipeline
       … a typical pipeline is like the one at the top, with
       greedy processing
       … but in cases you don't want to process all frames, e.g. a
       1-second old frame might be better skipped
       … as does mediastreamtrackgenerator
       … the second pipeline illustrates sequential processing
       which can be beneficial
       … I think that's a safer approach
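       [Illustrative sketch, not shown at the meeting: one way to
       skip stale frames instead of queuing them is a single-slot
       mailbox that keeps only the newest frame and closes the one
       it replaces. LatestOnlySlot, FakeFrame and demoLatestOnly
       are hypothetical names; a real pipeline would hold
       VideoFrame objects.]

```typescript
// Anything that must be released explicitly, like a VideoFrame.
interface Closable {
  close(): void;
}

// Holds at most one pending frame; a newer frame replaces (and closes) the older one.
class LatestOnlySlot<T extends Closable> {
  private pending: T | undefined;
  dropped = 0;

  put(frame: T): void {
    if (this.pending !== undefined) {
      this.pending.close(); // release the stale frame right away
      this.dropped++;
    }
    this.pending = frame;
  }

  // The consumer takes ownership of the frame and must close it.
  take(): T | undefined {
    const frame = this.pending;
    this.pending = undefined;
    return frame;
  }
}

class FakeFrame implements Closable {
  closed = false;
  constructor(readonly id: number) {}
  close(): void {
    this.closed = true;
  }
}

// Three frames arrive while the consumer is busy; only the last one survives.
function demoLatestOnly(): { taken: number | undefined; dropped: number } {
  const slot = new LatestOnlySlot<FakeFrame>();
  for (let id = 0; id < 3; id++) {
    slot.put(new FakeFrame(id));
  }
  return { taken: slot.take()?.id, dropped: slot.dropped };
}
```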
       [ [19]Slide 18 ]
       Youenn: this is a real issue; VideoFrames are big and scarce
       … it's also unclear for web developers what happens;
       buffering is hidden from them
       … issue-1158 is where this is being described - there is
       probably a solution that will emerge
       … but it's unlikely that the default behavior will be the
       safe behavior for stream of frames
       [ [20]Slide 19 ]
       Youenn: in general for streams, the idea is that
       backpressure will deal with buffering
       … but for us, some limited buffering might be useful to
       … but it's hard to deal with WHATWG streams
       … the stream queue is opaque to the application by design
       … and the queuing strategy is very static, based on the
       high-water mark
       … updating the strategy requires resetting your pipeline
       … WHATWG streams might be able to cover the use case, but
       with complexity
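       [Illustrative sketch, not shown at the meeting: the
       high-water mark is fixed when the stream is constructed,
       and the source only sees the queue through desiredSize.
       desiredSizes is a hypothetical helper; it relies on start()
       running synchronously inside the ReadableStream constructor
       and on the default chunk size of 1.]

```typescript
// Records controller.desiredSize before and after each enqueue.
function desiredSizes(highWaterMark: number, enqueueCount: number): (number | null)[] {
  const sizes: (number | null)[] = [];
  new ReadableStream<number>(
    {
      start(controller) {
        sizes.push(controller.desiredSize); // empty queue: equals highWaterMark
        for (let i = 0; i < enqueueCount; i++) {
          controller.enqueue(i);
          // desiredSize = highWaterMark - queued chunks; it may go negative,
          // which signals backpressure but does not stop the enqueue.
          sizes.push(controller.desiredSize);
        }
      },
    },
    { highWaterMark }, // there is no setter to change this later
  );
  return sizes;
}

// desiredSizes(2, 3) returns [2, 1, 0, -1]
```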
       [ [21]Slide 20 ]
       Youenn: Tee is the typical way to allow multiple consumers
       with streams
       … tee is part of the design of the API so we should support
       [ [22]Slide 21 ]
       Youenn: but we know tee is broken when used with our
       videoframes stream
       … structured clone might solve this, as suggested in issue
       … but the default behavior again won't be the right one for
       [ [23]Slide 22 ]
       Youenn: but even with structured clone, more changes are
       … if you apply structuredClone, you add hidden buffering
       … if the two branches don't consume data at the same pace
       … issue 1157 discusses this - so far, no clear solution to
       … streams by design aren't made to drop items
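       [Illustrative sketch, not shown at the meeting: with tee(),
       every chunk pulled for one branch is also queued for the
       other, so a branch that is not read accumulates the whole
       stream. teeWithIdleBranch is a hypothetical helper and
       plain numbers stand in for frames.]

```typescript
async function teeWithIdleBranch(
  frameCount: number,
): Promise<{ produced: number; fastRead: number; queuedForSlow: number }> {
  let produced = 0;
  const source = new ReadableStream<number>({
    pull(controller) {
      if (produced < frameCount) {
        controller.enqueue(produced++);
      } else {
        controller.close();
      }
    },
  });
  const [fast, slow] = source.tee();

  // Drain only the fast branch; nobody is reading the slow one yet.
  const fastReader = fast.getReader();
  let fastRead = 0;
  while (!(await fastReader.read()).done) {
    fastRead++;
  }

  // The source is already exhausted, so everything read here had been
  // sitting in the slow branch's internal queue.
  const slowReader = slow.getReader();
  let queuedForSlow = 0;
  while (!(await slowReader.read()).done) {
    queuedForSlow++;
  }

  return { produced, fastRead, queuedForSlow };
}
```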
       [ [24]Slide 23 ]
       Youenn: the last issue I want to discuss is lifetime
       … streams rely on garbage collection, whereas we don't want
       to rely on GC for videoframe
       … there is no easy way to enforce who will close a
       VideoFrame, making it error prone for Web developers
       … there is no API contract, so unclear how to solve this
       <hta> +1
       Youenn: maybe a dedicated subclass with built-in memory
       … but no work has started in that direction
       … if you look at the pipeline - if you change the pipeline,
       you need to cancel streams
       … these streams might have buffer, which raises the
       question of GC again
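       [Illustrative sketch, not shown at the meeting: one possible
       ownership convention is that each step closes the frame it
       received once it has produced its output, including on
       error, and the sink closes the last frame. FakeFrame,
       runStep and demoLifetime are hypothetical; the streams API
       itself enforces no such contract.]

```typescript
// Stand-in for VideoFrame that counts how many frames are still open.
class FakeFrame {
  static openCount = 0;
  private closed = false;

  constructor(readonly label: string) {
    FakeFrame.openCount++;
  }

  close(): void {
    if (!this.closed) {
      this.closed = true;
      FakeFrame.openCount--;
    }
  }
}

// Runs one processing step and releases the input unless it was passed through.
function runStep(input: FakeFrame, step: (frame: FakeFrame) => FakeFrame): FakeFrame {
  let output: FakeFrame | undefined;
  try {
    output = step(input);
    return output;
  } finally {
    // Also reached when step() throws: output is undefined, so input is closed.
    if (output !== input) {
      input.close();
    }
  }
}

// Camera frame -> crop -> hat -> sink; returns how many frames are left open.
function demoLifetime(): number {
  const captured = new FakeFrame("camera");
  const cropped = runStep(captured, (f) => new FakeFrame(f.label + "+crop"));
  const hatted = runStep(cropped, (f) => new FakeFrame(f.label + "+hat"));
  hatted.close(); // the sink is the last owner
  return FakeFrame.openCount;
}
```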
       [ [25]Slide 25 ]
       Youenn: we need to solve these issues, buffering, tee and
       lifetime management for VideoFrame
       … there has been progress, but more is needed and it's
       unclear to me how far we can go
       [ [26]Slide 27 ]
       Youenn: having a high level of confidence that these issues
       can be solved before picking it as our model for designing
       our APIs
       … if we select streams, we should extend support for them
       in existing and new API (e.g. videodecoder/encoder,
       … this doesn't seem to be part of the plans for e.g.
       Jan-Ivar: a couple of comments
       … on backpressure, I believe a transformstream with a
       highwatermark of 0 will automatically apply backpressure
       … wrt dynamic buffering, highwatermark is indeed static,
       but dynamic buffering can be dealt with a transformstream -
       but not with a high water mark of 0
       Youenn: I'm not optimistic of seeing the problem solved at
       the source level
       … my understanding with lifetime management is that there
       is no API contract
       … you don't know if close will be called; I like
       … memory management would be something we would want to
       design carefully
       <hta> I intended to write q+
       Bernard: in the current model where we don't have
       Youenn: the camera pool might have 10 video frames; with a
       5-step pipeline, 5 frames will be automatically allocated
       - this leaves only 5 remaining slots which might not be
       … and some devices might have a smaller buffer of frames
       … which will create variable framerates
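       [Illustrative sketch, not shown at the meeting: the
       arithmetic behind the pool example, assuming each pipeline
       step holds exactly one frame. FramePool and
       freeSlotsWithBusyPipeline are hypothetical; real camera
       pools are not exposed to script.]

```typescript
// A fixed-size pool of frame slots, as a capture device might have.
class FramePool {
  private inUse = 0;

  constructor(readonly capacity: number) {}

  // Returns false when no slot is left, i.e. the capture would be dropped.
  tryAcquire(): boolean {
    if (this.inUse >= this.capacity) {
      return false;
    }
    this.inUse++;
    return true;
  }

  release(): void {
    if (this.inUse > 0) {
      this.inUse--;
    }
  }

  get free(): number {
    return this.capacity - this.inUse;
  }
}

// Each pipeline step buffers one frame; what is left for the camera?
function freeSlotsWithBusyPipeline(poolSize: number, steps: number): number {
  const pool = new FramePool(poolSize);
  for (let i = 0; i < steps; i++) {
    pool.tryAcquire();
  }
  return pool.free;
}

// freeSlotsWithBusyPipeline(10, 5) returns 5
// freeSlotsWithBusyPipeline(4, 5) returns 0 (a smaller pool is exhausted)
```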
       bernard: the lack of streams integration in webcodecs
       creates two queues that need to be managed
       … and that's not particularly transparent, something you
       have to keep track of
       … this can create significant memory management issues
       … wrapping streams is not particularly satisfactory in our
       Harald: a couple of observations
       … webcodecs did have a stream-based API for a while; MSTP
       and MSTG were the reason they got dropped
       … we've had very few people reporting problems with these
       … my impression is that the Stream model has been somewhat
       confused with the stream shim implementation
       … we should have a clean model where issues are moved to
       implementations, not the model
       … wrt tees, I have some experience with reading the CL that
       added tee to the spec
       … worries were expressed that are very similar to ours
       … tee is a bad design
       … it's fairly easy to write your own JS to get the tee you
       want, which is quite dependent on your app
       … tee doesn't respect the high water mark on down stream -
       tee is bad
       … on the contract point, I think it's natural to say that
       downstream either has to call close, or pass it to
       something that will call close on VideoFrame
       … we shouldn't depend on upstream to do anything
       … we do have an issue with disrupted pipeline - that needs
       to be solved
       … my conclusion is that some of these issues are with the
       description more than implementations, and some are issues
       we need to solve but aren't fatal
       … like tee - it's not because it's possible to use it badly
       that we shouldn't use streams
       … the streams API is superior to callbacks because it avoids
       re-doing it all
       Youenn: I agree with you that tee is bad - salvaging it
       will be difficult
       … doing one's tee in JS is indeed better - but you'll end
       up using promise-based callbacks
       … but if so, why use streams?
       … re other issues not being fatal, I would welcome
       proposals that address these concerns
       … at the moment, I'm not confident we can proceed with
       confidence that streams is a good enough match
       … if they can be solved, I agree that streams are appealing
       Jan-Ivar: all these issues filed on github are with the
       … they're not necessarily huge though, and I'm not sure we
       should block on them
       … given that one API is already shipping, I think we need
       to converge on a standard sooner rather than later
       Youenn: I'd be interested in getting a pro/cons comparison
       of promise callbacks vs streams


Alternative Mediacapture-transform API
       [ [27]Slide 30 ]
       jib: today, the realtime media pipeline is off main thread
       [ [28]Slide 31 ]
       jib: that remains true in webrtc-encoded-transform
       … the original chrome API was on main thread, but we then
       converged on a standardized API off the main thread
       … this was important for encoded media, all the more so for
       raw media
       [ [29]Slide 32 ]
       jib: the premise here is that the main thread is bad -
       "overworked & underpaid" as surma qualified during a chrome
       dev summit in 2019
       … surma highlighted webworkers as the solution to that
       … contention on the main thread is common and unpredictable
       … and hard to detect outside of a controlled environment -
       as opposed to web workers
       [ [30]Slide 33 ]
       jib: when webcodecs made the decision to expose the API on
       the main thread, they based this on non-realtime media use
       … and they strongly encourage doing realtime processing off
       the main thread
       [ [31]Slide 34 ]
       jib: we have a non-adopted document
       "mediacapture-transform" (which has shipped in Chrome 94
       despite not being standardized)
       … my position is that this proposal is not satisfactory
       because it exposes realtime pipeline on main thread by
       default, it doesn't encourage use in workers, relies on
       non-standardized optimizations
       … also, now mediastreamtrack is transferable so this
       creates new opportunities
       [ [32]Slide 35 ]
       [ [33]Slide 36 ]
       jib: having to ask the main thread all the time to interact
       with the API makes sense
       … it's baked in the assumption of main thread
       hta: that's untrue
       [ [34]Slide 37 ]
       jib: for a processed (e.g. background replacement)
       self-view use case combined with webtransport
       … tee, clone, postMessage(constraints) aren't good
       … whereas with track available in a worker, we have a
       natural API
       [ [35]Slide 38 ]
       jib: the tunnel semantics of WHATWG streams are not meant
       to solve creating streams on the wrong realm
       … MSTP is built on broken assumptions
       [ [36]Slide 39 ]
       jib: I have an alternative proposal based on transferable
       … the proposal focuses on video at the moment
       … it encourages use on workers
       … it still uses streams, despite youenn's identified issues
       - which I think we can find solutions for
       [ [37]Slide 40 ]
       jib: we expose a readable attribute in a worker version of
       the MediaStreamTrack
       … this keeps data off the main thread
       [ [38]Slide 41 ]
       jib: a more complicated example, read & write
       … this is the equivalent of mediastreamtrackgenerator
       … we expose only on workers a new VideoTrackSource
       … the example is a crop example inspired from WebCodecs
       … it aligns better with the separation of source and track of
       the mediacapture-streams spec
       … it interacts well with clone and structured cloning
       [ [39]Slide 42 ]
       jib: for any video processing, you have a self-view (with
       high framerate) and a low-fps to send on the network
       … applyConstraints works well with a peerconnection
       [ [40]Slide 43 ]
       jib: now with WebTransport, using track cloning
       … this shows native downscaling with applyConstraints as a
       workaround to using tee
       … not clear how MSTG would let you do this via a worker
       [ [41]Slide 44 ]
       jib: benefits: simpler API taking advantage of transferable
       tracks, with fewer APIs to learn
       … doesn't block real-time media pipeline by default
       … it has parity with MSTP & MSTG features
       … similar in terms of brevity
       … doesn't rely on UA optimizations
       … and deals with muted sources
       [ [42]Slide 45 ]
       jib: Bonus: if we want promise callbacks for stream-based,
       you can use "for await" on the stream
       [ [43]Slide 46 ]
       jib: if you want more than a readable - this can be done
       with cloning, but we could also provide dedicated surface
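       [Illustrative sketch, not shown at the meeting: consuming a
       stream with "for await". Native async iteration of
       ReadableStream is not available in every runtime, so this
       sketch wraps a reader in an async generator. chunks and
       sumChunks are hypothetical helpers, and numbers stand in
       for frames.]

```typescript
// Adapts any ReadableStream to an async iterable using a reader.
async function* chunks<T>(readable: ReadableStream<T>): AsyncGenerator<T> {
  const reader = readable.getReader();
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) {
        return;
      }
      yield result.value;
    }
  } finally {
    // Runs on normal completion and when the consumer breaks out early.
    reader.releaseLock();
  }
}

// Promise-style consumption: one chunk per loop iteration.
async function sumChunks(readable: ReadableStream<number>): Promise<number> {
  let total = 0;
  for await (const value of chunks(readable)) {
    total += value;
  }
  return total;
}
```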
       Harald: I kind of like the proposal - it's almost totally
       equivalent to MSTG and MSTP
       … the examples where you have posting messages to the main
       thread - MSTG and MSTP are designed to be available to the
       same contexts where tracks are
       … MSTG and MSTP will need to be available on workers when
       MST are
       … in terms of quoting Chris Needham on the Web Codecs
       decision - one of the motivations for main thread is the
       availability of other APIs on the main thread
       … transferring streams as a pipeline between origin and
       destination context - it assumes the source is main thread,
       but that's not true
       … with a camera, the source of the stream is the camera,
       not the main thread
       … otherwise, I like the shape of the API; it's very similar
       to what I proposed
       jib: I didn't mean to misrepresent these aspects; I see now
       that MSTG and MSTP are available in workers
       … but they're not transferable
       … so they would have to be created in the worker?
       harald: yes
       youenn: re slide 37
       … re not using tee because it's bad - I agree, but I hope
       we should be able to use it
       … with the example in slide 37, we lose back pressure
       … we might be able to add it back
       … in general, in terms of API shape, if we assume that we
       use streams, this is a good shape that solves some of the
       issues that I had with the prior proposal
       … in general, mediacapture-main has concepts of source and
       … having a JS object that represent the source is a good
       … similar to a readablestream that can be native or a JS
       … I think we should go there, will make it easier to extend
       the API and remove edge cases
       … I would prefer not to rely on transferable streams, but
       instead rely on transferable MST
       … which creates a typed way of transferring that can help
       fulfill the requirements we need
       jib: my example may have a mistake on which track to clone
       - would flipping it around fix backpressure?
       youenn: I don't think so
       … introducing backpressure on the writablestream might do
       the trick
       harald: backpressure cannot deal with framerate
       [ [44]Slide 47 ]
       jib: tee can help with backpressure, at the cost of tee
       … the only thing odd is the "createFrameDropper", a
       transform stream to drop frames
       … clone/applyConstraint is a work around if we can't solve
       the tee problem
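       [Illustrative sketch, not shown at the meeting: one possible
       shape for a frame-dropping transform stream. Only the name
       createFrameDropper comes from the slide; the keepEvery
       parameter, FakeFrame and demoFrameDropper are assumptions
       made for this sketch. Dropped frames are closed so they do
       not wait for garbage collection.]

```typescript
// Anything that must be released explicitly, like a VideoFrame.
interface Closable {
  close(): void;
}

// Forwards one frame out of every `keepEvery` and closes the others.
function createFrameDropper<T extends Closable>(keepEvery: number): TransformStream<T, T> {
  let index = 0;
  return new TransformStream<T, T>({
    transform(frame, controller) {
      if (index++ % keepEvery === 0) {
        controller.enqueue(frame);
      } else {
        frame.close();
      }
    },
  });
}

class FakeFrame implements Closable {
  closed = false;
  constructor(readonly id: number) {}
  close(): void {
    this.closed = true;
  }
}

// Sends `total` frames through the dropper and reports what happened to them.
async function demoFrameDropper(
  total: number,
  keepEvery: number,
): Promise<{ keptIds: number[]; closedCount: number }> {
  const all: FakeFrame[] = [];
  for (let id = 0; id < total; id++) {
    all.push(new FakeFrame(id));
  }
  const source = new ReadableStream<FakeFrame>({
    start(controller) {
      for (const frame of all) {
        controller.enqueue(frame);
      }
      controller.close();
    },
  });
  const keptIds: number[] = [];
  const sink = new WritableStream<FakeFrame>({
    write(frame) {
      keptIds.push(frame.id);
    },
  });
  await source.pipeThrough(createFrameDropper<FakeFrame>(keepEvery)).pipeTo(sink);
  return { keptIds, closedCount: all.filter((f) => f.closed).length };
}
```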
       bernard: slide-36 and -37 don't make sense to me
       jib: right, I wasn't aware that MSTP and MSTG would be
       available in workers
       … but you could still do this, and the situation would need
       to be handled
       … but Harald is right that there are a lot of similarities
       between the two proposals
       … the advantage is that we don't need to add a new object
       bernard: re slide 33
       … datachannels for instance is only available on the main
       … the lack of consistent API support in workers was part of
       the challenge
       jib: MSTG is a bit of an odd duck - it's also track
       … re lack of APIs, you can always transfer tracks back to
       the main thread when needed
       … this doesn't require breaking transferable streams
       Harald: if you have to tell some place upstream that your
       frame rate is 30, then backpressure can't carry that information
       … backpressure can't tell the difference between "I'm
       slightly late" and "I want only every other frame"
       … we need to be able to carry these signals
       … we haven't gotten to it yet
       bernard: there may be several stages of reporting that's
       youenn: this depends on whether sources are push or pull
       … consumers need to propagate things up to the source
       … backpressure may not always be the right mechanism, but
       we need to support it
       … I also agree we need a way to carry messages back
       … the fact that some of the APIs need to be done in the
       main thread is sad, but it still moves a lot of the heavy
       processing to workers, leaving only some of the plumbing on
       the main thread
       … there may be gaps to do good media processing - if so, we
       should make them available in workers, and this API would
       help accelerate that transition
       Guido: in addition to APIs availability on the main thread,
       we have first-hand feedback from app developers who WANT to
       do on main thread for their use cases
       … otherwise, the two APIs are equivalent beyond their shape
       dom: re use cases on main thread, is it a matter of
       developer experience?
       guido: for certain apps, adding workers in the mix is
       adding a cost, not a value
       … it only adds complexity and extra resource consumption
       jib: even if there are such use cases, we're trying to protect
       a realtime media pipeline


Mediacapture Transform API
       [ [45]Slide 50 ]
       Harald: [summarizes the API of MSTP and MSTG]
       [ [46]Slide 51 ]
       Harald: it shipped in Chrome 94, it's actually used in
       products with new features based on it
       … very few problems reported on it
       [ [47]Slide 52 ]
       Harald: we believe the threading model is something that
       app developers need to pick, with encouragement from
       platform developers
       … but dictating it is not the right approach
       … Streams are transferable objects
       … adding worker availability to MSTG and MSTP is a
       reasonable addition following the transferability of MST
       [ [48]Slide 53 ]
       Harald: we need to make sure we have samples that show
       realistic working real-time operations
       … including offthread processing
       [ [49]Slide 54 ]
       Harald: in terms of improvements, we need better control of
       adaptation source (backpressure, synchronizing streams,
       … we need to improve experience with streams that don't
       come from camera - not trivial to synchronize them
       … we can work on these aspects once we have agreed on a
       common base
       [ [50]Slide 55 ]
       Harald: the two proposals agree on Streams for frame
       … difference of opinion for availability on main thread
       … the proposals differ on whether the generator or the
       consumer extends MST or uses a separate class
       … this can be discussed
       … another difference is that MSTG/MSTP is dealing with both
       audio and video
       … where jib is focused on video only
       … clear similarities on model, and distinctions that can be
       derived in specific issues
       jib: streams are transferable, but implicit transfer of the
       source isn't web compatible and we should go away from it
       harald: my interpretation is that the stream source is NOT
       on the main thread - e.g. it's attached to the camera
       jib: the optimizations that chrome has been doing are not
       compliant with the spec AFAICT
       harald: I haven't been convinced the issue is not with the
       jib: the fact that this can't be optimized all the time
       would make this head scratching
       harald: I find the stream spec impossible to navigate -
       happy to get pointers
       jib: one of my slides covered the intent of the spec
       harald: but it relies on the interpretation that the source
       is in the main thread
       youenn: the algorithms described in the stream spec will
       need to be run in the context of the stream (not the
       … there is some leeway in the stream spec to optimize
       pipethrough et al
       … but not for the rest afaict
       … Adam Rice (stream editor) suggested a specific
       optimizable stream might be needed
       [ [51]Slide 38 ]
       jib: [quoting from the spec]
       … it's explicitly about transfer between realms
       [52]Transferable streams: the double transfer problem #1063
       jib: re exposure to main thread - for
       webrtc-encoded-transform, we agreed to focus off-thread
       harald: I have a bug open to allow re-enabling it on main
       … I think this was a bad decision

     [52] https://github.com/whatwg/streams/issues/1063

Wrap up and next steps
       [ [53]Slide 55 ]
       bernard: I would like to get a sense of the room on the
       major distinctions between the 2 proposals
       jib: would also like to get a sense on whether my proposal
       is acceptable, and under what changes
       harald: we have 2 potential starting points, I don't see
       any reason to pick one over the other
       youenn: I want to reiterate my concerns about the difficult
       stream issues that I raised and for which I'm not seeing
       dom: I think the question is about API shape
       (readable/VideoTrackSource vs MSTP/MSTG)
       Cullen: I don't feel strongly about any of these questions,
       not knowing enough about the impact on implementations
       … I would need more background to give an informed opinion
       Bernard: So, we will bring these questions to the mailing
       Dom: ... after discussions with the chairs
       Minutes manually created (not a transcript), formatted by
       [54]scribe.perl version 147 (Thu Jun 24 22:21:39 2021 UTC).

     [54] https://w3c.github.io/scribe2/scribedoc.html

Received on Thursday, 14 October 2021 17:08:07 UTC