- From: François Daoust <fd@w3.org>
- Date: Wed, 14 Dec 2022 15:22:28 +0000
- To: "public-media-wg@w3.org" <public-media-wg@w3.org>
Hi all,

The minutes (and slides) of this week's Media WG call are available at:
https://www.w3.org/2022/12/13-mediawg-minutes.html

... and copied as raw text below.

Thanks,
Francois

-----

Media WG Teleconference - 2022-12-13

13 December 2022

[2]Agenda. [3]IRC log.

   [2] https://github.com/w3c/media-wg/blob/main/meetings/2022-12-13-Media_Working_Group_Teleconference-agenda.md#agenda
   [3] https://www.w3.org/2022/12/13-mediawg-irc

Attendees

Present
   Alastor Wu, Bernard Aboba, Chris Needham, Dale Curtis, Eric Carlson,
   Francois Daoust, Frank Liberato, Harald Alvestrand, Jer Noble,
   Matt Wolenetz, Peter Thatcher, Sushanth Rajasankar, Youenn Fablet

Regrets
   -

Chair
   -

Scribe
   cpn, tidoust

Contents

    1. [4]ITU-T SG16 Liaison statement on WebCodecs
    2. [5]WebKit update on Audio focus/audio session API
    3. [6]Consistent SVC metadata between WebCodecs and Encoded Transform API
    4. [7]Media Pipeline architecture - Media WG input and WebRTC collaboration planning

Meeting minutes

ITU-T SG16 Liaison statement on WebCodecs

cpn: We received an incoming liaison statement from ITU-T SG16.

[8]https://github.com/w3c/media-wg/blob/main/liaisons/2022-10-28-itu-t-sg16.md <- Draft reply

   [8] https://github.com/w3c/media-wg/blob/main/liaisons/2022-10-28-itu-t-sg16.md

cpn: Around WebCodecs, and also around the new VVC codec.
… I drafted a reply, which describes WebCodecs, the use cases, and a few indications about our own plans, such as current work on the VideoFrame metadata registry.
… I shared this. Got a thumbs up from Bernard, Jer, Paul.
… I want to make sure that everything we write here is representative.
… I was hoping to get this out before the Christmas break.
… If you haven't had a chance to look at it yet, now would be a good time.

youenn: I like the fact that you state that the group would be open to adding a registration provided there was support from implementors.
… I assume that means user agent implementors?

cpn: That's a question for the group perhaps. H.263 comes to mind for instance.
Dale_Curtis: I don't think that we want to be gatekeepers of what the registry contains, even if there isn't support in web browsers per se.
… We'd still want some technical constraints to be met.

cpn: Right. That would apply to any future registration as well.

WebKit update on Audio focus/audio session API

Slideset: [9]https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf

   [9] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf

[10][Slide 2]

   [10] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf#page=2

Youenn: We received reports that audio handling on iOS isn't easy, e.g., for VC applications.
… The intent of the application may not match our heuristics for setting up the audio pipeline.
… So a new API may be appropriate.
… You might remember the Audio Focus API, initially in Media Session, then split out from that.
… There's an explainer, linked from the slides.
… The overall goal is to get feedback: is the scope right, what are the next steps?
… Compared to the original Audio Focus API, we wanted to reduce scope, for the iOS platform.
… We focused on the audio session category, and interruptions.
… The API should support future features such as requesting or abandoning audio focus.
… Handling audio providers as a group.
… We wrote an explainer, and a prototype in WebKit.

[11][Slide 3]

   [11] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf#page=3

Youenn: Some examples: setting the audio session category. You can open the demo in iOS.
… playAudio and capture functions, for microphone input.
… If you call playAudio initially, then capture, it's disruptive on iOS. The reason is that when you play using Web Audio, it's ambient.
… Two different audio levels when going from ambient to play & record.
… Something we want to avoid.
… The setCategory function allows you to set the category to play & record, so we don't use ambient.

[12][Slide 4]

   [12] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf#page=4

Youenn: On interruption: when you're in a video call, you might receive a phone call, which is higher priority, and the website is interrupted, capture stopped, audio or video elements may be stopped.
… But the website may not know that.
… It's also not clear to the website whether to restart audio after the phone call.
… Providing the concept of an audio session, which can go between active and interrupted, allows the website to change what is visible to the user.
… On an interruption, it could show a UI, or UI to allow the user to restart capture.

[13][Slide 5]

   [13] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf#page=5

Youenn: We tried to keep the API small. There's an audio session state and audio session type. Then we added an AudioSession interface, which we thought was clearer.
… Use that to say it's ambient (mix with others), or play & record, so the UA can set the audio pipeline accordingly.
… There are event handlers, no constructor. For simple use cases, a getter on navigator to get the audio session.
… A default global audio session. Use this object to query or tailor it.

[14][Slide 6]

   [14] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0000/AudioSession_API.pdf#page=6

Youenn: My main interest is not to go into specific issues. More issues are welcome.
… Question: is this of interest, is it going in the right direction? Any thoughts on potential next steps?

Dale: From a Chrome point of view, Mounir and Becca worked on it. At a glance, seems reasonable.
… There might be worry about duplication between Media Session and Audio Session, but no specific thoughts on that.

Youenn: The API shape is different; there might be only one Media Session in a page, but only one Audio Session.
… The call to split the two things in the past is OK.
… We decided to delay the grabbing and releasing of audio focus. There might be other things to consider, e.g., autoplay.
… A question I have: it's not yet submitted in the WG. Is it already in scope?

cpn: Looking at the charter, the Audio Focus API is in the list of potential normative deliverables.
… We just need to run a call for consensus to adopt the spec into the Media WG.

Sushanth: How to handle audio from multiple tabs?

Youenn: This would help with that.

Sushanth: If the audio type requested by one browser is playback, and from another is ambient, only one can exist at a time.

Youenn: You'd mimic what two native applications would do. One session with playback would probably not be interrupted by another that requests ambient.

cpn: At what point would we be ready to run a call for consensus on this?

youenn: If there's already consensus in this call, we'd be interested to run it as soon as possible.
… No particular hurry, but the sooner the better.
… If there's no consensus, we'd like to know what to work on.

cpn: Just worried about support from other browser vendors.

youenn: We talked a bit with Mozilla. I can check with them and get back to you.

alwu: From the Mozilla Firefox perspective, that's an API we'd be interested in supporting as well.

Dale: And no reason to hold off calling for consensus while we figure things out internally.

jernoble: In the meantime, feedback on existing issues is welcome.

cpn: So the proposed resolution is to run a CfC.
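As an aside, the AudioSession shape Youenn describes (a default per-page session exposed as a getter on navigator, with a type, a state, and state-change events) might be used roughly as sketched below. The `navigator.audioSession` names follow the explainer discussed above but are still subject to change, and the decision helper is a hypothetical illustration of the interruption handling from slide 4, not part of the proposal.

```javascript
// Pure helper (hypothetical, for illustration): decide what the page
// should do when the audio session state changes.
// States, per the proposal: "active" | "interrupted" | "inactive".
function onSessionStateChange(prevState, newState, wasCapturing) {
  if (newState === "interrupted" && wasCapturing) {
    // A higher-priority audio user (e.g. a phone call) took the session:
    // pause our UI and offer a way back.
    return "pause-and-show-resume-ui";
  }
  if (prevState === "interrupted" && newState === "active" && wasCapturing) {
    // Interruption ended: let the user explicitly restart capture.
    return "offer-restart-capture";
  }
  return "no-op";
}

// In a page this might be wired up as follows (browser-only, not run here;
// attribute and event names taken from the explainer, may change):
//
//   navigator.audioSession.type = "play-and-record"; // avoid the ambient category
//   let prev = navigator.audioSession.state;
//   navigator.audioSession.onstatechange = () => {
//     const action = onSessionStateChange(prev, navigator.audioSession.state, capturing);
//     prev = navigator.audioSession.state;
//     // ...update the page UI according to `action`...
//   };
```

Setting the type up front replaces the heuristics Youenn mentions: the UA no longer has to guess from playAudio/capture ordering whether the page wants ambient or play & record.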
Consistent SVC metadata between WebCodecs and Encoded Transform API

Slideset: [15]https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf

   [15] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf

[16][Slide 2]

   [16] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=2

[17][Slide 3]

   [17] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=3

[18][Slide 4]

   [18] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=4

[19][Slide 5]

   [19] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=5

Bernard: [going through slides]. Sequence of unsigned long dependencies. There's also some missing information.
… We're essentially re-inventing WebCodecs in another spec, perhaps not the right way to go.
… Two different SVC metadata dictionaries could be avoided.
… Temporal may be shipping in Safari, but spatial is not shipping anywhere.

Dale: I'm in favor of unifying what we can.

Bernard: The proposal is for a few of us to get together and prepare a PR to harmonize things.
… This would at least avoid future issues.
… We made some progress in the last couple of days, and Youenn prepared a bunch of PRs that solved a number of type mismatches.

cpn: Is this something for the WebCodecs spec itself or the metadata registry?

Bernard: This is for encoded metadata, for which we don't have a registry.

Media Pipeline architecture - Media WG input and WebRTC collaboration planning

cpn: Back at TPAC, we identified several places where we may benefit from coordination between groups.
… This is picking up on where we're at with this.

[20][Slide 6]

   [20] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=6

Bernard: We created a Media Pipeline architecture repo following discussions.
… Issues and pointers to sample code covering integration of next generation web media APIs.
… Also to go beyond just the specs we mentioned already, e.g. WebTransport, which could be used to transport media.
… From time to time, it's hard to understand whether there are performance issues in the specs, the implementations, or in the code sample.

[21][Slide 7]

   [21] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=7

Bernard: When I started off, I was thinking about capture with Media Capture and Streams Extensions, then encode/decode with WebCodecs (and also MSE v2 to some extent), transport (WebTransport, WebRTC data channels in workers), and frameworks (WHATWG Streams, WASM).

[22][Slide 8]

   [22] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=8

Bernard: The pipeline model is based on WHATWG Streams, with TransformStreams piped together.
… When you're sending frames, you have several options, e.g. reliable/unreliable, etc.
… To string these pipelines together, you have to use all of these APIs together. Does it all make sense?
… I don't know that many developers who understand all of these APIs.

[23][Slide 9]

   [23] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=9

Bernard: Some issues already created in the repo.

[24]Media Pipeline architecture repo

   [24] https://github.com/w3c/media-pipeline-arch/

Bernard: A lot of the issues are focused on transport.
… There are a few things that are worth discussing here.
… E.g. rendering and timing. Media Capture Transform is an interesting API. Does VideoTrackGenerator have a jitter buffer? Does it not?
… That is not particularly well defined in the spec.

[25][Slide 10]

   [25] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=10

Bernard: We have two samples at the moment. One is a WebCodecs encode/decode in worker sample in the WebCodecs repo.
… The second one adds WebTransport to that. This one took more work to optimize the transport. It adds serialization/deserialization.
… We use frame/stream transport. That's not exactly RTP, but it's close.
… We're using SVC at baseline, and partial reliability.
… Overall, it's working surprisingly well.
… I had to do a reorder buffer, but still not a full jitter buffer.

[26][Slide 11]

   [26] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=11

Bernard: Here are some of the things that you can play with.
… You can play with this stuff. At the end, it generates a Frame RTT graph. That does not really give you glass-to-glass measurements.
… Performance is pretty reasonable now after some work.

[27][Slide 12]

   [27] https://lists.w3.org/Archives/Public/www-archive/2022Dec/att-0002/MEDIAWG-12-13-2022.pdf#page=12

Bernard: The slide shows an example with AV1 at full HD.
… What's interesting is that key frames can be transmitted within a single congestion window.
… The general question is: what do we do with this?

cpn: That's really great to get that practical feedback from building things.

Bernard: Yes, we're seeing a lot of stuff. Similarly, there are a few things where I don't know enough of the internals to understand what needs to be done.
… You have to be cautious of await calls with WHATWG Streams, since they are going to block. Debugging is also hard.

youenn: Note you may use JS implementations of ReadableStream and WritableStream to ease debugging.

Bernard: Good idea. You can get a dozen stages and you don't really know where things are in the different queues. It's not easy to figure out what happens. The code is fairly small though.

cpn: Immediate next step?

Bernard: Adding APIs in multiple groups adds questions. It's worthwhile checking in on this periodically.
… I don't want to act like I have a handle on this.

cpn: OK, we'll talk more about how to improve that cross-group collaboration.
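As an aside, the pipeline model Bernard describes (WHATWG TransformStream stages piped together, with a serialization step before transport) can be sketched as below. In the actual samples the encode stage wraps a WebCodecs VideoEncoder and the sink is a WebTransport stream; here the stage bodies and the 13-byte header layout are invented placeholders for illustration.

```javascript
// "Encode" stage placeholder (a real pipeline would feed a VideoEncoder
// here): mark the first frame as a key frame, the rest as deltas.
function makeEncodeStage() {
  return new TransformStream({
    transform(frame, controller) {
      controller.enqueue({
        type: frame.id === 0 ? "key" : "delta",
        timestamp: frame.timestamp,
        data: frame.data, // Uint8Array payload
      });
    },
  });
}

// Serialize stage: 1-byte type + 8-byte timestamp + 4-byte payload length,
// followed by the payload. The receive side would mirror this layout to
// deserialize. (Illustrative wire format, not the sample's actual one.)
function makeSerializeStage() {
  return new TransformStream({
    transform(chunk, controller) {
      const out = new Uint8Array(13 + chunk.data.byteLength);
      const view = new DataView(out.buffer);
      view.setUint8(0, chunk.type === "key" ? 1 : 0);
      view.setFloat64(1, chunk.timestamp);
      view.setUint32(9, chunk.data.byteLength);
      out.set(chunk.data, 13);
      controller.enqueue(out);
    },
  });
}

// Drive the pipeline: source -> encode -> serialize -> sink.
async function runPipeline(frames) {
  const source = new ReadableStream({
    start(controller) {
      for (const f of frames) controller.enqueue(f);
      controller.close();
    },
  });
  const packets = [];
  await source
    .pipeThrough(makeEncodeStage())
    .pipeThrough(makeSerializeStage())
    .pipeTo(new WritableStream({ write(p) { packets.push(p); } }));
  return packets;
}
```

Because each stage is a self-contained TransformStream, stages can be moved into workers and connected via transferred streams; note, as Bernard cautions above, that an `await` inside a transform callback applies backpressure to the entire pipe.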
cpn: Our next meeting will be in the new year. Happy Christmas, and looking forward to seeing you next year!
Received on Wednesday, 14 December 2022 15:22:33 UTC