[minutes] Media WG call - 2023-04-11

The minutes of yesterday's Media WG call are available at:
https://www.w3.org/2023/04/11-mediawg-minutes.html

Text version below.

The group converged on possible next steps on 3 WebCodecs issues (#656, 
#646, #619), and discussed the idea of extending the scope of the Media 
WG to adopt Document Picture-in-Picture as a potential normative 
deliverable (overall feeling in the call being that the proposal would 
be a better fit for a non-media-focused group).

Thanks,
Francois.

-----
Media WG call
11 April 2023

    [2]Agenda. [3]IRC log.

       [2] 
https://github.com/w3c/media-wg/blob/main/meetings/2023-04-11-Media_Working_Group_Teleconference-agenda.md
       [3] https://www.w3.org/2023/04/11-mediawg-irc

Attendees

    Present
           Bernard Aboba, Chris Needham, Dale Curtis, Eugene
           Zemtsov, Francois Daoust, Jer Noble, Peter Thatcher

    Chair
           Chris, Jer

    Scribe
           cpn, tidoust

Contents

     1. [4]Allow decoder to ignore corrupted frames
     2. [5]Allow configuration of AV1 screen content coding tools
     3. [6]Extend EncodedVideoChunk metadata for SVC
     4. [7]Media WG rechartering
     5. [8]TPAC 2023 joint meetings

Meeting minutes

   Allow decoder to ignore corrupted frames

    Slideset: [9]https://lists.w3.org/Archives/Public/www-archive/
    2023Apr/att-0000/MEDIAWG-04-11-23.pdf

       [9] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf

    [10][Slide 2]

      [10] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=2

    [11][Slide 3]

      [11] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=3

    Bernard: WebCodecs encoder and decoder errors are fatal. It
    queues a task to close the encoder/decoder
    … The issue is: Is there some way to not close?

    <tidoust> [12]#656

      [12] https://github.com/w3c/webcodecs/issues/656

    <ghurlbot> [13]Issue 656 Allow decoder to ignore corrupted
    frames (matanui159) agenda

      [13] https://github.com/w3c/webcodecs/issues/656

    [14][Slide 4]

      [14] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=4

    Bernard: Dale suggested that perhaps the text could clarify
    fatal vs non-fatal
    … The safest thing to do could be to close it

    Dale: All errors are fatal, not sure why I wrote just software
    there

    Bernard: It doesn't interfere with resilience, FEC, redundant
    error coding, etc
    … Discussed if we could have more tests for sending errors
    … Chromium reports the wrong error type

    Dale: We have a bug open to fix that

    Bernard: Is it a bug? Is more info needed in the error?
    … Question: should all errors be fatal? Dale makes a good point
    why they should be
    … Potential issues with security team review if we don't make
    it fatal

    Dale: If the author wants to handle the error and resume the
    decoding, that could be for them to decide

    Bernard: But they can't if it's closed

    Dale: They can create a new decoder

    Bernard: Yes. That would require a keyframe
    … Second question: There are various reasons why you could get
    an error. Hardware decoders may error where a software decoder
    wouldn't
    … Hardware resources could be acquired by another app.
    Reconfigure with prefer-hardware could then fail, then you'd
    have to fall back to software
    … Paul asked what's the difference between reset and close, and
    impact on performance?

    Bernard: Any opinions? I've had developers ask whether there's
    truly been a decoder error, or something else, such as a GPU
    crash
    … Does only having EncodingError provide enough info?

    Dale: We're limited on the information available to us. Where a
    software decoder is more permissive it's in a way non-compliant
    to the spec
    … I thought we decided among editors that it should be fatal

    Bernard: Is there any objection in the WG to that?

    (none)
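    A minimal sketch of the recovery path discussed above: since
    all errors are fatal and close the decoder, an app resumes by
    creating a new decoder and replaying from a keyframe. The
    helper names and buffering strategy here are illustrative, not
    from the spec:

```javascript
// Keep a rolling buffer of chunks since the most recent keyframe, so
// that a replacement decoder can be primed after a fatal error.
// (Illustrative helper, not part of the WebCodecs spec.)
function bufferSinceKeyframe(buffer, chunk) {
  return chunk.type === "key" ? [chunk] : buffer.concat(chunk);
}

// On a fatal error, recreate the decoder and replay buffered chunks.
// Assumes a browser context where the WebCodecs VideoDecoder exists.
function recoverDecoder(config, buffered, output, onError) {
  const decoder = new VideoDecoder({ output, error: onError });
  decoder.configure(config);
  for (const chunk of buffered) decoder.decode(chunk);
  return decoder;
}
```

    As Bernard notes, recreating the decoder only helps if decoding
    can restart from a keyframe, hence the rolling buffer.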

    [15][Slide 5]

      [15] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=5

    Bernard: So what to do next? Reconfiguring with prefer-hardware
    could fail.

    Dale: Developers would have to handle it either way. We could
    provide more docs advising the use of a more professional
    analysis tool
    … or ffmpeg

    Bernard: Is EncodingError right?
    … Any other changes to the spec?

    Dale: No spec change, just MDN documentation improvements. My
    team have been working on that

    Eugene: Could add a new optional exception such as corrupted
    stream or corrupted chunk, to say it's something to do with the
    stream
    … and not some kind of infrastructure issue underneath

    Bernard: That would be helpful though
    … Developers would appreciate that. Would it be an error
    message inside EncoderError?

    Eugene: That error type doesn't sound right

    Dale: I'm not sure why we didn't add that. Technically it's an
    error in the encoding...

    Bernard: Not sure it's a requirement to change the type, but
    the extra info would be useful

    Eugene: If we can see the data is noncompliant, and maybe use
    an OperationError when it's a GPU crash or out of resources,
    that would be a useful distinction
    … Hardware decoders don't always give the reason for error
    though

    Bernard: Next step would be to see if that's feasible and
    prepare a PR if so

    Eugene: I can do that

   Allow configuration of AV1 screen content coding tools

    [16][Slide 6]

      [16] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=6

    <tidoust> [17]#646

      [17] https://github.com/w3c/webcodecs/issues/646

    <ghurlbot> [18]Issue 646 Support for Screen Content Coding
    (aboba) PR exists

      [18] https://github.com/w3c/webcodecs/issues/646

    Bernard: We've added a PR to initialise the AV1 quantizer

    <tidoust> PR [19]#662

      [19] https://github.com/w3c/webcodecs/issues/662

    <ghurlbot> [20]Pull Request 662 Enable configuration of AV1
    screen content coding tools (aboba)

      [20] https://github.com/w3c/webcodecs/issues/662

    Bernard: This PR adds a boolean for forceScreenContentTools.
    Default is false
    … when true, it sets the AV1 seq_force_screen_content_tools
    flag, and the encoder then uses the palette and intra block
    copy tools

    [21][Slide 7]

      [21] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=7

    Bernard: The PR adds this boolean attribute, default false,
    with an explanation

    Eugene: Difference with the other PR?

    Bernard: I rebased it due to a merge conflict

    Eugene: Another one proposed adding the flag to
    VideoEncoderConfig. The distinction is you configure it once,
    but now you'd set it per-frame

    Bernard: Quantizer is per-frame as well

    Eugene: So need to decide where it goes: per frame or not

    Bernard: Should be per frame. Content could change during
    screen capture, e.g., go from slides to a video presentation
    where you'd disable screen content tools
    … I closed the other PR

    Eugene: It belongs per-frame. I checked libaom, it does allow
    setting per-frame

    Bernard: It can change, but how you know to change it is a
    different story...

    Dale: Quantizer makes sense as an encoder setting, but the
    screen tools feel like a per-frame metadata thing
    … Then under the hood we'd automatically do the right thing

    Bernard: WebRTC does it that way, by checking whether it comes
    from a screen - but could be a game or sports event which are
    not amenable to screen content tools

    Eugene: IIRC, we'll have the same for VP9

    Bernard: I guess so, the AV1 tools are more sophisticated.
    Would that use the same kind of parameter?

    Eugene: As far as I know there's a global setting, not per
    frame

    Bernard: HEVC has screen content tools, but it's hardware only,
    so doesn't make sense to add it, as it wouldn't be used
    … Looking at the WebRTC code, it mostly changed the quantizer

    Chris: So proposed resolution is to add this per-frame for AV1,
    then consider VP9 separately. It's very much codec-specific
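    A sketch of the per-frame shape just resolved, assuming the
    flag ends up in the per-frame encode options as PR #662
    proposes; the exact dictionary path used here is an assumption
    of this sketch, not spec text:

```javascript
// Choose per-frame AV1 encode options: force screen content tools
// (palette, intra block copy) only for screen-like frames.
// The option name follows PR #662; its placement under a per-frame
// `av1` member is an assumption of this sketch.
function encodeOptionsFor(isScreenContent) {
  return { av1: { forceScreenContentTools: isScreenContent } };
}

// Usage with a WebCodecs encoder (browser-only, not executed here):
// encoder.encode(frame, encodeOptionsFor(captureIsSlides));
```

    Flipping the flag per frame matches Bernard's example of a
    screen capture switching from slides to embedded video.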

   Extend EncodedVideoChunk metadata for SVC

    [22][Slide 8]

      [22] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=8

    <tidoust> [23]#619

      [23] https://github.com/w3c/webcodecs/issues/619

    <ghurlbot> [24]Issue 619 Consistent SVC metadata between
    WebCodecs and Encoded Transform API (aboba) agenda, PR exists

      [24] https://github.com/w3c/webcodecs/issues/619

    <tidoust> PR [25]#654

      [25] https://github.com/w3c/webcodecs/issues/654

    <ghurlbot> [26]Pull Request 654 Extend
    EncodedVideoChunkMetadata for Spatial SVC (aboba)

      [26] https://github.com/w3c/webcodecs/issues/654

    Bernard: For background: In WebCodecs we support temporal
    scalability, and WebRTC supports temporal and spatial
    scalability
    … In the WebRTC encoded transform, we provide metadata for
    these encoded frames
    … WebCodecs has encoded chunk metadata, but there's a mismatch -
    if you want spatial scalability you need to use the WebRTC API
    … Since it's in WebRTC, why not bring it also to WebCodecs?

    [27][Slide 9]

      [27] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=9

    Bernard: First question is compare WebCodecs and WebRTC APIs.
    WebCodecs API is more structured, not just one big blob of
    stuff as in WebRTC

    [28][Slide 10]

      [28] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=10

    Bernard: The SVC sub-dictionary design has a frame number
    (unsigned short), not the same as frame id
    … frameId is a globally unique id for the frame
    … It's something you want to serialize on the wire, so the
    sender and receiver could want the information. Frame number is
    a modulo 2^16 of the frame id, to use less space in the wire
    serialization
    … When you describe dependencies you are referencing a series
    of frame numbers
    … In real life you typically don't have 2^16 or 2^32 frame-ago
    dependencies
    … Different names compared to the WebRTC version
    … Decode targets and chain links. A forwarder keeps state,
    including the frame rate and spatial resolution for a
    particular client. This determines what layers I forward to
    that client
    … It's useful to have this from the encoder, as the forwarder
    has state about the target, and can compare it against the
    frame itself
    … It makes it possible for the forwarder to quickly decide
    whether to forward or not
    … Protect the WebCodecs decoder against things that would cause
    a decoder error
    … Chain links. If you get a frame as a receiver, with
    dependencies, is it true that if I submit it to the WebCodecs
    decoder I should get an error?
    … The dependencies might have dependencies? So you'd still get
    a decoder error
    … Chain links look at the whole chain of dependencies, and see
    if you'd get an error
    … Easier for the encoder to send the data than for the receiver
    to calculate the dependency graph, and avoid duplicate work
    across each client
    … We're thinking this should go in the SVC dictionary
    … One thing is there's no unsigned short, just unsigned long.
    Is this headed in the right direction?
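    The frame-number arithmetic described above can be sketched as
    follows (helper names are illustrative; frame number is the
    frame id modulo 2^16 so it fits an unsigned short on the wire):

```javascript
// Wire-compact frame number: frame id modulo 2^16.
function frameNumber(frameId) {
  return frameId % 65536;
}

// Resolve a dependency's frame number back to the most recent frame id
// at or before `currentId`. Valid because, in practice, dependencies
// never reach 2^16 frames into the past.
function resolveDependency(depNumber, currentId) {
  const delta = (frameNumber(currentId) - depNumber + 65536) % 65536;
  return currentId - delta;
}
```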

    Eugene: As a superficial comment, we should make everything
    unsigned long. It wouldn't cost anything

    Bernard: Agree

    Eugene: I'm not sure when this would be implemented

    Bernard: It's in the Chromium code base. The SVC modes and
    dependency descriptor are there

    Dale: I think we'd need to check what our encoders produce. We
    should get one software encoder working before landing the spec
    change

    [29][Slide 11]

      [29] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=11

    [30][Slide 12]

      [30] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=12

    [31][Slide 13]

      [31] 
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf#page=13

    Bernard: I think it's possible to enable it. Next steps?

    Dale: Can you share a link to the Chromium code?

    Bernard: Will do

    Chris: Prototyping as Dale suggested?

    Bernard: It's in the WebRTC spec; the concern is about the
    alternative version showing up in WebRTC

    Dale: I feel like having it working end to end, either WebRTC
    or WebCodecs, would be good

    Bernard: If the WG approves the approach, could follow it up in
    WebRTC

    Dale: Would prefer to see it working first though

    Eugene: We only have the temporal layer ID, as that's the only
    thing implemented, and we can add more dictionary entries when
    we're happy

    Bernard: We could submit a PR to remove from WebRTC, then when
    it's implemented add it in both places
    … It's dangerous for both to be out of sync

    Chris: Has this shipped in WebRTC or is it only a spec change
    for now?

    Bernard: Would have to check

    Chris: Happy to organise joint meeting discussion between both
    groups

    Chris: So check if shipped, get working end to end in either
    WebRTC or WebCodecs, then ensure specs are consistent across
    both

   Media WG rechartering

    Repository: w3c/charter-media-wg

    [32]#38

      [32] https://github.com/w3c/charter-media-wg/issues/38

    <ghurlbot> [33]Issue 38 Document Picture-in-Picture
    (steimelchrome)

      [33] https://github.com/w3c/charter-media-wg/issues/38


    cpn: Open question related to rechartering and the Document
    Picture-in-Picture.
    … The Media WG was suggested in the TAG review as venue for
    Recommendation track progress.
    … Suggestion at this stage would not be that it become a WG
    deliverable but rather a potential normative deliverable as
    we've done in the past with the Audio Focus API.
    … With the goal being to avoid rechartering when spec is ready
    to enter the Recommendation track.
    … Question here is whether we should add the document to our
    scope as a potential normative spec.
    … François pointed out that this would change the scope of the
    WG a little, because the group is focused on media related
    features.
    … Document PiP is broader in scope.
    … While it has interest from media companies in using it for
    media content, it's not restricted to that at all.
    … Question is: is the Media WG the right venue to continue work
    on the spec?
    … In favor: Picture-in-Picture is developed by the Media WG
    … I have a concern about the broader scope though.

    jernoble: The original media element supported fullscreen mode.
    The Fullscreen API is a more general purpose API. I think that
    is a WHATWG deliverable and that there are lots of parallels
    there.
    … As an implementer, I don't know that working on the spec in
    the Media WG makes sense.

    cpn: I think François suggested an alternate home for the spec,
    WebApps WG.
    … I don't think there's a barrier to landing that on the
    Recommendation track somewhere.
    … I'll just start an email thread with Tommy and chairs to
    thrash this out.
    … We don't want to hold up too much on that because draft
    charter is mostly ready otherwise.

    Dale: Most use cases are media-related but argument and
    comparison with Fullscreen makes sense to me.

    jernoble: If it becomes necessary to integrate with other
    specs, such as CSS, etc. it also makes sense to use a group
    that's more used to doing that.

    cpn: I agree. My proposal would be not to do it here.

    jernoble: More importantly, I think it would be even more
    successful in another group.

   TPAC 2023 joint meetings

    cpn: I think what I'm going to propose for our group is similar
    to last year: full morning or full afternoon to go through
    discussions. The other thing I'm interested in exploring is
    joint meetings.

    Bernard: Ongoing poll to figure out who's going to show up
    in-person. If not enough people, my sense is that W3C
    guidelines are not to request TPAC time.
    … I'll be remote.

    cpn: I'm planning to be there in-person.

    tidoust: I wouldn't worry too much about the requirement for
    number of people. The venue is large enough

    Bernard: Joint meeting with WebRTC will be useful. Lots of
    ongoing discussions.

    cpn: I'll follow-up via email on the specifics of that.

Received on Wednesday, 12 April 2023 09:01:23 UTC