[minutes] Media Working Group Teleconference on WebCodecs - 2021-06-15

Hi all,

The minutes of today's call on WebCodecs are available at:

https://www.w3.org/2021/06/15-mediawg-minutes.html

... and copied as raw text below.

Main outcomes from discussions (see minutes and issues for details):

- Should WebCodecs be exposed in Window environments? (#211)
Chairs will issue a Call for Consensus on whether to defer decision on 
exposing main WebCodecs interfaces in Window context.

- Is exposing 
https://w3c.github.io/webcodecs/#enumdef-hardwareacceleration a good 
idea (#239)
We'll get back to PING to make sure that they are aware of it.

- Should ImageDecoder IsTypeSupported be (a)synchronous? (#213)
Lower priority and less stuck than the other two issues. Jer to align 
with Youenn and get back to the group.

Thanks,
Francois.

-----
Media WG - call on WebCodecs
15 June 2021

    [2]Agenda. [3]IRC log.

       [2] 
https://github.com/w3c/media-wg/blob/main/meetings/2021-06-15-Media_Working_Group_Teleconference-agenda.md#agenda
       [3] https://www.w3.org/2021/06/15-mediawg-irc

Attendees

    Present
           Chris Cunningham, Chris Needham, Dale Curtis, Dan
           Sanders, Eric Carlsson, Francois Daoust, Gary Katevman,
           Jan-Ivar Bruaroey, Jer Noble, Peng Liu, Youenn Fablet,
           Zachary Cava

    Chair
           Chris Needham, Jer Noble

    Scribe
           Francois

Contents

     1. [4]Issue #211 - Should WebCodecs be exposed in Window
        environments?
     2. [5]Issue #239 - Is exposing https://w3c.github.io/
        webcodecs/#enumdef-hardwareacceleration a good idea
     3. [6]Issue #213 - Should ImageDecoder IsTypeSupported be
        (a)synchronous?
     4. [7]Next meeting
     5. [8]Summary of action items

Meeting minutes

    cpn: Follow-up discussion to last week's call. Let's continue
    the discussion on window/worker first.

   Issue #211 - Should WebCodecs be exposed in Window environments?

    [9]Issue #211

       [9] https://github.com/w3c/webcodecs/issues/211

    cpn: I'd like to briefly recap where I think we are in terms of
    discussion and see if there's progress we can make during this
    call to find consensus.
    … I see both sides of the arguments. Strong arguments in either
    direction. I'd like to separate ImageDecoder from audio/video.
    Do the same concerns apply to image decoders?
    … Do we see that there is any potential risk to exposing image
    decoders on window?

    chcunningham: From our side, there are no specific concerns.

    youenn: In general, image decoding is a one-time operation.
    Being delayed by 100ms is probably fine. I have fewer concerns
    there. I haven't really looked at the API.

    dalecurtis: If we have no concern for Image decoders, we should
    clarify what our concerns are for the rest, as image decoding
    is at the basis of decoding p-frames and the like.
    … The APIs don't necessarily belong together but if we feel
    there's no issue with image decoder, then I question what our
    issues are with the rest.

    jib: When people talk about low latency, if there's motion
    involved, would you rather have constant performance, or jank?

    dalecurtis: First paint latency, that's low latency that I had
    in mind in this issue.
    … If we want devs to improve first paint latency, then workers
    can be a hindrance, time taken to set things up.

    cpn: Setup cost would be a one time thing, right?

    dalecurtis: Yes. For multi-frame cases, the overhead per frame
    is the time to get to your worker and execute an async
    operation.
    … For single-frame cases, the setup cost is the main cost.

    <dalecurtis> [10]https://github.com/w3c/webcodecs/issues/
    211#issuecomment-860981641

      [10] 
https://github.com/w3c/webcodecs/issues/211#issuecomment-860981641
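
    The distinction dalecurtis draws between one-time setup cost
    and per-frame overhead can be written as a simple cost model.
    This is an illustrative sketch only: the `WorkerCosts` fields
    and the numbers passed to it are placeholders chosen for the
    example, not measurements reported on the call.

```typescript
// Hypothetical cost model for decoding in a worker instead of on window.
// Field names and any numbers passed in are assumptions for illustration.
interface WorkerCosts {
  setupMs: number; // one-time cost: start the worker, load and init code
  perFrameHopMs: number; // per-frame cost: reach the worker and hop back
}

// Total latency added by the worker for a given number of frames.
function addedLatencyMs(costs: WorkerCosts, frameCount: number): number {
  return costs.setupMs + costs.perFrameHopMs * frameCount;
}

// Fraction of the added latency that is due to setup alone.
function setupShare(costs: WorkerCosts, frameCount: number): number {
  const total = addedLatencyMs(costs, frameCount);
  return total === 0 ? 0 : costs.setupMs / total;
}

// Placeholder numbers: 50 ms setup, 1 ms per frame.
const placeholderCosts: WorkerCosts = { setupMs: 50, perFrameHopMs: 1 };
console.log(setupShare(placeholderCosts, 1)); // single image: setup dominates
console.log(setupShare(placeholderCosts, 1000)); // long stream: hops dominate
```

    With real measurements in place of the placeholders, the same
    two functions would show which of the two costs matters for a
    given use case.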

    chcunningham: The core of Apple's concern is performance and
    whether developers can be trusted to make the right calls for
    their app.
    … Two sub-issues: whether the performance issue is real and whether
    developers can be trusted.
    … The developers that we've spoken to so far have shared an
    awareness of the API and inner workflow.
    … Many of them have done just workers, or just window, or both.
    … We're basing our conclusions on the experience that
    developers have.
    … We are also arguing that the performance issues have not
    materialized in our experience.
    … One final point is that there is tension on APIs that are or
    are not exposed to workers. There are draft specifications for
    things such as MediaStreamTrack, but the work is very much
    still at early stages.

    jib: It's not about not trusting the devs, but more about
    having the right defaults for the devs.
    … In the past where there have been better models, there has
    been a lot of friction towards moving to the better but
    slightly more complex model. The better model needs to be
    imposed somehow.
    … Every feature we add in the insecure model removes some
    incentive to move to the default secure model.
    … So, rephrasing as "better defaults". We're not sure that the
    cost of abandoning better defaults is worth the benefits.

    cpn: It seems to me that there are many APIs that could be used
    in conjunction with WebCodecs that are not available on workers
    today. To what extent is that blocking?

    chcunningham: There aren't applications that cannot be built
    today and that could be built in the future when these APIs are
    exposed to workers.
    … The problem is performance, as you'd be using postMessage
    throughout between main thread and worker.
    … The complexity of being in a worker is a technical pain that
    we're hearing a lot from developers.

    cpn: This leads me to a comment from Dale. If we were to
    restrict the API to workers, under what conditions would we
    revisit that decision? What information beyond what we
    currently know do we need to see to inform a future decision on
    window?

    youenn: If we start with only workers, it's possible that
    someone writes a JS shim that would allow doing things on the
    main thread, paying the cost of postMessage.
    … We can look at inefficiencies with that approach and if the
    cost is really high. If it is, it may delay or forbid some
    applications.
    … Currently, exposing to window does not provide any new feature.
    Having measurements would be useful. Gathering them would
    greatly help to evaluate whether exposing to window is very
    useful or marginally useful.
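
    The kind of shim youenn describes could look roughly like the
    sketch below: a main-thread object that forwards each decode
    request to a worker and resolves when the reply comes back, so
    every call pays a postMessage round trip. `PortLike`,
    `MainThreadDecoderShim` and the message fields (`type`, `id`,
    `chunk`, `result`) are hypothetical names invented for this
    example; the worker side, which would own the actual decoder,
    is not shown.

```typescript
// Hypothetical minimal port shape standing in for a dedicated worker.
interface PortLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: any }) => void) | null;
}

// Main-thread facade: each decode() call is one hop to the worker and back.
class MainThreadDecoderShim {
  private nextId = 0;
  private readonly pending = new Map<number, (result: unknown) => void>();
  private readonly port: PortLike;

  constructor(port: PortLike) {
    this.port = port;
    // Replies are matched to requests by the id echoed back by the worker.
    port.onmessage = (event) => {
      const { id, result } = event.data;
      const resolve = this.pending.get(id);
      if (resolve !== undefined) {
        this.pending.delete(id);
        resolve(result);
      }
    };
  }

  // Number of requests still waiting for a reply from the worker.
  get pendingCount(): number {
    return this.pending.size;
  }

  // Forwards one encoded chunk; resolves with whatever the worker sends back.
  decode(chunk: ArrayBuffer): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      this.port.postMessage({ type: "decode", id, chunk });
    });
  }
}
```

    Timing the interval between `decode()` and the resolution of
    its promise is one way to gather the measurements asked for
    above.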

    dalecurtis: We have this origin trial, several developers
    speaking about their experience and performance.
    … You're saying that this is not enough data.

    chcunningham: I want to also mention how effective the strategy
    can be. If people start using the shim instead of switching to
    workers, they will pay a higher cost.
    … Jan-Ivar knocked that down with "we can remove the library if
    that happens". But once a library is out there, it's out there.

    jib: The first thing I would point out. Possible extreme
    outcomes: On one hand, people will not shim and abandon. On the
    other hand, putting the default to worker will be too hard and
    people will always use the shim.
    … I'm hoping that we can end up in the middle.
    … Problem is if we expose on window today, we cannot undo this
    later on.
    … Not exposing to window provides some incentive to move to
    workers right now, giving us some way to measure things
    properly.

    chcunningham: There are many cases for WebCodecs that are not
    in the WebRTC space, e.g. video editor.
    … We don't know how app developers will handle this for sure.
    The starting point is often that you find a library and if it
    addresses some friction such as working around the need to use
    a worker, then you'll go with it regardless of whether that is
    a good design decision.

    jib: The Web is full of dead libraries. As soon as browsers say
    "we made a mistake, we can expose to window", then libraries
    that use postMessage will disappear.

    chcunningham: My fear is if we continue the position that
    exposing to window is problematic, then the library will
    continue to be used.

    jib: If we fail to convince people, then we should expose to
    window.

    youenn: I agree.
    … [gives some WebRTC example for comparison]
    … In WebRTC, I hope that we can use the same codepath for
    WebRTC or WebTransport + WebCodecs.
    … It's difficult as we're in the design phase.
    … If we screw things up today, then we're effectively screwed.
    Hard to rollback.

    chcunningham: I find the idea of making a totally different API
    for window than the one for workers totally unappealing.
    … 35 minutes in the call, I feel we're still looping.
    … I would like to encourage the chairs to escalate this to a
    CfC.

    jib: In the interest of new ideas, I thought there was an issue
    on real-time mode. Just off the top of my head, could realtime
    mode be only exposed in worker and non realtime mode be exposed
    to worker and window?
    … Exposing the API to workers has consensus within the group.
    … I'm hoping that the CfC is not about that, but focused on
    exposure to window.

    cpn: I put forward a proposed resolution in the thread at the
    end of last discussion. Expose to window + provide developer
    guidance. If we move towards a vote, would we be happy for us
    to use that phrasing?
    … We could phrase it in the opposite direction. Defer
    exposure on window until we have clearer experience and
    feedback.

    jib: I would prefer that second formulation, as it leaves the
    door open.

    chcunningham: Just to clarify, I'm not asking for a realtime
    CfC, but one sent to the group's mailing-list.

    jib: Given the lack of documentation, the shipping calendar in
    Chrome is aggressive.

    youenn: I also prefer Jan-Ivar's formulation, which is clearer.
    "Defer or now" makes it clearer that we're not asking anyone to
    take a strong position on exposing to window now.
    … Ship it now or defer the decision to later.

    cpn: I would like to see some criteria on what the conditions
    would be for a later decision.

    jib: We could say use cases below a certain threshold of ms but
    I think that we will know if we've made the wrong decision in a
    year from now.
    … That's also the benefit of deferring. If there is a lack of
    consensus, process-wise, that will create issues down the line.

    cpn: Personally, I'm sympathetic to that point of view. Having
    a decision that closes the door. Something that allows us with
    new information to revisit this would be preferable.
    … Chris, would you be ok with that formulation?

    chcunningham: Yes, as long as my vote can be "Do not defer"
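
    Whatever the outcome of the CfC, code that runs in both window
    and worker contexts would probably need to feature-detect which
    interfaces are exposed where, especially if ImageDecoder and
    the audio/video interfaces end up treated differently. A
    minimal sketch, in which `ExposureReport` and
    `describeExposure` are hypothetical helper names:

```typescript
// Reports which WebCodecs interfaces are exposed on a given global scope.
interface ExposureReport {
  videoDecoder: boolean;
  imageDecoder: boolean;
}

function describeExposure(scope: Record<string, unknown>): ExposureReport {
  return {
    // Interface objects are constructors, so typeof yields "function".
    videoDecoder: typeof scope["VideoDecoder"] === "function",
    imageDecoder: typeof scope["ImageDecoder"] === "function",
  };
}

// In a page or in a worker, pass the current global object.
const exposureHere = describeExposure(
  globalThis as unknown as Record<string, unknown>,
);
console.log(exposureHere);
```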

   Issue #239 - Is exposing [11]https://w3c.github.io/
   webcodecs/#enumdef-hardwareacceleration a good idea

      [11] https://w3c.github.io/webcodecs/#enumdef-hardwareacceleration

    [12]Issue #239

      [12] https://github.com/w3c/webcodecs/issues/239

    chcunningham: Starts with Zoom and others saying "we really
    want to use your hardware codecs, and we really do not want to
    use your software codecs".
    … The difficulty is to expose that at the right granularity for
    privacy-preserving reasons.
    … [summarizes discussion on the issue]
    … Last comment from someone from Zoom. The takeaway is that
    these people have their own set of features and that they need
    to use hardware codecs.

    jernoble: Chair hat off. We have decided in past discussions
    not to expose hardware details.
    … To me personally, that does not seem to be enough of a use
    case to warrant exposing them today.
    … Hardware or power efficient: For instance, Zoom could come to
    us and say that the Intel H264 chipset has a bug and that they
    want to know precisely where it is deployed.
    … We do not want to expose that.

    chcunningham: Zoom and VLC have their own implementation of
    codecs. There is the case of avoiding a bug as you suggest.
    That is not the only case.
    … There's also the "I know my code, I prefer to use my stuff
    unless you can guarantee me that you can give me hardware
    acceleration"
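
    The "use my own codec unless hardware is available" pattern
    could be sketched as below. The enum values are those of the
    published WebCodecs specification and may differ from the draft
    discussed on this call. `chooseDecoder` and `SupportProbe` are
    hypothetical names, and the sketch assumes that a positive
    answer to a "prefer-hardware" query is good enough, although
    the value is only a hint and does not guarantee hardware
    decoding.

```typescript
// Values of the HardwareAcceleration enum in the published specification;
// the draft under discussion may have used different names.
type HardwarePreference =
  | "no-preference"
  | "prefer-hardware"
  | "prefer-software";

interface SupportProbeResult {
  supported?: boolean;
}

// Injected so that the sketch does not depend on browser globals.
// In a browser: (config) => VideoDecoder.isConfigSupported(config)
type SupportProbe = (config: {
  codec: string;
  hardwareAcceleration: HardwarePreference;
}) => Promise<SupportProbeResult>;

// Returns "browser" when the hardware-preferring configuration is reported
// as supported, and "app" (the application's own codec) otherwise.
async function chooseDecoder(
  codec: string,
  probe: SupportProbe,
): Promise<"browser" | "app"> {
  try {
    const result = await probe({
      codec,
      hardwareAcceleration: "prefer-hardware",
    });
    return result.supported === true ? "browser" : "app";
  } catch {
    // A failing probe (for instance, API not exposed) means falling back.
    return "app";
  }
}
```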

    youenn: They probably care more about power efficiency than
    hardware acceleration. Low-resolution, they may prefer their
    implementation. For HD, they will likely prefer hardware
    decoding.
    … Media Capabilities gives you that.
    … There should be a strong motivation before increasing the
    fingerprinting surface.
    … In that specific case, there are heuristics that you can use
    and we should clarify why this is not good enough.
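
    The Media Capabilities heuristic youenn refers to could be
    sketched as follows. `decodingInfo()` and the `supported`,
    `smooth` and `powerEfficient` fields are part of the Media
    Capabilities API; the `...Like` interfaces and the two helper
    functions are hypothetical names, and the decision rule is an
    assumption made for the example.

```typescript
// Subset of the result of navigator.mediaCapabilities.decodingInfo().
interface DecodingInfoLike {
  supported: boolean;
  smooth: boolean;
  powerEfficient: boolean;
}

interface VideoQuery {
  contentType: string;
  width: number;
  height: number;
  bitrate: number;
  framerate: number;
}

// Structural stand-in so that the sketch runs without browser globals.
interface MediaCapabilitiesLike {
  decodingInfo(config: {
    type: "file";
    video: VideoQuery;
  }): Promise<DecodingInfoLike>;
}

// Assumed rule: only rely on the browser decoder when it is reported as
// supported, smooth and power efficient for this configuration.
function preferBrowserDecoder(info: DecodingInfoLike): boolean {
  return info.supported && info.smooth && info.powerEfficient;
}

// In a browser, pass navigator.mediaCapabilities as `caps`.
async function shouldUseBrowserDecoder(
  caps: MediaCapabilitiesLike,
  video: VideoQuery,
): Promise<boolean> {
  const info = await caps.decodingInfo({ type: "file", video });
  return preferBrowserDecoder(info);
}
```

    An application could call this once for a low-resolution
    configuration and once for HD, and pick its own implementation
    where the answer is false.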

    chcunningham: My issue with that argument is that there are two
    pieces. The stack for WebCodecs is not necessarily the same as
    the stack for RTC.
    … If the gain is 1% of increased fingerprinting, then it's not
    much.

    youenn: That should be brought to PING, who may look at it and
    realize that it is for a restricted set of users and push back.

    chcunningham: PING reviewed the specification and did not raise
    anything on this topic.
    … 1% was just to counter your 99%.

    youenn: Fingerprinters will use that API. There are good
    reasons to believe that this will be the main usage of
    WebCodecs. That is the case with WebRTC and that is very sad.

    chcunningham: I think that this is very speculative.
    … The main usage of Media Capabilities is not fingerprinting
    for instance.
    … Using MediaRecorder, apps can create a hash of the hardware.

    youenn: And that is fine. If the bits that MediaRecorder creates
    create an issue, we can address that later on and improve the
    output.
    … The specific boolean is exposing new information that is not
    currently exposed. Zoom would probably manage to come up with a
    good heuristic.

    chcunningham: If they can find a good heuristic, then the
    boolean does not really increase the fingerprinting surface.
    … The mitigations exist.

    youenn: If we start to do that mitigation, then many web sites
    will expect things to work in Safari and they won't.

    chcunningham: This is not true. Developers will have to be
    prepared for the fallback.

    youenn: The user may not be served well as the decision may
    then be per browser with less optimal decisions taken in some
    browsers.
    … I don't want Safari to be more privacy-preserving, I want the
    APIs to be privacy-preserving with Safari implementing them as
    planned.

    jib: I agree with Youenn that Web specs need to increase
    interoperability and to do that, we need to create
    abstractions.
    … If we expose hardware, we cannot compete on privacy since
    privacy is no longer part of the abstraction.
    … PING raised a similar comment on WebRTC Stats. They may have
    missed that in WebCodecs, but we should probably raise that
    with them.
    … It would be good to close the loop with PING on that.

    <cpn> [13]https://www.w3.org/blog/2019/06/
    privacy-anti-patterns-in-standards/

      [13] 
https://www.w3.org/blog/2019/06/privacy-anti-patterns-in-standards/

    cpn: Tess dropped a link the other day to a post by someone
    from PING arguing that a new feature should not point to
    existing features as a justification for increasing the
    fingerprinting surface, as it prevents reducing the surface
    down the road.

    chcunningham: If we believe that this can be solved with
    existing features, then we are effectively locking ourselves
    into providing reliable information for Media Capabilities.

    Action: chcunningham to reach out to PING regarding hardware
    acceleration

   Issue #213 - Should ImageDecoder IsTypeSupported be (a)synchronous?

    [14]Issue #213

      [14] https://github.com/w3c/webcodecs/issues/213

    chcunningham: Media Capabilities is a departure from previous
    APIs in that we've been doing things asynchronously.
    … The argument for making ImageDecoder asynchronous is that
    image codecs are increasingly video. We may in the future rely
    on hardware systems, using background process, etc.
    … So we propose to make the method asynchronous
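
    From a developer's perspective, the asynchronous shape proposed
    here means that a support check has to be awaited rather than
    used inline. A minimal sketch, assuming a promise-returning
    check; `firstSupportedType` and `AsyncTypeCheck` are
    hypothetical names:

```typescript
// Injected check so that the sketch runs without browser globals.
// With the asynchronous shape proposed above, a browser caller could pass:
//   (type) => ImageDecoder.isTypeSupported(type)
type AsyncTypeCheck = (mimeType: string) => Promise<boolean>;

// Returns the first candidate MIME type reported as supported, if any.
async function firstSupportedType(
  candidates: string[],
  isTypeSupported: AsyncTypeCheck,
): Promise<string | undefined> {
  for (const type of candidates) {
    // Each check is awaited in turn; a synchronous API would allow a
    // plain candidates.find(...) instead.
    if (await isTypeSupported(type)) {
      return type;
    }
  }
  return undefined;
}
```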

    jernoble: Chair hat off again. In Safari, all of these
    questions are answered in a separate process and that's fine.
    … Making things synchronous is not that big of a deal.
    … That is just one implementer's experience of jumping to
    another process.
    … It reminds me of the discussion on the autoplay policy API.
    … I feel that it is not justifiable intrinsically that it gets
    done asynchronously.
    … A lot of things would be easier in my daily life if APIs were
    more synchronous.
    … But that is not what we should care about. Implementers are
    way down in the chain: end users, then dev, then...
    implementers.
    … I don't see the justification for imposing this on
    developers.

    cpn: Is it the same criteria as for the Autoplay Policy API?

    jernoble: From my point of view, that is the same situation.
    … Same arguments, same conclusion.

    chcunningham: I wasn't part of the autoplay discussion.
    … It surprises me that Safari would want to hide an
    asynchronous API under the hood from developers. Being
    transparent about asynchronicity seems a good thing for
    developers to improve their performance.

    jernoble: We just solve this with caching.
    … XPC != asynchronous
    … When I say that something happens in another process, it is
    not necessarily intrinsically asynchronous.

    jernoble: I know that Youenn and I disagree here. Strictly
    speaking for myself, this does not pass muster for making
    something asynchronous.
    … When autoplay came up before, the async keyword was not
    available in all platforms that were being targeted.
    … Nested promises were a cost.

    chcunningham: About caching?

    jernoble: Talking specifically about canPlayType. We don't
    cache specific responses about codec strings, but we do cache
    responses about container types, and that maps pretty well with
    the API space that we're targeting here.
    … These can be cached.
    … We push the answer to every process around. We
    pay the cost once and it is never paid again.
    … That is an implementation detail that may not apply to other
    implementations.
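
    The "pay the cost once" idea can be illustrated with a small
    memoizing cache keyed by container type. This is a generic
    sketch, not a description of Safari's implementation:
    `ContainerSupportCache` and the injected `query` function,
    which stands in for the expensive cross-process lookup, are
    hypothetical.

```typescript
// Caches synchronous support answers per container type, so that the
// expensive lookup runs at most once for each distinct type.
class ContainerSupportCache {
  private readonly answers = new Map<string, boolean>();
  private readonly query: (containerType: string) => boolean;

  // `query` stands in for the costly lookup (for example, another process).
  constructor(query: (containerType: string) => boolean) {
    this.query = query;
  }

  isSupported(containerType: string): boolean {
    // MIME types are case-insensitive, so normalize before using as a key.
    const key = containerType.trim().toLowerCase();
    let answer = this.answers.get(key);
    if (answer === undefined) {
      answer = this.query(key); // cache miss: pay the cost once
      this.answers.set(key, answer);
    }
    return answer;
  }
}
```

    Keying on container types keeps the cache small; full codec
    strings would produce many more distinct keys and therefore
    more misses.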

    chcunningham: The people who own the part of the Chrome startup
    would not allow me to hold these responses in cache. At best,
    for some calls, there will be some cache misses.

    chcunningham: If Youenn and you disagree on this point, should
    we let you align on a position with Youenn?

    jernoble: I was hoping to convince everyone on this call.

    cpn: From a developer's point of view, we have a number of APIs
    that do similar types of queries and that are all synchronous.
    Is consistency a good argument?

    chcunningham: It's a mixed bag. We have both sync and async in
    practice.

    cpn: Wondering about next steps.

    chcunningham: This conversation is a little bit less stuck and
    less high priority.
    … Fine with giving Jer time to exchange with Youenn and come
    back to the group.

   Next meeting

    cpn: Do we feel we've discussed these issues enough?

    jernoble: Would you like to see some follow-up discussion on
    the second issue in particular, chcunningham?

    chcunningham: We'll go back to would-be users of this feature
    and perhaps call for another discussion afterwards.

    cpn: OK, we're available to organize a meeting if and as
    needed.

Summary of action items

     1. [15]chcunningham to reach out to PING regarding hardware
        acceleration

Received on Tuesday, 15 June 2021 23:22:39 UTC