Re: [media-and-entertainment] Frame accurate seeking of HTML5 MediaElement (#4)

A different, but related, issue is the accuracy of audio *splicing* with MSE.
For both seeking and splicing, assuming you have audio timestamps in the media
(which I think you usually do), the transition happens at the start of a coded
audio frame. MSE is explicit about this for splicing, with a cross-fade
specified to happen in the first 5ms of the first coded frame of the new
(spliced) stream. It would be good for both seek and splice to be able to
happen at a precise time, rather than only at a coded frame boundary.
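
To put a rough number on that, here is a minimal sketch of how far a requested
time can sit from the start of the coded frame that contains it. The function
name `coded_frame_start` is hypothetical, and the sketch assumes a constant
frame size (1152 samples per frame for MPEG-1 Layer III, 1024 for AAC-LC) and
ignores encoder delay/priming, so treat it as an illustration only.

```python
from fractions import Fraction
import math


def coded_frame_start(target_s, sample_rate, samples_per_frame):
    """Return (frame_index, frame_start_seconds) for the coded frame
    that contains the non-negative time target_s.

    Assumes every coded frame holds the same number of samples and that
    the first frame starts at time zero (no encoder delay).
    """
    # Convert the requested time to a whole sample index.
    target_sample = math.floor(Fraction(target_s) * sample_rate)
    # Integer division finds which coded frame that sample falls in.
    index = target_sample // samples_per_frame
    # Exact start time of that frame, kept as a fraction to avoid rounding.
    start = Fraction(index * samples_per_frame, sample_rate)
    return index, start


if __name__ == "__main__":
    # A 1 s target in 44.1 kHz MP3 (1152 samples per frame, assumed).
    index, start = coded_frame_start(1, 44100, 1152)
    error_ms = float((1 - start) * 1000)
    print(index, float(start), error_ms)
```

With those assumed numbers, a seek or splice that snaps to the frame start can
land up to one frame duration (about 26ms for 1152 samples at 44.1 kHz) before
the requested time.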

...Mark

On Wed, May 13, 2020 at 5:05 AM Nigel Megitt via GitHub <sysbot+gh@w3.org>
wrote:

> This issue was originally raised for the general HTML media element, and
> the discussion has mainly been about video elements. I have just come upon
> another use case, for audio elements. Setting `currentTime` on an audio
> element whose resource is a WAV file works well. However, when the resource
> is an MP3 file the accuracy is very poor (I checked on Chrome and Firefox).
>
> I'm pretty sure the cause is something that occurs in general with
> compressed media, either audio or video: depending on the file format, it
> can be complex to work out where in the compressed media to seek to in
> order to get to an arbitrary desired point. I guess some kind of heuristic
> is generally used.
>
> When there are no timestamps within the compressed media, that's even
> harder, and of course such timestamps would reduce the efficiency of the
> compression. Effectively the only way to do it reliably is to decode the
> entire media, which might be very long, and generate a map that connects
> audio sample count to file location.
>
> Clearly doing that would be a costly operation, in general. Nevertheless,
> perhaps there is some processing that can be done to try to improve the
> heuristics, without doing a full decode? An API call to pre-process the
> media to generate such a map could provide an opt-in for applications that
> need it, without imposing it on those applications that do not need it.
>
> MDN doesn't really say anything about the seek accuracy of audio codecs at
> https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Audio_codecs
> and it looks like the HTMLMediaElement interface itself doesn't offer this
> kind of accurate seek preparation. There is perhaps an analogy with the
> `preload` attribute, which hints at how much data to load, but it is
> clearly a different thing.
>
> --
> GitHub Notification of comment by nigelmegitt
> Please view or discuss this issue at
> https://github.com/w3c/media-and-entertainment/issues/4#issuecomment-627936214
> using your GitHub account

Received on Wednesday, 13 May 2020 15:38:33 UTC