Re: [whatwg] Memory management problem of video elements

On 2014-08-19 20:23, Philip Jägenstedt wrote:
> On Tue, Aug 19, 2014 at 11:56 AM, duanyao <duanyao@ustc.edu> wrote:
>> On 2014-08-19 16:00, Philip Jägenstedt wrote:
>>
>>> On Tue, Aug 19, 2014 at 9:12 AM, duanyao <duanyao@ustc.edu> wrote:
>>>> Hi,
>>>>
>>>> I recently investigated the memory usage of HTML video elements in
>>>> several desktop browsers (Firefox and Chrome on Windows and Linux, and
>>>> IE 11) and found some disappointing results:
>>>>
>>>> 1. A video element in a playable state consumes a significant amount of
>>>> memory. Each playing, paused, or preload=auto video element uses up to
>>>> 30~80MB; those with preload=metadata use 6~13MB; for those with
>>>> preload=none, memory usage is negligible. These numbers were measured
>>>> with 720p to 1080p H.264 videos; videos in lower resolutions use less
>>>> memory.
>>>>
>>>> 2. For a page with multiple video elements, memory usage scales
>>>> linearly. So a page with tens of videos can exhaust the memory space of
>>>> a 32-bit browser. In my tests, such a page may crash the browser or
>>>> freeze a low-memory system.
>>>>
>>>> 3. Even if a video element is or becomes invisible, whether by being
>>>> out of the viewport, having a display:none style, or being removed from
>>>> the active DOM tree (but not released), almost the same amount of
>>>> memory is still occupied.
>>>>
>>>> 4. The available ways to reduce the memory occupied by video elements
>>>> require script, and the element must be modified, for example by
>>>> removing and releasing it.
>>>>
>>>> Although this looks like an implementors' problem rather than a spec
>>>> problem, I think the current spec encourages implementors to push the
>>>> responsibility for memory management of media elements onto authors,
>>>> which is very bad. See section 4.8.14.18
>>>>
>>>> (http://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content.html#best-practices-for-authors-using-media-elements):
>>>>
>>>>> 4.8.14.18 Best practices for authors using media elements
>>>>> it is a good practice to release resources held by media elements when
>>>>> they are done playing, either by being very careful about removing all
>>>>> references to the element and allowing it to be garbage collected, or,
>>>>> even better, by removing the element's src attribute and any source
>>>>> element descendants, and invoking the element's load() method.
>>>>
>>>> Why is this BAD, in my opinion?
>>>>
>>>> 1. It requires script. What if the UA doesn't support script or has it
>>>> disabled (email reader, epub reader, etc.), or the script simply fails
>>>> to download? What if users insert many video elements into a page hosted
>>>> by a site that is not aware of this problem (so no video management
>>>> script is available)? Users' browsers may crash, or their systems may
>>>> freeze, for no obvious reason.
>>>>
>>>> 2. It is hard to make the script correct. Authors can't simply depend on
>>>> "done playing", because users may simply pause a video in the middle and
>>>> start playing another one, and then resume the first one. So authors
>>>> have to determine which video is out of viewport, and remove its src,
>>>> and record its currentTime; when it comes back to viewport, set src and
>>>> seek to the previous currentTime. This is quite complicated. For
>>>> WYSIWYG HTML editors based on browsers, it is even more complicated
>>>> because of the interaction with the undo manager.
>>>>
>>>> 3. Browsers are in a much better position to get memory management
>>>> right. A browser should be able to reclaim most of the memory of an
>>>> invisible video by keeping only its state (optionally with the current
>>>> frame), and to limit the total amount of memory used by media elements.
>>>>
>>>> So I think the spec should remove section 4.8.14.18 and instead stress
>>>> the UA's responsibility for memory management of media elements.
>>> What concrete advice should the spec give to UAs on memory management?
>>> If a script creates a thousand media elements and seeks those to a
>>> thousand different offsets, what is a browser to do? It looks like a
>>> game preparing a lot of sound effects with the expectation that they
>>> will be ready to go, so which ones should be thrown out?
>>
>> A UA can limit the number of simultaneously playing media elements
>> according to available memory or user preference, and fire error events
>> on media elements when the limit is hit. We may need another error code;
>> currently some UAs fire MEDIA_ERR_DECODE, which is misleading.
> Opera 12.16 using Presto had such a limit to avoid address space
> exhaustion on 32-bit machines, limiting the number of concurrent media
> pipelines to 200. However, when the limit was reached it just acted as
> if the network was stalling while waiting for an existing pipeline to
> be destroyed.
>
> It wasn't a great model, but if multiple browsers (want to) impose
> limits like this, maybe a way for script to tell the difference would
> be useful.

I think it would be even better for the UA to play the media element
that the user or script most recently tried to play, and to drop the
pipelines of elements that are paused and/or invisible.
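
A minimal sketch of that policy, written as a pure model with no DOM
dependency. Everything here is an assumption made for illustration:
PipelineBudget, MediaState, and the ranking rule are hypothetical names
and choices, not taken from any browser or from the spec. The most
recent play request always gets a pipeline; victims are chosen among
paused and/or invisible elements first, oldest play request first.

```typescript
// Hypothetical model of a UA-side pipeline budget (illustration only).
interface MediaState {
  id: string;
  playing: boolean;
  visible: boolean;
  lastPlayRequest: number; // logical clock value of the latest play request
}

class PipelineBudget {
  private live = new Map<string, MediaState>();
  private clock = 0;
  private maxPipelines: number;

  constructor(maxPipelines: number) {
    if (maxPipelines < 1) throw new RangeError("maxPipelines must be >= 1");
    this.maxPipelines = maxPipelines;
  }

  // Called when the user or a script asks to play `id`.
  // Returns the ids whose pipelines were dropped to make room.
  requestPlay(id: string, visible: boolean): string[] {
    const dropped: string[] = [];
    if (!this.live.has(id)) {
      while (this.live.size >= this.maxPipelines) {
        const victim = this.pickVictim();
        if (victim === undefined) break;
        this.live.delete(victim);
        dropped.push(victim);
      }
    }
    this.live.set(id, { id, playing: true, visible, lastPlayRequest: ++this.clock });
    return dropped;
  }

  pause(id: string): void {
    const s = this.live.get(id);
    if (s) s.playing = false;
  }

  setVisible(id: string, visible: boolean): void {
    const s = this.live.get(id);
    if (s) s.visible = visible;
  }

  hasPipeline(id: string): boolean {
    return this.live.has(id);
  }

  // Lower rank is evicted first: paused and invisible (0), paused or
  // invisible (1), playing and visible (2). Ties go to the oldest request.
  private pickVictim(): string | undefined {
    const rank = (s: MediaState) => (s.playing ? 1 : 0) + (s.visible ? 1 : 0);
    let victim: MediaState | undefined;
    for (const s of this.live.values()) {
      if (
        victim === undefined ||
        rank(s) < rank(victim) ||
        (rank(s) === rank(victim) && s.lastPlayRequest < victim.lastPlayRequest)
      ) {
        victim = s;
      }
    }
    return victim?.id;
  }
}
```

With a budget of two pipelines, playing "a" and "b", pausing "a", and
then playing "c" drops only "a". A dropped element would keep its URL
and position so that it can be resumed later.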

P.S. I forgot to mention that UAs that fire a MEDIA_ERR_DECODE error for
an out-of-memory condition also show a "decode error" message in the UI
of video elements, which confuses users too.
>> If the thousand media elements are only sought, not played, the UA can
>> seek them one by one and drop the cached frames afterwards, keeping only
>> the current frames; if memory is even more limited, the current frames
>> can be dropped too.
>>
>> For HTML-based slideshows or textbooks, it is quite possible to have tens
>> of videos in one HTML file.
>>
>> For audio elements, I think it is less problematic because they usually use
>> far less memory than videos.
>>
>>> A media element in an active document never gets into a state where it
>>> could never start playing again, so I don't know what to do other than
>>> trying to use less memory per media element.
>> What do you mean by "a state where it could never start playing again"?
> I mean a state where no user action or script could cause the media
> element to show any frames or play any audio again, i.e. a state where
> it would be entirely safe to shut down the media pipeline and
> otherwise free resources. Other than that state, only currentTime=0
> and currentTime=duration are really simple to recreate.
>
>> If the media element object keeps track of its current playing url and
>> current position (this requires little memory), and the media file is
>> seekable, then
>> the media is always resumable. UA can drop any other associated memory of
>> the media element, and users will not notice any difference except a small
>> delay
>> when they resume playing.
> That small delay is a problem, at least when it comes to audio
> elements used for sound effects. For video elements, there's the
> additional problem that getting back to the same state will require
> decoding video from the previous keyframe, which could take several
> seconds of CPU time.
>
> Of course, anything is better than crashing, but tearing down a media
> pipeline and recreating it in the exact same state is quite difficult,
> which is probably why nobody has tried it, AFAIK.
The UA can pre-create the media pipeline based on hints, e.g. when the
video element is about to become visible, so that the delay is
minimized.

Media elements already have a load() method; could it be extended to
instruct the UA to recreate the media pipeline? A script could then
reduce the delay when it knows the media is about to be played.
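
One possible form of the "becoming visible" hint, as a rough sketch:
treat an element as worth pre-creating a pipeline for once it comes
within some margin of the viewport. Box, shouldPrewarm, and the margin
value are hypothetical and only illustrate the idea; a page script could
get a similar signal from IntersectionObserver with a rootMargin.

```typescript
// Vertical extent in CSS pixels, in document coordinates (assumed model).
interface Box {
  top: number;
  bottom: number;
}

// True when `element` overlaps the viewport extended by `margin` pixels
// above and below, i.e. it is visible or about to scroll into view.
function shouldPrewarm(element: Box, viewport: Box, margin: number): boolean {
  return (
    element.bottom >= viewport.top - margin &&
    element.top <= viewport.bottom + margin
  );
}
```

A larger margin hides more of the re-creation delay at the cost of
keeping more pipelines alive at once.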

Audio elements usually use much less memory, so UAs may apply a
different strategy to them.

Many native media players save the playback position on exit and resume
from that position on the next run. Most users are satisfied with this
feature. Is recovering the "exact same state" important to some web
applications?

I'm not familiar with game programming. Are sound effects small audio
files that are usually played as a whole? If so, it should be safe to
recreate the pipeline.
>>> Have you filed bugs at
>>> the browsers that crash or freeze the system?
>> Yes, for firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1054170
> Thanks, let's hope the crash is fixed at the very least.
>
>>> Regardless of what the UA does, section 4.8.14.18 is still good advice
>>> when the script knows that a resource won't be needed but the browser
>>> cannot. Example: a sound effect is played for last time in a game as
>>> the last secret in the level is found.
>> I think UAs can manage this situation quite well if an LRU mechanism is
>> used to reclaim the memory associated with media elements.
>> In some games, which sound effect is played is determined by user
>> actions, so game developers have to implement their own LRU
>> if UAs don't provide one.
>>
>> If a page uses many images, it may also consume a lot of memory. Does the
>> spec want to add similar advice for the img element? I don't think so.
> OK. See https://www.w3.org/Bugs/Public/show_bug.cgi?id=11243 for how
> that advice landed in the spec.

Even if the spec really wants authors to manage media elements' memory,
I think it should provide a more convenient method, e.g. unload() or
release(), that does not require authors to modify the element.
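
For comparison, a sketch of what authors have to write today to follow
the spec's advice while still being able to resume: remove src, call
load(), remember the position, and seek back after reloading.
releaseMedia, restoreMedia, SavedMedia, and MediaLike are hypothetical
names; MediaLike only lists the members used here, shaped so that a real
video element should satisfy it. The sketch assumes a seekable resource
set through the src attribute and ignores source child elements.

```typescript
// The subset of a media element this sketch relies on (assumed shape).
interface MediaLike {
  src: string;
  currentTime: number;
  removeAttribute(name: string): void;
  load(): void;
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// What has to be remembered to resume later: the URL and the position.
interface SavedMedia {
  src: string;
  position: number;
}

// Follow the spec's advice: remove src and call load() so that the UA
// can free the resources held by the element.
function releaseMedia(el: MediaLike): SavedMedia {
  const saved: SavedMedia = { src: el.src, position: el.currentTime };
  el.removeAttribute("src");
  el.load();
  return saved;
}

// Re-attach the resource and seek back once metadata is available;
// seeking before that point has no media timeline to seek in.
function restoreMedia(el: MediaLike, saved: SavedMedia): void {
  el.addEventListener(
    "loadedmetadata",
    () => {
      el.currentTime = saved.position;
    },
    { once: true }
  );
  el.src = saved.src;
}
```

Note that both helpers modify the element, which is exactly the
bookkeeping that a built-in unload() or release() would avoid.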
>
> TL;DR: There is a problem but I don't know how to fix it well or what
> the spec could meaningfully say about it.
>
> Philip
Duan Yao

Received on Tuesday, 19 August 2014 13:56:18 UTC