[whatwg] Memory management problem of video elements

Hi,

I recently investigated the memory usage of HTML video elements in
several desktop browsers (Firefox and Chrome on Windows and Linux, and
IE 11), and found some disappointing results:

1. A video element in a playable state consumes a significant amount of
memory. Each playing, paused, or preload=auto video element uses
roughly 30 to 80 MB; with preload=metadata it uses 6 to 13 MB; with
preload=none the usage is negligible. These numbers were measured with
720p to 1080p H.264 videos; lower-resolution videos use less memory.

2. For a page with multiple video elements, memory usage scales
linearly, so a page with tens of videos can exhaust the address space
of a 32-bit browser. In my tests, such a page could crash the browser
or freeze a low-memory system.

3. Even if a video element is or becomes invisible, whether by being
outside the viewport, having display:none style, or being removed from
the active DOM tree (but not released), it still occupies almost the
same amount of memory.

4. The available ways to reduce the memory occupied by video elements
require script, and the element must be modified, for example by
removing and releasing it.
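
The figures in items 1 and 2 can be turned into a rough estimate. The
sketch below is only back-of-the-envelope arithmetic: the 2048 MB limit
is an assumption about the usable address space of a 32-bit process
(not something I measured), the function names are made up for
illustration, and all other memory used by the page is ignored.

```typescript
// Per-element figures reported above, in MB (720p to 1080p H.264).
const PER_VIDEO_MB = { low: 30, high: 80 };

// ASSUMPTION: about 2 GiB of usable address space for a 32-bit process.
const ADDRESS_SPACE_MB = 2048;

// Linear scaling: total memory is the element count times the per-element cost.
function estimateVideoMemoryMB(count: number): { low: number; high: number } {
  return { low: count * PER_VIDEO_MB.low, high: count * PER_VIDEO_MB.high };
}

// Smallest number of playable videos whose total exceeds the given limit.
function videosToExhaust(limitMB: number, perVideoMB: number): number {
  return Math.floor(limitMB / perVideoMB) + 1;
}

// Under the assumed limit, somewhere between these two counts is enough.
const fewest = videosToExhaust(ADDRESS_SPACE_MB, PER_VIDEO_MB.high);
const most = videosToExhaust(ADDRESS_SPACE_MB, PER_VIDEO_MB.low);
```

Under that assumption the threshold falls in the range of a few tens of
playable videos, which matches what I saw in practice.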

Although this looks like an implementors' problem rather than a spec
problem, I think the current spec encourages implementors to push the
responsibility for memory management of media elements onto authors,
which is very bad. See section 4.8.14.18
(http://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content.html#best-practices-for-authors-using-media-elements):

> 4.8.14.18 Best practices for authors using media elements
> it is a good practice to release resources held by media elements when
> they are done playing, either by being very careful about removing all
> references to the element and allowing it to be garbage collected, or,
> even better, by removing the element's src attribute and any source
> element descendants, and invoking the element's load() method.
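
For reference, the "even better" option in the quoted text amounts to
something like the following. This is a minimal sketch, not spec text:
releaseMedia is a hypothetical helper name, and the small interfaces
exist only so that the example is self-contained (a real
HTMLVideoElement has all of these members).

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface RemovableNode {
  remove(): void;
}

interface ReleasableMedia {
  removeAttribute(name: string): void;
  querySelectorAll(selector: string): ArrayLike<RemovableNode>;
  load(): void;
}

// Hypothetical helper following the quoted best practice.
function releaseMedia(media: ReleasableMedia): void {
  // 1. Remove the element's src attribute.
  media.removeAttribute("src");

  // 2. Remove any <source> descendants (copy the list before mutating).
  const sources = Array.from(media.querySelectorAll("source"));
  for (const source of sources) {
    source.remove();
  }

  // 3. Invoke load() so the element can let go of its media resource.
  media.load();
}
```

Note that every step here is script acting on the element, which is
exactly the dependency I object to below.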

Why is this BAD, in my opinion?

1. It requires script. What if the UA doesn't support script or has it
disabled (email reader, EPUB reader, etc.), or the script simply fails
to download? What if users insert many video elements into a page
hosted by a site that is not aware of this problem (so no video
management script is available)? Users' browsers may crash, or their
systems may freeze, for no obvious reason.

2. It is hard to get the script right. Authors can't simply rely on
"done playing", because users may pause a video in the middle, start
playing another one, and then resume the first one. So authors have to
determine which video is outside the viewport, record its currentTime,
and remove its src; when it comes back into the viewport, they have to
set src again and seek to the previous currentTime. This is quite
complicated. For browser-based WYSIWYG HTML editors it is even more
complicated, because of the interaction with the undo manager.

3. Browsers are in a much better position to get memory management
right. A browser should be able to save most of the memory of an
invisible video by keeping only its state (perhaps with the current
frame), and to limit the total amount of memory used by media elements.
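
To illustrate point 2, here is a minimal sketch of the bookkeeping an
author would need. Everything in it is hypothetical: suspendMedia and
resumeMedia are made-up names, it only handles the src attribute (not
source children), it does not restore the paused/playing state, and it
leaves out viewport detection and any undo-manager interaction
entirely. It also assumes the browser honors a currentTime set before
the metadata has loaded; otherwise the seek would have to wait for the
loadedmetadata event.

```typescript
// Minimal structural type; a real HTMLVideoElement has these members.
interface SuspendableMedia {
  currentTime: number;
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
  removeAttribute(name: string): void;
  load(): void;
}

interface SavedState {
  src: string;
  time: number;
}

// WeakMap, so that remembering the state does not itself keep elements alive.
const savedStates = new WeakMap<SuspendableMedia, SavedState>();

// Call when the video leaves the viewport (detection not shown).
function suspendMedia(media: SuspendableMedia): void {
  const src = media.getAttribute("src");
  if (src === null || savedStates.has(media)) {
    return; // nothing to release, or already suspended
  }
  // Record the position BEFORE dropping the resource.
  savedStates.set(media, { src, time: media.currentTime });
  media.removeAttribute("src");
  media.load();
}

// Call when the video comes back into the viewport.
function resumeMedia(media: SuspendableMedia): void {
  const state = savedStates.get(media);
  if (state === undefined) {
    return; // was never suspended
  }
  savedStates.delete(media);
  media.setAttribute("src", state.src); // setting src starts a new load
  media.currentTime = state.time;       // ASSUMPTION: see the note above
}
```

Even this simplified version has to refetch and re-decode the video on
every resume, which is work the browser could avoid if it managed the
memory itself.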

So I think the spec should remove section 4.8.14.18, and instead stress
the responsibility of the UA for memory management of media elements.

Regards,
Duan Yao.

Received on Tuesday, 19 August 2014 07:13:21 UTC