[whatwg] A standard for adaptive HTTP streaming for media resources

Hello all,

> I would like to raise an issue that has come up multiple times before,
> but hasn't ever really been addressed properly.

Silvia, thanks for mentioning this issue.

> We've in the past talked about how there is a need to adapt the
> bitrate version of an audio or video resource that is being delivered
> to a user agent based on the available bandwidth on the network, the
> available CPU cycles, and possibly other conditions.

Indeed. One such key condition is the current size of the video window. Tracking it allows user-agents to:

*) Avoid wasting bandwidth, e.g. by pushing a 720p video into a 320x180 video tag.
*) Respond to changes in the video display, e.g. when the video is switched to fullscreen playback.
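
To make these two points a bit more concrete, here is a rough sketch (in Python, purely for illustration) of how a user-agent might pick the smallest version that still covers the current display height. The rendition list and the name `pick_for_display` are made up; nothing like this is specified anywhere.

```python
# Hypothetical renditions as (width, height, bitrate in kbps); values are illustrative.
RENDITIONS = [(160, 120, 100), (320, 240, 500), (720, 540, 900)]

def pick_for_display(display_height, renditions=RENDITIONS):
    """Return the smallest rendition at least as tall as the display,
    falling back to the largest one when the display is taller than all."""
    by_height = sorted(renditions, key=lambda r: r[1])
    for rendition in by_height:
        if rendition[1] >= display_height:
            return rendition
    return by_height[-1]

print(pick_for_display(180))   # small embedded player
print(pick_for_display(1080))  # fullscreen on a large display
```

Calling the function again whenever the video element is resized or switched to fullscreen would cover the second point.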

> It has been discussed to do this using @media queries and providing
> links to alternative versions of a media resource through the
> <source> element inside it. But this is a very inflexible solution,
> since the side conditions for choosing a bitrate version may change
> over time and what is good at the beginning of video playback may not
> be good 2 minutes later (in particular if you're on a mobile device
> driving through town).

Providing the different media options using <source> elements might still work out fine, if there's a clearly defined API that covers all scenarios. A rough example:

<video>
  <source bitrate="100" height="120" src="video_100.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="160">
  <source bitrate="500" height="240" src="video_500.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="320">
  <source bitrate="900" height="540" src="video_900.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="720">
</video>

This example would tell the user-agent that each of the three MP4 files has a keyframe interval of 2 seconds, which of course means that fixed keyframe intervals would be required.

The user-agent could then use e.g. Media Fragments URIs to request chunks, switching between sources as conditions change.
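
A minimal sketch of that idea, again in Python and only as an illustration: chunks span a whole number of keyframe intervals, the source is chosen from the last bandwidth measurement, and each chunk is addressed with a temporal fragment of the form `#t=start,end`. The function names, the chunk length of five keyframe intervals, and the bandwidth numbers are all assumptions, and how the user-agent turns such a fragment into an actual HTTP request is left open here.

```python
# Sources mirror the <source> example above: (bitrate in kbps, file name).
SOURCES = [(100, "video_100.mp4"), (500, "video_500.mp4"), (900, "video_900.mp4")]
KEYFRAME_INTERVAL = 2  # seconds, shared by all sources so switches land on keyframes

def pick_source(bandwidth_kbps, sources=SOURCES):
    """Highest-bitrate source that fits the measured bandwidth, else the lowest."""
    fitting = [s for s in sources if s[0] <= bandwidth_kbps]
    return max(fitting) if fitting else min(sources)

def chunk_uri(chunk_index, bandwidth_kbps, chunk_keyframes=5):
    """Build a temporal fragment URI (#t=start,end) for one chunk.
    Each chunk spans a whole number of keyframe intervals."""
    length = chunk_keyframes * KEYFRAME_INTERVAL
    start = chunk_index * length
    _, src = pick_source(bandwidth_kbps)
    return f"{src}#t={start},{start + length}"

# Bandwidth measurements (kbps) taken before each chunk request; made-up numbers.
for i, bw in enumerate([1200, 650, 80]):
    print(chunk_uri(i, bw))
```

Because every chunk boundary is a multiple of the keyframe interval, a switch between files never has to start decoding in the middle of a group of pictures.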

> Further, we have discussed the need for supporting a live streaming
> approach such as RTP/RTSP - but RTP/RTSP has its own "non-Web" issues
> that will make it difficult to make it part of a Web application
> framework - in particular it requires a custom server and won't just
> work with an HTTP server.
> 
> In recent times, vendors have indeed started moving away from custom
> protocols and custom servers and have moved towards more intelligence
> in the UA and special approaches to streaming over HTTP.
> 
> Microsoft developed "Smooth Streaming" [1], Apple developed "HTTP Live
> Streaming" [2] and Adobe recently launched "HTTP Dynamic Streaming"
> [3]. (Also see a comparison at [4]). As these vendors are working on
> it for MPEG files, so are some people for Ogg. I'm not aware anyone is
> looking at it for WebM yet.

Apparently, there are already working setups:

http://www.flumotion.com/demosite/webm/

> Standards bodies haven't held back either. The 3GPP organisation have
> defined 3GPP adaptive HTTP Streaming (AHS) in their March 2010 release
> 9 of  3GPP [5]. Now, MPEG has started consolidating approaches for
> adaptive bitrate streaming over HTTP for MPEG file formats [6].
> 
> Adaptive bitrate streaming over HTTP is the correct approach towards
> solving the double issues of adapting to dynamic bandwidth
> availability, and of providing a live streaming approach that is
> reliable.

I would also add the use cases of adapting to screen real estate (e.g. fullscreen) and decoding power (netbooks, phones).

Additionally, adaptive bitrate streaming is a great approach for delivering long-form content (>10 minutes). It can simultaneously reduce metadata loading times and reduce the amount of content that is delivered to the user-agent but never watched (e.g. downloading a 10-minute video of which only 20 seconds get watched).
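
A back-of-the-envelope calculation of that last case, as a small Python sketch. The 900 kbps bitrate and the 10-second chunk length are assumptions, and buffering ahead and container overhead are ignored.

```python
def kilobytes(bitrate_kbps, seconds):
    """Approximate payload size for a constant-bitrate stream (kilobits / 8)."""
    return bitrate_kbps * seconds / 8

def delivered_seconds(watched_seconds, chunk_seconds):
    """Seconds delivered when media is only fetched in whole chunks (ceiling)."""
    chunks = -(-watched_seconds // chunk_seconds)
    return chunks * chunk_seconds

full = kilobytes(900, 10 * 60)                       # whole 10-minute file
chunked = kilobytes(900, delivered_seconds(20, 10))  # only the chunks covering 20 s
print(full, chunked)  # 67500.0 2250.0
```

Under these assumptions the chunked delivery moves roughly 2 MB instead of roughly 67 MB for a viewer who leaves after 20 seconds.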

> Right now, no standard exists that has been proven to work in a
> format-independent way. This is particularly an issue for HTML5, where
> we want at least support for MPEG4, Ogg Theora/Vorbis, and WebM.

One might consider Apple's MPEG-TS approach as well, though it adds yet another container.

I wonder why Apple did not choose MP4 fragments for their HTTP Live Streaming?

> I know that it is not difficult to solve this issue in a
> format-independent way, which is why solutions are jumping up
> everywhere. They are, however, not compatible and create a messy
> environment where people have to install solutions for multiple
> different approaches to make sure they are covered for different
> platforms, different devices, and different formats. It's a clear
> situation where a new standard is necessary.
> 
> The standard basically needs to provide three different things:
> * authoring of content in a specific way
> * description of the alternative files on the server and their
> features for the UA to download and use for switching
> * a means to easily switch mid-way between these alternative files
> 
> I am personally not sure which is the right forum to create the new
> standard in, but I know that we have a need for it in HTML5.

Agreed. 

As currently specified, HTML5 video is mostly suited to displaying short clips.

High-quality, long-form and live content needs an additional level of functionality, which adaptive HTTP streaming seems to provide.

> Would it be possible / the right way to start something like this as
> part of the Web applications work at WHATWG?
> (Incidentally, I've brought this up in W3C before and not got any
> replies, so I'm not sure W3C would be a better place for this work.
> Maybe IETF? But then, why not here...)
> 
> What do people think?
> 
> Cheers,
> Silvia.

Kind regards,

Jeroen Wijering

Received on Friday, 28 May 2010 04:57:55 UTC