Re: HTML5 Last Call May 2011 & DASH/Adaptive Streaming

On Feb 15, 2011, at 1:37 PM, Bob Lund wrote:

There are several “features” that have been referenced that are really orthogonal to each other.

1)    Exposing alternate media and text tracks for “display”. The multi-track media API discussion would cover these in a media format independent way. As noted, a proposal is due by 2/21.
2)    Exposing timed text data (or metadata) to JavaScript. This is done in HTML5 with “Timed Text Tracks”.
3)    Various playback controls, e.g. fast forward, rewind, seek. The current <video> controls appear to be adequate in this regard.
4)    Access to DASH (or other adaptive bit-rate format) tracks. Here we need to distinguish between tracks of the same media at different bit-rates and additional (to the media) tracks. The latter are addressed in 1) and 2) above. The former, access to the different bit-rates, is not exposed by any adaptive bit-rate player that I’m aware of. The player determines the optimal playback rate and uses the URN in the manifest file to retrieve content at that rate. At least one service provider has noted it would be useful to have JavaScript control over the delivery bit-rate. This could be accomplished by exposing the different bit-rates as tracks via 1) above. Each track could be labeled by bit-rate, and no URN information would have to be exposed.

I'd strongly suggest not using the same "tracks" mechanism to control adaptivity.

The idea of JavaScript control over adaptivity was also discussed on the FOMS list a while ago. In principle this would be great, enabling experimentation, customization, etc., but in practice it is much more complex than it at first seems.

If you take the naive approach and expose some measure of the current data arrival rate and the "bitrate" of each version, then you are essentially constrained to one algorithm (compare incoming data rate with bitrate of each version) and you have very little scope for experimentation. This undermines the point of exposing the adaptivity in the first place.
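
To make the point concrete, here is a minimal sketch of the one algorithm the naive approach allows. The `Version` shape, the field names and the example ladder are assumptions for illustration only; nothing here is taken from DASH or from any HTML proposal.

```typescript
// Hypothetical shape: one entry per encoded version of the same media.
interface Version {
  label: string;
  bitrateKbps: number; // a single nominal number, itself a simplification
}

// Naive selection: pick the highest-bitrate version that fits within the
// measured arrival rate, falling back to the lowest version otherwise.
function pickVersion(versions: Version[], arrivalRateKbps: number): Version {
  if (versions.length === 0) {
    throw new Error("no versions available");
  }
  // Sort a copy in ascending bitrate order so the input is not mutated.
  const sorted = versions.slice().sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  let choice = sorted[0];
  for (const v of sorted) {
    if (v.bitrateKbps <= arrivalRateKbps) {
      choice = v;
    }
  }
  return choice;
}

// Illustrative ladder; the numbers are made up.
const ladder: Version[] = [
  { label: "low", bitrateKbps: 500 },
  { label: "mid", bitrateKbps: 1500 },
  { label: "high", bitrateKbps: 3000 },
];
```

With only these two inputs there is little left for a script author to vary beyond a safety margin on the comparison.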

Both the data arrival rate and the bitrate of each version are much more complex beasts than a single number: if you give a single number, is it an average or a peak, and over what kind of time window? The current state (amount of buffered data for each stream) is also important. Respecting internal buffer limits is important too. Bitrate switching decisions need to take into account the available switch points, for example to trade off the amount of data to be received to get to the next switch point vs the amount that would be discarded if I switched at the previous one.
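
As a rough illustration of why "a single number" is ambiguous, the sketch below computes an average and a peak arrival rate over different windows of the same measurements. The `Sample` shape and the sample values are hypothetical, and the window size is assumed to be a positive integer.

```typescript
// Hypothetical sample: bytes received during one measurement interval.
interface Sample {
  bytes: number;
  durationMs: number;
}

// Rate of one sample in kilobits per second (bits per millisecond equals kbps).
function rateKbps(s: Sample): number {
  return (s.bytes * 8) / s.durationMs;
}

// Average rate over the most recent `windowSize` samples (windowSize > 0).
function averageKbps(samples: Sample[], windowSize: number): number {
  const recent = samples.slice(-windowSize);
  const bytes = recent.reduce((sum, s) => sum + s.bytes, 0);
  const ms = recent.reduce((sum, s) => sum + s.durationMs, 0);
  return ms === 0 ? 0 : (bytes * 8) / ms;
}

// Peak per-sample rate over the same window (windowSize > 0).
function peakKbps(samples: Sample[], windowSize: number): number {
  const recent = samples.slice(-windowSize);
  return recent.reduce((max, s) => Math.max(max, rateKbps(s)), 0);
}

// Four one-second measurements: 1000, 4000, 1000 and 2000 kbps.
const samples: Sample[] = [
  { bytes: 125000, durationMs: 1000 },
  { bytes: 500000, durationMs: 1000 },
  { bytes: 125000, durationMs: 1000 },
  { bytes: 250000, durationMs: 1000 },
];
```

For these samples, a four-sample window gives an average of 2000 kbps and a peak of 4000 kbps, while a two-sample window gives 1500 kbps and 2000 kbps. Any of the four could be reported as "the" arrival rate.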

Exposing all this information to JavaScript would be complex and result in a lot of chatter. It may be viable with careful API design, but it is certainly a challenge.

Another approach would be to adopt the idea that there could be selectable adaptivity algorithms (someone likened this to different TCP Congestion Control algorithms) and a way to pass them algorithm-specific tuning parameters. This would also enable some level of experimentation and customization.
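
A sketch of what "selectable algorithms plus tuning parameters" might look like follows. Everything in it is hypothetical: the algorithm names (`rate`, `buffer`), the parameter names (`safetyFactor`, `lowSeconds`, `highSeconds`), their defaults and the `PlayerState` shape are invented for illustration and do not come from any spec or player.

```typescript
// Hypothetical snapshot of what a player knows when it makes a decision.
interface PlayerState {
  throughputKbps: number;  // some estimate of the data arrival rate
  bufferedSeconds: number; // media currently buffered
  bitratesKbps: number[];  // available versions, in ascending order
}

type Params = { [key: string]: number };
// An algorithm returns the index of the version to fetch next.
type Algorithm = (state: PlayerState, params: Params) => number;

// Read a tuning parameter, using a default when the caller did not set it.
function param(params: Params, key: string, fallback: number): number {
  const value = params[key];
  return value === undefined ? fallback : value;
}

// Rate-based: highest bitrate within a safety fraction of the throughput.
const rateBased: Algorithm = (state, params) => {
  const budget = state.throughputKbps * param(params, "safetyFactor", 0.8);
  let index = 0;
  state.bitratesKbps.forEach((bitrate, i) => {
    if (bitrate <= budget) index = i;
  });
  return index;
};

// Buffer-based: map the buffer level linearly onto the bitrate ladder.
const bufferBased: Algorithm = (state, params) => {
  const low = param(params, "lowSeconds", 5);
  const high = param(params, "highSeconds", 15);
  const last = state.bitratesKbps.length - 1;
  if (state.bufferedSeconds <= low) return 0;
  if (state.bufferedSeconds >= high) return last;
  const fraction = (state.bufferedSeconds - low) / (high - low);
  return Math.floor(fraction * last);
};

// The page selects an algorithm by name and passes algorithm-specific tuning.
const algorithms = new Map<string, Algorithm>([
  ["rate", rateBased],
  ["buffer", bufferBased],
]);

function selectIndex(name: string, state: PlayerState, params: Params = {}): number {
  const algorithm = algorithms.get(name);
  if (algorithm === undefined) throw new Error("unknown algorithm: " + name);
  if (state.bitratesKbps.length === 0) throw new Error("no versions available");
  return algorithm(state, params);
}
```

In this shape the script chooses a name and a few numbers, and the measurements and switching logic stay inside the player, which avoids the chatter of exposing every internal detail.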

...Mark


The adaptive bit-rate delivery issue applies to all adaptive bit-rate formats, not just DASH. It would be good if it was solved in a general way.

Regards,
Bob Lund
CableLabs
From: public-web-and-tv-request@w3.org On Behalf Of Mark Watson
Sent: Tuesday, February 15, 2011 11:21 AM
To: Jean-Claude Dufourd
Cc: Glenn Adams; Richard Maunder; public-web-and-tv@w3.org
Subject: Re: HTML5 Last Call May 2011 & DASH/Adaptive Streaming

Discussion on handling multi-track media is already underway on both the whatwg and HTML5 lists. See for example Jeroen's post: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030454.html

I think the intention is to include a solution to this. The deadline for solution proposals is 21 February.

This would address choice of language, subtitles, views and accessibility tracks. One issue is that DASH has a flexible labeling scheme for track types based on URNs, whereas the assumption in the HTML discussions is to use a defined list of track types. Personally I think a reasonable resolution of this would be to define URNs for the HTML-specified types and leave it at that (rather than the opposite approach of persuading HTML to expose URN types from DASH).
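
A minimal sketch of that resolution, under loud assumptions: the URN prefix below is an invented placeholder in the `urn:example` namespace, and the kind names are borrowed from the HTML5 text track kinds purely for illustration, since the list for multi-track media was still being defined at the time of this thread.

```typescript
// Placeholder list of HTML-defined track kinds (illustrative only).
const HTML_TRACK_KINDS: string[] = [
  "subtitles",
  "captions",
  "descriptions",
  "chapters",
  "metadata",
];

// Hypothetical URN prefix; no such URNs have been defined.
const URN_PREFIX = "urn:example:html-track-kind:";

// HTML kind -> URN, for labeling tracks in a URN-based manifest.
function kindToUrn(kind: string): string {
  if (HTML_TRACK_KINDS.indexOf(kind) === -1) {
    throw new Error("not an HTML-defined kind: " + kind);
  }
  return URN_PREFIX + kind;
}

// URN -> HTML kind, or undefined when the URN is outside the HTML-defined set.
function urnToKind(urn: string): string | undefined {
  if (urn.indexOf(URN_PREFIX) !== 0) return undefined;
  const kind = urn.slice(URN_PREFIX.length);
  return HTML_TRACK_KINDS.indexOf(kind) === -1 ? undefined : kind;
}
```

The mapping is one-directional in spirit: HTML keeps its fixed list, and a URN that is not in the defined set is simply not exposed as a track type.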

Regarding trick modes, I am not sure what is missing? The HTML5 media element has a "playbackRate" attribute which can be used to play at different rates, forwards or backwards.
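
For example, VCR-like controls could be layered on that attribute roughly as follows. This is a sketch: `RateControllable` is a stand-in interface so the code runs without a browser (a real media element has a numeric `playbackRate` and would satisfy it), the speed steps are made up, and whether a given implementation honors every rate, negative ones in particular, is not something this thread establishes.

```typescript
// Minimal structural stand-in for the media element; only `playbackRate`,
// the attribute named above, is assumed.
interface RateControllable {
  playbackRate: number;
}

// Hypothetical fast-forward steps.
const TRICK_SPEEDS: number[] = [1, 2, 4, 8];

// Step to the next faster speed, wrapping back to normal after the fastest.
// A rate that is not in the list (for example while rewinding) resets to 1.
function fastForward(media: RateControllable): number {
  const current = TRICK_SPEEDS.indexOf(media.playbackRate);
  const next = TRICK_SPEEDS[(current + 1) % TRICK_SPEEDS.length];
  media.playbackRate = next;
  return next;
}

// Play backwards at the given speed by setting a negative rate.
function rewind(media: RateControllable, speed: number): number {
  media.playbackRate = -Math.abs(speed);
  return media.playbackRate;
}

// Return to normal forward playback.
function resumeNormal(media: RateControllable): void {
  media.playbackRate = 1;
}
```

In a page, the same functions could be passed the element returned by `document.querySelector("video")`.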

...Mark





On Feb 15, 2011, at 9:56 AM, Jean-Claude Dufourd wrote:


There is no question of including DASH technology in HTML5, just means to control DASHed media.
What some participants of the workshop defended was the inclusion of a way to deal, within HTML5, with various options offered by DASH, such as choice of bit-rate, audio, subtitles, as well as support for trick modes (a.k.a. VCR-like controls).
One possible solution is to add element/attribute syntax around the video object to allow that kind of control. Another solution is to add script APIs.
Best regards
JC

On 15/2/11 18:38 , Glenn Adams wrote:
Even if it were done today, I doubt very much they would reference it from the HTML5 spec. There just isn't a strong reason to do so. Besides, they have chosen a technology neutral position with respect to both stream media formats and transports.

Glenn Adams
On Tue, Feb 15, 2011 at 8:56 AM, Richard Maunder <rmaunder@cisco.com> wrote:
Hi,

Interesting session in Berlin last week, thanks to all involved.

While we wait for the IG process & tools to form, I was interested in the implications of the HTML5 Last Call for May, especially the window for getting any DASH baseline or other adaptive streaming requirement into the spec:

http://www.w3.org/2011/02/htmlwg-pr.html

I'm not very familiar with the W3C processes, but my reading of them suggests it would be unlikely in this round if not in the spec by May?

Any thoughts on this?

Best wishes

Richard

Legal boilerplate follows.....
Any views or opinions expressed are solely those of the author and do not necessarily represent those of Cisco.






--

JC Dufourd

Directeur d'Etudes/Professor

Groupe Multimedia/Multimedia Group

Traitement du Signal et Images/Signal and Image Processing

Telecom ParisTech, 46 rue Barrault, 75 013 Paris, France

Tel: +33145817733 - Mob: +33677843843 - Fax: +33145817144

Received on Tuesday, 15 February 2011 22:08:06 UTC