Re: [MEDIA_PIPELINE_TF] Adaptive Bit Rate Architectur from Mays, David on 2011-12-12 (public-web-and-tv@w3.org from December 2011)

From: Mays, David <David_Mays@Comcast.com>
Date: Mon, 12 Dec 2011 18:12:20 +0000
To: "public-web-and-tv@w3.org WG" <public-web-and-tv@w3.org>
Message-ID: <CB0BA9C1.B46D%David_Mays@Comcast.com>

My comments are within.

On 12/12/11 12:19 PM, "Mark Watson" <watsonm@netflix.com> wrote:

On Dec 12, 2011, at 7:58 AM, Mays, David wrote:

2. The input value here could vary depending on the underlying
representation of "quality level" within the manifest or playlist.
- In the case of HLS, each quality level is represented by BANDWIDTH=nnnnnn
- In Smooth Streaming, each <QualityLevel> element contains a Bitrate
attribute
- In MPEG-DASH, each <Representation> element has a bandwidth attribute
So for three major forms of adaptive streaming, it seems as though a
direct reference to the "bitrate" or "bandwidth" number makes sense here
as the input value.

It's meaningless to describe bitrate or bandwidth with a single number, unless you are talking about a very long-term average. Bandwidth is not like velocity, which varies continuously and therefore has a meaningful value at a single point in time: you need to specify what kind of average you are talking about (for example, over what time period).
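To make the averaging point concrete, here is a minimal sketch that is not part of the original thread. The function name and the segment sizes are made up for illustration; the only point is that the same stream yields different "bitrate" numbers depending on the averaging window.

```typescript
// Hypothetical helper: average bitrate (bits per second) of a stream,
// computed over a sliding window of `windowSegments` consecutive segments.
function windowedBitrates(
  segmentBytes: number[],
  segmentSeconds: number,
  windowSegments: number
): number[] {
  const rates: number[] = [];
  for (let i = 0; i + windowSegments <= segmentBytes.length; i++) {
    const bytes = segmentBytes
      .slice(i, i + windowSegments)
      .reduce((sum, b) => sum + b, 0);
    rates.push((bytes * 8) / (windowSegments * segmentSeconds));
  }
  return rates;
}

// Made-up VBR stream: four 2-second segments, sizes in bytes.
const sizes = [250000, 750000, 250000, 750000];
const perSegment = windowedBitrates(sizes, 2, 1); // 2-second averages
const wholeStream = windowedBitrates(sizes, 2, 4); // one 8-second average
```

Over 2-second windows this stream alternates between 1 Mbit/s and 3 Mbit/s, while the 8-second average is a single 2 Mbit/s figure, so a bare "bandwidth" number says little without its averaging period.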

In the use cases I think we're talking about here, the bandwidth/bitrate number is a proxy for something like a "quality level identifier". As you suggest below, it would be great to have the multiple providers of adaptive schemes agree to provide an identifier. DASH has one of course, and Smooth Streaming has an "index" attribute that could be used similarly.

The attributes in HLS, Smooth Streaming and DASH likely have different definitions.

We need to take a step back here and understand the requirements: why do we want scripts to have control of these things, and what kind of control do we want them to have?

Much like the other documents, I tend to agree that a use-case-based approach can help drive the requirements in a better direction. So here's a stab at what I think some of the use cases are:

1. An application developer would like to enable an end-user to select only an "HD" level (or other high quality) stream.
2. An application developer would like to enable an end-user to select a lower-quality stream to compensate for "hunting" in the heuristics.
3. An application developer would like to enable an end-user to limit their overall bandwidth usage to remain under a provider-imposed cap.
4. An application developer would like to experiment with different adaptive heuristics.
5. An application developer would like to report on CDN health/performance by getting playback statistics from widely deployed players.
6. An application developer would like to provide appropriate responses to error conditions specific to adaptive playback.

I have added these to the wiki for discussion/editing.

For example, if the requirement is to allow the service provider and/or user to limit bandwidth usage (because the user has a bandwidth cap or usage-based-charging), then the parameter should be a long-term limit on bandwidth used and what the client does to stick within that is up to the client implementation.
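As a rough illustration of what a long-term limit could mean in practice (a sketch only, not something proposed in this thread), a client could derive the average bitrate it can sustain from the remaining allowance. The function name, its parameters, and the numbers below are all assumptions.

```typescript
// Hypothetical helper: the average bitrate (bits per second) a client could
// sustain for the rest of the billing period without exceeding the cap.
function sustainableBitrate(
  capBytes: number,
  usedBytes: number,
  remainingViewingSeconds: number
): number {
  if (remainingViewingSeconds <= 0) {
    throw new RangeError("remainingViewingSeconds must be positive");
  }
  // Never report a negative budget if the cap is already exceeded.
  const remainingBytes = Math.max(0, capBytes - usedBytes);
  return (remainingBytes * 8) / remainingViewingSeconds;
}

// Made-up numbers: 100 GB cap, 64 GB used, 20 hours of viewing still expected.
const budgetBps = sustainableBitrate(100e9, 64e9, 20 * 3600);
```

With these numbers the budget works out to 4 Mbit/s on average. How the client stays under that figure (which streams it picks, and when) would remain an implementation decision.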

On the other hand, if the requirement is to allow service providers to experiment and differentiate with different adaptive streaming algorithms, we need to demonstrate that the proposed extensions actually allow that. In particular, the ability to set the stream choice based only on a report of the incoming throughput enables experimentation with approximately one algorithm (i.e. it doesn't really allow any interesting experimentation).

In my view, to meet the second requirement means that, at a minimum, the script needs to have
(1) fine-grained visibility of the network performance: bytes received at a small reporting granularity <500ms, TCP connect times, HTTP response times
(2) visibility of the available streams and their bitrate profile over time (VBR profile)
(3) visibility of the multiple possible locations at which each stream is available
(4) visibility of client buffers: received data, requested but not received data
(5) notification of the appropriate decision points (in time)
(6) control of the choice of stream and of choice of download location at those decision points
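
To give a feel for the surface area the six items above imply, here is a minimal sketch. Every type and function name in it is hypothetical and none comes from an existing or proposed API; the heuristic is a deliberately simple placeholder, not a recommended algorithm.

```typescript
// All names below are hypothetical illustrations of the six items above.
interface ThroughputSample { durationMs: number; bytes: number; } // network reports
interface StreamInfo {
  id: string;
  segmentBits: number[]; // size of each segment in bits (the VBR profile)
  locations: string[];   // URLs at which the stream is available
}
interface BufferState { receivedSeconds: number; requestedSeconds: number; }
interface DecisionPoint { segmentIndex: number; }
interface Choice { streamId: string; location: string; }

type Heuristic = (
  samples: ThroughputSample[],
  streams: StreamInfo[],
  buffer: BufferState,
  at: DecisionPoint
) => Choice;

// Placeholder heuristic: pick the largest next segment that can be downloaded
// before the received buffer drains; otherwise fall back to the smallest one.
const pickStream: Heuristic = (samples, streams, buffer, at) => {
  const bytes = samples.reduce((sum, s) => sum + s.bytes, 0);
  const ms = samples.reduce((sum, s) => sum + s.durationMs, 0);
  const bitsPerSecond = ms > 0 ? (bytes * 8 * 1000) / ms : 0;

  let best: Choice | undefined;
  let bestBits = -1;
  let smallest: Choice | undefined;
  let smallestBits = Infinity;
  for (const stream of streams) {
    const bits = stream.segmentBits[at.segmentIndex];
    const location = stream.locations[0]; // always uses the first location
    if (bits === undefined || location === undefined) continue;
    const choice: Choice = { streamId: stream.id, location };
    if (bits < smallestBits) { smallest = choice; smallestBits = bits; }
    const downloadSeconds = bitsPerSecond > 0 ? bits / bitsPerSecond : Infinity;
    if (downloadSeconds <= buffer.receivedSeconds && bits > bestBits) {
      best = choice;
      bestBits = bits;
    }
  }
  const result = best ?? smallest;
  if (result === undefined) throw new Error("no stream available at this decision point");
  return result;
};
```

Even this toy version needs the per-segment sizes, the buffer state, and a well-defined decision point, which is roughly why a throughput report alone does not leave room for much experimentation.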

Constructing a JavaScript API for this is complex. I think a more realistic approach to providing this possibility for experimentation and differentiation is approach 3.

I will note that in DASH, because of the extra complexity/flexibility of
the model, different "periods" within a piece of media may have different
"adaptation sets" each having different lists of available bandwidths, the
programming model we are discussing here could have some issues if, for
example, a period within a DASH media had a higher minimum bandwidth than
what the application specified, based on the initial set of available
quality levels.
I would say this could even happen with other forms of adaptive streaming,
in a "live linear" scenario where a broadcaster decided during part of a
stream to downgrade the overall quality of the stream (e.g. for low-cost
advertising, etc.) and the highest available quality level was lower than
the minimum specified by the application.
Overall that seems like an application concern, though it might make sense
to have some form of notification to an application that the set of
quality levels has changed.
Would there be an error case here if either the minLevel or maxLevel
that was set was "out of bounds"?
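
One possible answer to the out-of-bounds question, sketched under assumptions: the player reports the condition and falls back to the closest available level rather than failing. The names LevelSelection and applyLevelBounds are hypothetical, and the fallback policy is only one of several reasonable choices.

```typescript
// Hypothetical result type: the levels the player may use, plus a flag telling
// the application that its minLevel/maxLevel could not be honoured.
interface LevelSelection { allowed: number[]; outOfBounds: boolean; }

function applyLevelBounds(
  available: number[],
  minLevel: number,
  maxLevel: number
): LevelSelection {
  const allowed = available.filter((l) => l >= minLevel && l <= maxLevel);
  if (allowed.length > 0) return { allowed, outOfBounds: false };
  if (available.length === 0) return { allowed: [], outOfBounds: true };
  // Nothing fits: fall back to the single level closest to the requested range.
  const distance = (l: number): number => (l < minLevel ? minLevel - l : l - maxLevel);
  const closest = available.reduce((best, l) => (distance(l) < distance(best) ? l : best));
  return { allowed: [closest], outOfBounds: true };
}

// Made-up levels: the main programme, then a lower-quality advertising period.
const mainPeriod = applyLevelBounds([500000, 1500000, 3000000], 1000000, 3000000);
const adPeriod = applyLevelBounds([300000, 600000], 1000000, 3000000);
```

Here the main period keeps the 1.5 and 3 Mbit/s levels, while the ad period has nothing at or above the requested minimum, so the sketch flags it and plays the 600 kbit/s level instead of stalling.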
3. I assume you are using the DASH definition of Representation here. It
seems like there are a handful of useful notifications that are not in
this proposal yet:
These are based on definitions from MPEG-DASH, but could be mapped onto
similar concepts in other streaming schemes.
* RepresentationChanged - The adaptive heuristic algorithm has caused the
player to switch to a different representation.

This is definitely needed in approaches 1 and 2 for reporting. The event should be based on the actual change of rendered Representation, not on the decision.

Agreed.

Rather than use bitrate/bandwidth as a tag for the Representation, we should ask each adaptive streaming system to nominate an appropriate "id" field. DASH has one. For HLS one might use the playlist name. I think this makes it clear that the naming scheme is system-specific and avoids confusion about the meaning of bitrate/bandwidth fields.
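
A minimal sketch of the system-specific "id" idea, assuming the fields mentioned in this thread (the DASH Representation id, the HLS playlist name, and the Smooth Streaming index). The type and function names, and the choice to prefix each id with the system name, are assumptions made for illustration.

```typescript
// Hypothetical description of a quality level in each streaming system.
type QualityLevel =
  | { system: "dash"; representationId: string; bandwidth: number }
  | { system: "hls"; playlistUri: string; bandwidth: number }
  | { system: "smooth"; index: number; bitrate: number };

// Returns an opaque, system-specific identifier. The bandwidth/bitrate fields
// are deliberately not used, so their differing definitions never matter.
function levelId(level: QualityLevel): string {
  switch (level.system) {
    case "dash":
      return `dash:${level.representationId}`;
    case "hls":
      return `hls:${level.playlistUri}`;
    case "smooth":
      return `smooth:${level.index}`;
  }
}
```

An application would treat the returned string as opaque and only compare it for equality, for example when reporting which Representation is currently rendered.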

* AdaptationSetChanged - The media has entered a different period where
the set of Representations may differ.

This might be better called PeriodChanged.

I chose the name AdaptationSetChanged because the thing of interest is actually the fact that there is a different set of video quality levels now. The specific name PeriodChanged only makes sense within the narrow framework of DASH. I understand your point, and AdaptationSetChanged isn't perfect, but let's not bikeshed too much on the naming just yet; not until we've nailed down the use cases anyway. ;-)

Thanks,
Dave

Received on Monday, 12 December 2011 18:28:27 UTC