RE: Change proposal for issue 152 from Frank Olivier on 2011-03-08 (public-html@w3.org from March 2011)

From: Frank Olivier <Frank.Olivier@microsoft.com>
Date: Tue, 8 Mar 2011 01:20:41 +0000
To: Philip Jägenstedt <philipj@opera.com>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <91175A762AB48840AF1473514B26B47519F93946@TK5EX14MBXC102.redmond.corp.microsoft.>
Thanks for the feedback; we’re certainly open to the idea of adjusting the API in the CP. I know we've been discussing this in other mail threads, but I also want to do a point-by-point reply to your comments.

"1. Only provides a solution for additional audio tracks, while ISSUE-152 calls is about "additional tracks of a multitrack audio/video resource"."
"2. Unlike the API for text tracks, doesn't provide a consistent API for additional audio/video tracks that are in-band and out-of-band."
We agree that a common API set would be a good approach.

"3. Only allows one audio track to be enabled at a time, making it unsuitable for audio descriptions voice-over which should be played in sync with the original audio track. I'm not sure how common this is in practice, but the alternative is to make a complete new audio mix with both the original audio and the voice-over."
We think the new audio mix (aka alternate track) approach makes the most sense here; synchronizing multiple audio tracks seems overly complex and prone to quality-of-implementation issues, compared to creating and publishing an alternate track.

"4. audioTrackCount/audioTrackLanguage is inconsistent with TextTrack[] tracks where language information is in TextTrack.language."
"5. audioTrackCount is redundant with audioTrackLanguage.length"
"6. Enables/disabled tracks using currentAudioTrack rather than TextTrack.mode"
"7. If audio tracks are added or removed during playback, will currentAudioTrack unsigned long implicitly change with it or will the current track actually change?"
Good feedback; making this interface consistent with text track would be a good approach.
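
To make points 4 through 6 concrete, here is a rough TypeScript sketch of the two shapes being compared. Neither is a real DOM interface. `FlatAudioApi` models only the three attributes named in the CP (audioTrackCount, audioTrackLanguage, currentAudioTrack), while `AudioTrackLike`, its `enabled` flag, and `toTrackObjects` are hypothetical names introduced purely for illustration of a per-track object in the spirit of TextTrack.

```typescript
// Shape 1: flat attributes, using the names from the CP (a model, not a real DOM API).
interface FlatAudioApi {
  audioTrackCount: number;      // same information as audioTrackLanguage.length (point 5)
  audioTrackLanguage: string[]; // language of each track, addressed by index (point 4)
  currentAudioTrack: number;    // index of the single enabled track (point 6)
}

// Shape 2: hypothetical per-track objects, where each track carries its own state.
interface AudioTrackLike {
  language: string;
  enabled: boolean; // assumption: a simple flag standing in for a TextTrack-style mode
}

// Convert the flat shape into per-track objects.
function toTrackObjects(api: FlatAudioApi): AudioTrackLike[] {
  return api.audioTrackLanguage.map((language, index) => ({
    language,
    enabled: index === api.currentAudioTrack,
  }));
}

const flat: FlatAudioApi = {
  audioTrackCount: 2,
  audioTrackLanguage: ["en", "fr"],
  currentAudioTrack: 0,
};

const tracks: AudioTrackLike[] = toTrackObjects(flat);
```

In the second shape, the track count is just `tracks.length` and the language lives on the track itself, which is the consistency with text tracks that the feedback asks for.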

"8. Since only in-band tracks are handled, the solution requires muxing all audio tracks into a single resource, taking up bandwidth for all users, not just the ones who use the extra tracks."
Not necessarily, since the actual media data can be stored in segments (depending on the media format on the server end). It is up to the UA to stitch them together under the hood. In practice, in the simple single-file case, the additional bandwidth is not likely to be prohibitive.

-----Original Message-----
From: public-html-request@w3.org [mailto:public-html-request@w3.org] On Behalf Of Philip Jägenstedt
Sent: Tuesday, February 22, 2011 2:46 AM
To: public-html@w3.org
Subject: Re: Change proposal for issue 152

On Tue, 22 Feb 2011 01:45:17 +0100, Frank Olivier <Frank.Olivier@microsoft.com> wrote:

> This is a change proposal for Issue 152, introducing a JavaScript API 
> for HTML5 media elements that allows Web authors to provide alternate 
> modes of presentation for a media presentation and allow selection 
> between them by the end user.
>
> It is a minimal extension to the existing API in that it does not 
> provide detailed access to the media tracks themselves, but merely 
> provides a means of indicating their presence and a means of selecting 
> between the presentation modes.

Before I go into criticism mode, it's great that you're working on this!

These issues are under active discussion, mostly in the "Tech Discussions on the Multitrack Media" thread, and it would be great to have Microsoft's feedback in that thread. Since the requirements and possible solutions are still so much in flux, I think it's rather premature to go through the decision process right now.

I see these problems with this CP:

1. Only provides a solution for additional audio tracks, while ISSUE-152 is about "additional tracks of a multitrack audio/video resource".

2. Unlike the API for text tracks, doesn't provide a consistent API for additional audio/video tracks that are in-band and out-of-band.

3. Only allows one audio track to be enabled at a time, making it unsuitable for audio descriptions voice-over which should be played in sync with the original audio track. I'm not sure how common this is in practice, but the alternative is to make a complete new audio mix with both the original audio and the voice-over.

4. audioTrackCount/audioTrackLanguage is inconsistent with TextTrack[] tracks where language information is in TextTrack.language.

5. audioTrackCount is redundant with audioTrackLanguage.length

6. Enables/disables tracks using currentAudioTrack rather than TextTrack.mode

7. If audio tracks are added or removed during playback, will currentAudioTrack unsigned long implicitly change with it or will the current track actually change?

8. Since only in-band tracks are handled, the solution requires muxing all audio tracks into a single resource, taking up bandwidth for all users, not just the ones who use the extra tracks.
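
The ambiguity in point 7 can be illustrated with a small TypeScript model. This is only a sketch under assumptions: `IndexedTracks`, `removeTrack`, and `currentLanguage` are hypothetical names, and the class is not the CP's API; it only borrows the idea of an index-valued currentAudioTrack. It shows one of the two possible behaviors, where the index stays fixed and the selected track silently changes.

```typescript
// Hypothetical model of a track list selected by a numeric index.
class IndexedTracks {
  languages: string[];
  currentAudioTrack: number; // index into languages, as in the CP's attribute

  constructor(languages: string[], currentAudioTrack: number) {
    this.languages = languages;
    this.currentAudioTrack = currentAudioTrack;
  }

  // Remove a track without touching currentAudioTrack (one possible behavior).
  removeTrack(index: number): void {
    this.languages.splice(index, 1);
  }

  // Language of whatever track the index currently points at.
  currentLanguage(): string | undefined {
    return this.languages[this.currentAudioTrack];
  }
}

const player = new IndexedTracks(["en", "fr", "de"], 1);
const before = player.currentLanguage(); // "fr"
player.removeTrack(0);                   // list is now ["fr", "de"]
const after = player.currentLanguage();  // "de": same index, different track
```

The other possible behavior would be to adjust the index so that the same track stays selected; the CP would need to say which one applies.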

I'd also like to question the assertion that "This mechanism does not preclude a richer API being defined in the future". If we add the suggested API, we cannot later add another API that does the same thing, since the interactions between the APIs would become a mess unless the underlying model is the same. In this case it is clear that it is not, in that this API assumes that the number of audio tracks is static and that only one can be active at a time, which isn't the case in an API like TextTrack.

I hope that the process will not be rushed on this, allowing time for experimentation with both the spec and implementations.

--
Philip Jägenstedt
Core Developer
Opera Software
Received on Tuesday, 8 March 2011 01:21:17 UTC