Re: Tech Discussions on the Multitrack Media (issue-152)

API replication is indeed the worry about introducing
elements that are very similar. I don't have an answer for how best to
solve this.

Silvia.

On Fri, Feb 18, 2011 at 9:25 AM, Bob Lund <B.Lund@cablelabs.com> wrote:
> If the <audiotrack> and <videotrack> provide a similar API to <audio> and <video> respectively, especially in the case of <videotrack>, then I agree the wiki alternative 3 is the same as what I proposed, without the recursion problem you noted.
>
> Regards,
> Bob
>
> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: Thursday, February 17, 2011 3:12 PM
> To: Bob Lund
> Cc: public-html@w3.org
> Subject: Re: Tech Discussions on the Multitrack Media (issue-152)
>
> In this markup we have to be very careful not to allow further recursive use of the <video> and <audio> elements, since the semantics of the outer <video> element are now different from those of the inner <audio> and <video> elements. The outer <video> element now only has the role of a container, and the inner ones have the role of providing audio-only and video-only tracks. This is why the wiki introduced new names for the inner ones, <audiotrack> and <videotrack>: to make it plainly obvious that they are semantically different and cannot be further recursively repeated.
>
> As for the displays, which are a challenge for multitrack video: we should probably experiment with a number of different potential ways of displaying multiple video tracks at the same time. I get the feeling that there are only a limited number of ways in which they can be displayed, and we could possibly cover those with CSS. We also have to consider that we don't always have all the real estate of a Web page available, because when the video goes full-screen it is no different from a TV display. So, getting inspiration from existing ways in which multiple video tracks are displayed would be a good idea. To help with this, I have started a wiki page at http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_Rendering to collect such images, compare them, and start structuring the possibilities (I hope that location is ok for everyone). If you want to contribute images but cannot upload there directly, feel free to email me the image and I will upload it for you.
>
> Cheers,
> Silvia.
>
>
> On Fri, Feb 18, 2011 at 5:06 AM, Bob Lund <B.Lund@cablelabs.com> wrote:
>> I have several comments on the proposal alternatives on the wiki
>> (which have been very informative). As a first poster, let me introduce
>> myself: I represent CableLabs, where we've been analyzing commercial
>> video service provider requirements and how HTML5, in particular timed
>> text tracks and multimedia tracks, can be used to meet those requirements.
>>
>>
>>
>> Overloading the existing track element representing Timed Text Tracks
>> for media tracks would mix two fundamentally different models. Timed
>> Text Tracks have cues with substantially different semantics than
>> continuous media tracks. Side condition 8 notes this. I think it's a
>> good idea to keep Timed Text Tracks separate from continuous audio and
>> video tracks. This would seem to rule out 1), 2) and 7).
>>
>>
>>
>> We've been experimenting with using @kind=metadata timed text tracks
>> for a variety of applications, and it would be helpful to be able to
>> distinguish between different @kind=metadata types. This is important
>> for keeping the in-band and out-of-band markup the same. Having a
>> <track> @type attribute would permit this.
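For instance, a script could pick out a particular flavour of metadata track. This is only a sketch: the `type` property on text tracks stands for the @type attribute proposed above (it is not part of the current spec), and `findMetadataTrack` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the first kind="metadata" text track whose
// proposed @type value matches the requested type string.
// The "type" property models the proposed attribute, not existing API.
function findMetadataTrack(tracks, wantedType) {
  for (var i = 0; i < tracks.length; i++) {
    if (tracks[i].kind == "metadata" && tracks[i].type == wantedType) {
      return tracks[i];
    }
  }
  return null; // no metadata track of that type
}
```

With in-band and out-of-band tracks exposed through the same list, the same lookup would work for both.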
>>
>>
>>
>> The additional vs. alternate semantics for media tracks are interesting.
>> It seems that having more than one video track implies that the second
>> video track is playing in addition to the primary. Each should be in a
>> separate window, but with only one set of controls (associated with the
>> primary video). How the two windows are displayed should be up to the
>> application because, in general, the user agent won't have enough
>> information: e.g., should the signing be superimposed in the bottom
>> right corner of the primary video or shown off screen, at what size, etc.
>> (It might be possible for the user agent to be told how to position
>> multiple video windows, but a general solution to this is TBD.)
>> Audio might be merged by the user agent into a single stream, or an
>> alternate audio track might replace the primary audio: a Spanish vs.
>> English track, for example.
>>
>>
>>
>> Here's an alternative that merges the containment model of alternative 3
>> with alternative 6's application access to the audio/media objects, and
>> that supports the above use cases:
>>
>>
>>
>> <video id="v1" poster="video.png" controls>
>>
>>     <source src="video.webm" type="video/webm"> <!-- primary content -->
>>     <source src="video.mp4" type="video/mp4"> <!-- primary content -->
>>
>>     <track kind="captions" srclang="en" src="captions.vtt">
>>
>>     <audio kind="descriptions" srclang="en"> <!-- pre-recorded audio descriptions -->
>>         <source src="description.ogg" type="audio/ogg" label="English Audio Description">
>>         <source src="description.mp3" type="audio/mp3">
>>     </audio>
>>
>>     <audio kind="alternate" srclang="sp"> <!-- Spanish alternative audio -->
>>         <source src="spaudio.ogg" type="audio/ogg" label="Spanish audio">
>>         <source src="spaudio.mp3" type="audio/mp3">
>>     </audio>
>>
>>     <audio kind="descriptions" srclang="sp"> <!-- pre-recorded audio descriptions in Spanish -->
>>         <source src="spdescription.ogg" type="audio/ogg" label="Spanish Audio Description">
>>         <source src="spdescription.mp3" type="audio/mp3">
>>     </audio>
>>
>>     <video kind="signings" srclang="asl" label="American Sign Language"> <!-- sign language overlay -->
>>         <source src="signing.webm" type="video/webm">
>>         <source src="signing.mp4" type="video/mp4">
>>     </video>
>>
>>     <video kind="alternate" label="Alternate Camera 1">
>>         <source src="alternate-camera-1.webm" type="video/webm">
>>         <source src="alternate-camera-1.mp4" type="video/mp4">
>>     </video>
>>
>> </video>
>>
>>
>>
>> English audio descriptions would be enabled like this:
>>
>>
>>
>> for (var i = 0; i < video.audio.length; i++) {
>>     if (video.audio[i].kind == "descriptions" && video.audio[i].language == "en") {
>>         video.audio[i].mode = SHOWING;
>>         break;
>>     }
>> }
>>
>>
>>
>> Spanish audio track with audio descriptions would be enabled like this:
>>
>>
>>
>> video.muted = true;
>>
>>
>>
>> for (var i = 0; i < video.audio.length; i++) {
>>     if (video.audio[i].kind == "alternate" && video.audio[i].language == "sp") {
>>         video.audio[i].mode = SHOWING;
>>         break;
>>     }
>> }
>>
>> for (var i = 0; i < video.audio.length; i++) {
>>     if (video.audio[i].kind == "descriptions" && video.audio[i].language == "sp") {
>>         video.audio[i].mode = SHOWING;
>>         break;
>>     }
>> }
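The three loops above share one shape, so under the same proposed API they could be folded into a single helper. This is a sketch only: the `audio` track list and its `kind`, `language`, and `mode` properties are part of the proposal rather than a shipping interface, `SHOWING` is assumed to be a mode constant, and `enableAudioTrack` is a hypothetical name.

```javascript
// Sketch of a helper over the proposed video.audio track list.
// SHOWING stands in for the proposed track mode constant.
var SHOWING = "showing";

// Enable the first audio track matching the given kind and language;
// return it, or null if the stream carries no such track.
function enableAudioTrack(video, kind, language) {
  for (var i = 0; i < video.audio.length; i++) {
    var track = video.audio[i];
    if (track.kind == kind && track.language == language) {
      track.mode = SHOWING;
      return track;
    }
  }
  return null;
}

// The Spanish example above then becomes:
//   video.muted = true;
//   enableAudioTrack(video, "alternate", "sp");
//   enableAudioTrack(video, "descriptions", "sp");
```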
>>
>>
>>
>> Lastly, many existing commercial video providers will offer live
>> streaming services. We expect that the presence of timed text tracks
>> and of alternate audio and video tracks will vary over time depending
>> on the content in the stream. Therefore, the tracks will be discovered
>> in-band and can also be expected to disappear.
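One way an application could cope with tracks coming and going is to register for change events on the track list rather than enumerating it once. This is a sketch only: the "addtrack"/"removetrack" event names are assumed here by analogy with the text track discussions, and `watchTracks` is a hypothetical helper.

```javascript
// Hypothetical: keep an application's view of available tracks in sync
// with a live stream where in-band tracks appear and disappear.
// The "addtrack" and "removetrack" event names are assumptions.
function watchTracks(trackList, onAdd, onRemove) {
  trackList.addEventListener("addtrack", function (e) { onAdd(e.track); });
  trackList.addEventListener("removetrack", function (e) { onRemove(e.track); });
}
```

The application's track menu would then be rebuilt from these callbacks instead of from a one-time scan at load.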
>>
>>
>>
>> Thanks,
>>
>> Bob Lund
>>
>>
>

Received on Thursday, 17 February 2011 22:40:34 UTC