Re: Tech Discussions on the Multitrack Media (issue-152)

In this markup we have to be very careful not to allow further
recursive use of the <video> and <audio> elements, since the
semantics of the outer <video> are now different from the semantics
of the inner <audio> and <video> elements. The outer <video> element
now only has the role of a container, while the inner ones have the
role of providing audio-only and video-only tracks. This is why the
wiki introduced new names for the inner ones, <audiotrack> and
<videotrack>, to make it obvious that they are semantically different
and cannot be nested recursively any further.
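
Roughly, the idea is something like the following (just a sketch of
the wiki naming; the kind/srclang/src attributes mirror the example
quoted below and are by no means settled):

<video poster="video.png" controls>
  <source src="video.webm" type="video/webm">
  <audiotrack kind="descriptions" srclang="en" src="description.ogg">
  <videotrack kind="signings" srclang="asl" src="signing.webm">
</video>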

As for display, which is a challenge for multitrack video: we should
probably experiment with a number of different potential ways of
displaying multiple video tracks at the same time. I get the feeling
that there are only a limited number of ways in which they can be
displayed, and we could possibly cover those with CSS. We also have
to consider that we don't always have all the real estate of a Web
page available, because when the video goes full-screen it is no
different from a TV display. So, getting inspiration from existing
ways in which multiple video tracks are displayed would be a good
idea. To help with this, I have started a wiki page at
http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_Rendering
to collect such images, compare them, and start structuring the
possibilities (I hope that location is ok for everyone). If you want
to contribute images but cannot upload there directly, feel free to
email me the image and I will upload it for you.
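
Just to illustrate the kind of CSS I have in mind - this is purely
hypothetical and assumes the rendered tracks are exposed as elements
the page can address, here via made-up ids:

<style>
  /* hypothetical ids: #main-video is the primary rendering area,
     #signing-track the rendered sign-language video track */
  #main-video    { position: relative; width: 640px; }
  #signing-track { position: absolute;      /* picture-in-picture overlay */
                   right: 1em; bottom: 1em; /* bottom-right corner */
                   width: 25%; }            /* small relative to primary */
</style>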

Cheers,
Silvia.


On Fri, Feb 18, 2011 at 5:06 AM, Bob Lund <B.Lund@cablelabs.com> wrote:
> I have several comments on the proposal alternatives on the wiki (which
> have been very informative). As a first-time poster, let me introduce
> myself: I represent CableLabs, where we’ve been analyzing commercial
> video service provider requirements and how HTML5, timed text tracks,
> and multimedia tracks can be used to meet those requirements.
>
>
>
> Overloading the existing track element, which represents Timed Text
> Tracks, for media tracks would mix two fundamentally different models.
> Timed Text Tracks have cues with substantially different semantics from
> those of continuous media tracks. Side condition 8 notes this. I think
> it’s a good idea to keep Timed Text Tracks separate from continuous
> audio and video tracks. This would seem to rule out alternatives 1), 2)
> and 7).
>
>
>
> We’ve been experimenting with using @kind=metadata timed text tracks
> for a variety of applications, and it would be helpful to be able to
> distinguish between different @kind=metadata types. This is important
> for keeping the in-band and out-of-band markup the same. Having a
> <track> @type attribute would permit this.
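>
> For example (the @type values here are purely illustrative, not a
> proposal for specific types):
>
> <track kind="metadata" type="application/x-ad-insertion" src="adpoints.vtt">
> <track kind="metadata" type="application/x-content-ratings" src="ratings.vtt">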
>
>
>
> The additional vs. alternate semantics for media tracks are
> interesting. It seems that having more than one video track implies
> that the second video track plays in addition to the primary. Each
> should be in a separate window, but with only one set of controls
> (associated with the primary video). How the two windows are displayed
> should be up to the application because, in general, the user agent
> won’t have enough information, e.g. should the signing be superimposed
> in the bottom-right corner of the primary or shown off screen, at what
> size, etc. (it might be possible for the user agent to be told how to
> position multiple video windows, but a general solution to this is
> TBD). Audio might be merged by the user agent into a single stream, or
> an alternate audio track might replace the primary audio (a Spanish vs.
> English track, for example).
>
>
>
> Here’s an alternative that merges the containment model of
> alternative 3 with alternative 6 and its application access to the
> audio/media objects, and that supports the above use cases:
>
> <video id="v1" poster="video.png" controls>
>   <source src="video.webm" type="video/webm"> <!-- primary content -->
>   <source src="video.mp4" type="video/mp4"> <!-- primary content -->
>   <track kind="captions" srclang="en" src="captions.vtt">
>
>   <audio kind="descriptions" srclang="en"> <!-- pre-recorded audio descriptions -->
>     <source src="description.ogg" type="audio/ogg" label="English Audio Description">
>     <source src="description.mp3" type="audio/mpeg">
>   </audio>
>
>   <audio kind="alternate" srclang="es"> <!-- Spanish alternative audio -->
>     <source src="spaudio.ogg" type="audio/ogg" label="Spanish audio">
>     <source src="spaudio.mp3" type="audio/mpeg">
>   </audio>
>
>   <audio kind="descriptions" srclang="es"> <!-- pre-recorded audio descriptions in Spanish -->
>     <source src="spdescription.ogg" type="audio/ogg" label="Spanish Audio Description">
>     <source src="spdescription.mp3" type="audio/mpeg">
>   </audio>
>
>   <video kind="signings" srclang="asl" label="American Sign Language"> <!-- sign language overlay -->
>     <source src="signing.webm" type="video/webm">
>     <source src="signing.mp4" type="video/mp4">
>   </video>
>
>   <video kind="alternate" label="Alternate Camera 1">
>     <source src="alternate-camera-1.webm" type="video/webm">
>     <source src="alternate-camera-1.mp4" type="video/mp4">
>   </video>
> </video>
>
>
>
> English audio descriptions would be enabled like this:
>
>
>
> // enable the first English audio description track
> for (var i = 0; i < video.audio.length; i++) {
>   if (video.audio[i].kind == "descriptions" && video.audio[i].language == "en") {
>     video.audio[i].mode = SHOWING;
>     break;
>   }
> }
>
>
>
> The Spanish alternate audio track, together with the Spanish audio
> descriptions, would be enabled like this:
>
>
>
> video.muted = true;  // mute the primary (English) audio
>
> // enable the Spanish alternate audio track
> for (var i = 0; i < video.audio.length; i++) {
>   if (video.audio[i].kind == "alternate" && video.audio[i].language == "es") {
>     video.audio[i].mode = SHOWING;
>     break;
>   }
> }
>
> // enable the Spanish audio descriptions
> for (var i = 0; i < video.audio.length; i++) {
>   if (video.audio[i].kind == "descriptions" && video.audio[i].language == "es") {
>     video.audio[i].mode = SHOWING;
>     break;
>   }
> }
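>
> (The same lookup could obviously be factored into a small helper;
> enableTrack() below is just a hypothetical convenience wrapper around
> the loops above, assuming the same video.audio collection and mode
> API.)
>
> function enableTrack(video, kind, lang) {
>   // enable the first matching track in the proposed video.audio collection
>   for (var i = 0; i < video.audio.length; i++) {
>     if (video.audio[i].kind == kind && video.audio[i].language == lang) {
>       video.audio[i].mode = SHOWING;
>       return true;
>     }
>   }
>   return false;  // no matching track found
> }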
>
>
>
> Last, many existing commercial video providers will offer live
> streaming services. We expect that the presence of timed text tracks
> and of alternate audio and video tracks will vary over time, depending
> on the content in the stream. Therefore, the tracks will be discovered
> in-band and can also be expected to disappear again.
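>
> If the track collections fired events when in-band tracks appear and
> disappear, scripts could keep their UI in sync. The sketch below
> assumes addtrack/removetrack events on video.audio (the event names
> are borrowed from the text track proposal and are only an assumption),
> and addTrackToMenu()/removeTrackFromMenu() are application-defined
> functions:
>
> video.audio.addEventListener("addtrack", function (e) {
>   // a new in-band audio track appeared mid-stream; offer it in the UI
>   addTrackToMenu(e.track);
> });
>
> video.audio.addEventListener("removetrack", function (e) {
>   // the track went away; take it out of the UI again
>   removeTrackFromMenu(e.track);
> });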
>
>
>
> Thanks,
>
> Bob Lund
>
>

Received on Thursday, 17 February 2011 22:13:20 UTC