Re: [media] issue-152: documents for further discussion from Silvia Pfeiffer on 2011-04-20 (public-html-a11y@w3.org from April 2011)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 20 Apr 2011 12:19:19 +1000
To: Ian Hickson <ian@hixie.ch>
Cc: Philip Jägenstedt <philipj@opera.com>, "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-ID: <BANLkTikswO7tzuz2MFxnLL-ZEQGb3XqB+A@mail.gmail.com>

On Wed, Apr 20, 2011 at 10:51 AM, Ian Hickson <ian@hixie.ch> wrote:
> On Tue, 12 Apr 2011, Silvia Pfeiffer wrote:
>> > > the TrackList only includes name and language attributes - in
>> > > analogy to TextTrack it should probably rather include (name, label,
>> > > language, kind)
>> >
>> > I'm fine with exposing more data, but I don't know what data in-band
>> > tracks typically have. What do in-band tracks in popular video formats
>> > expose? Is there any documentation on this?
>>
>> There is a discussion on the main list about metadata right now and I
>> have posted a link there about what the W3C Media Annotations WG's
>> analysis of media formats found as typically used metadata on audio and
>> video. If you want to understand what is generally available, that is a
>> good starting point, see http://www.w3.org/TR/mediaont-10/ .
>
> Woah, that's a lot of data. I guess a better approach for this will be to
> look at use cases and figure out what needs exposing.

Yeah, I agree. And by no means am I suggesting that we adopt all of
them, or even the complex structure that the WG came up with. I
look at it as an interesting analysis of what is available.


>> I would, however, regard these two attributes that we discussed here as
>> a separate issue, because if somebody wants to create custom controls
>> and e.g. provide all the alternative video descriptions in one menu,
>> they would want all the text descriptions and audio descriptions listed
>> - similarly if they want all the alternative captions in one menu, they
>> would want all the text track captions as well as all the videos that
>> are created from bitmaps as overlay captions as well as all the
>> alternative video tracks with burnt-in captions. So, providing a label
>> (for use in the menu) and a kind (for classification) is very useful.
>> These can all be mapped from fields from within video formats.
>
> I assume you're talking primarily about "kind" here. "name" and "label"
> are the same thing (actually I've renamed "name" to "label" to improve
> consistency with other parts of the platform).

Yes, I don't mind whether we call it "name" or "label", though I do
prefer "label" for consistency with the Text Track.

I do think we need an additional "id" or similar, which is unique and
can be used for fragment addressing. (See the other thread).


> Looking at the metadata list cited above, I don't see anything in either
> ogg, mp4, or webm that maps to "kind", so I don't see much point exposing
> that on the audio/video track lists, though I agree that in principle it
> would be a good idea.

Let's be clear: the media annotations list is not complete for each
one of the formats. It is trying to identify a subset that will work
across many formats.

Also, they actually have a "role" attribute on the "fragment" which
they suggest using for identifying the "kind" of a "track":
http://www.w3.org/TR/mediaont-10/#example4 . So, it is indeed there.
Hmm.. given this, I should probably change what is written for "OGG"
under "fragment", because certainly Ogg has fields that provide the
kind of a track. WebM and MP4 have them, too.


> Realistically though, for in-band tracks it's more likely that that data
> will be provided to the script out-of-band so that it can construct the UI
> before the movie loads, and for out-of-band tracks the information can be
> made available in the markup (e.g. using data-* attributes). For UA-driven
> menus, the title is probably sufficient for most purposes, and that can
> already be made available.

The biggest issue with this approach is discoverability. A Web
developer who has to deal with multiple resources, without knowing
a priori what kinds of in-band tracks they contain, has no chance of
finding this out through script if there is no interface that exposes
this information. It would need to be done server-side.
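
As a rough sketch of what script could do if the track lists did expose
this information: the plain objects below stand in for track entries,
and their kind, label and language fields are assumptions, not
something the current draft provides. buildTrackMenu is a made-up
helper name.

```javascript
// Hypothetical track entries: plain objects standing in for whatever a
// track list would expose. The kind/label/language fields are assumptions.
function buildTrackMenu(tracks) {
  const menu = new Map();
  for (const track of tracks) {
    // Tracks without a kind cannot be classified, so group them apart.
    const kind = track.kind || "unknown";
    if (!menu.has(kind)) menu.set(kind, []);
    menu.get(kind).push({
      label: track.label || "(untitled)",
      language: track.language || "",
    });
  }
  return menu;
}

const exampleMenu = buildTrackMenu([
  { kind: "captions", label: "English captions", language: "en" },
  { kind: "descriptions", label: "Audio description", language: "en" },
  { kind: "captions", label: "Deutsche Untertitel", language: "de" },
  { label: "Director's commentary", language: "en" },
]);
```

Without a kind, the last entry can only land in a catch-all group,
which is the classification problem described above.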


>> > Note that for media-element-level data, you can already use data-*
>> > attributes to get anything you want, so the out-of-band case is
>> > already fully handled as far as I can tell.
>>
>> Interesting. In the audio description case, would a label, kind, and
>> language be added to the menu of the related video element?
>
> For scripted UIs, that's up to the script.
>
> For UA UIs, it depends if we are talking about multiple video tracks or
> multiple audio tracks. Multiple video tracks aren't handled, because
> there's no sane way to have the UA turn the video tracks on and off. For
> the audio case, I don't really see much reason to expose more than a
> title. A kind could be used but it's going to be used so rarely that in
> practice the UA will want to handle the case of only having a title
> anyway, and once you support that, it's not clear what a kind would really
> do to make things better.
>
> It's something we can always provide in the future though, if it turns out
> to be more common than one would guess from looking at content today.

I think it will be a problem with the first implementation of this,
since we would want to add the information to the menu for audio
tracks just like for text tracks and the text tracks have this
information (kind, label, language).

I guess we can wait till then, though, since it doesn't change
anything substantial about the way in which things work.



>> > | a group should be able to loop over the full multitrack rather than a
>> > | single slave
>> >
>> > Not sure what this means.
>>
>> We discussed the looping behaviour. To make it symmetrical with in-band
>> multitrack resources, it would make sense to be able to loop over
>> composed multitrack resources, too. The expected looping behaviour is
>> that a loop on the composed resource loops over the composite as a
>> whole. So, the question is then how to turn such looping on.
>>
>> The proposal is that when one media element in the group has a @loop
>> attribute, that would turn the looping on the composite resource on.
>> This means that when the loop is set and the end of the composite
>> resource is reached (its duration), the currentTime would be reset to
>> its beginning and playback of the composite resource would start again.
>> Looping on individual elements is turned off and only the composite
>> resource can loop.
>
> What's the use case?

The same as for the loop attribute on an audio or video element. It's
a media resource and should work consistently with how other media
resources are handled.

E.g. if I have a plugin that likes to turn all media elements to
looping for whatever reason (entertain the kids? ;-), I can do that
for normal media elements and for in-band multitrack consistently with
the loop attribute, but I have to make an exception for composed
multitrack, because it doesn't allow for the handling of a loop
attribute. (stupid example, I know: so pick something with music..)
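
A minimal sketch of the proposed looping behaviour, using plain objects
in place of grouped media elements. It assumes the composite duration is
that of the longest slave; the function names are made up for
illustration and are not part of any draft.

```javascript
// Plain objects stand in for grouped media elements; "loop" and "duration"
// mirror the HTMLMediaElement attributes of the same names.
function compositeDuration(slaves) {
  // Assumption: the composite ends when its longest slave ends.
  return Math.max(0, ...slaves.map((s) => s.duration));
}

function compositeLoops(slaves) {
  // Proposal from the thread: one @loop in the group turns looping on.
  return slaves.some((s) => s.loop);
}

// Returns the composite position after advancing by `delta` seconds.
function advance(slaves, currentTime, delta) {
  const duration = compositeDuration(slaves);
  const next = currentTime + delta;
  if (next < duration) return next;
  // End reached: start again when looping, otherwise stay at the end.
  return compositeLoops(slaves) ? 0 : duration;
}
```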



>> > | some attributes of HTMLMediaElement are missing in the MediaController
>> > | that might make sense to collect state from the slaves: error,
>> >
>> > Errors only occur as part of loading, which is a per-media-element
>> > issue, so I don't really know what it would mean for the controller to
>> > have it.
>>
>> The MediaController is generally regarded as the state keeper for the
>> composite resource.
>
> It is? That's certainly not how it's defined. It's just a central
> controller, it doesn't keep any of the state for the resources.

Not for the individual ones, but for the combined construct. E.g. you
can ask it what the currentTime of the combined construct is, etc.


>> So, what happens when a single slave goes into error state? Does the
>> full composite resource go into error state? Or does it ignore the slave
>> - turn it off, and continue?
>
> Media elements don't really have an error state. They have a networkState
> and a readyState, which affect the MediaController, but the 'error'
> attribute is just for exposing the last error for events, it's not part of
> the state machine.

That still doesn't answer the question: what happens if one of the
slaves has a network error and cannot continue playing because it
runs out of data? Does the combined resource stall? Forever? Is there
a way for script to identify this and remove the stalling slave from
the group? Maybe we need an onerror event on the MediaController,
which would be raised if one of the slaves has an error fetching the
media data. Then the script developer can go through the list of
slaves in the one callback and remove the offender.
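
Roughly, the body of such a callback could look like the sketch below.
This is hypothetical: the MediaController has no error event in the
current draft, dropFailedSlaves is a made-up name, and the slaves are
plain objects shaped like media elements (error is null or an error
object, as with the error attribute on a media element).

```javascript
// Hypothetical callback body for an error event on the controller.
// `slaves` is an array of objects shaped like media elements.
function dropFailedSlaves(slaves) {
  const healthy = [];
  const removed = [];
  for (const slave of slaves) {
    if (slave.error) {
      // Setting controller to null takes the element out of the group.
      slave.controller = null;
      removed.push(slave);
    } else {
      healthy.push(slave);
    }
  }
  return { healthy, removed };
}
```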


>> > | readyState
>> >
>> > I could expose a readyState that returns the lowest value of all the
>> > readyState values of the slaved media elements, would that be useful?
>> > It would be helpful to see a sample script that would make use of
>> > this; I don't really understand why someone would care about doing
>> > this at the controller level rather than the individual track level.
>>
>> I think it makes sense, in particular when script is waiting for all
>> elements to go to HAVE_METADATA state, which is often the case when you
>> are trying to do something on the media resource, but have to wait until
>> it's actually available.
>>
>> An example JS would be where you are running your own controls for the
>> combined resource and want to determine the combined duration and volume
>> for visual display, e.g.
>>
>>       video.controller.addEventListener("loadedmetadata", init, false);
>>       function init(evt) {
>>         duration.innerHTML = video.controller.duration.toFixed(2);
>>         vol.innerHTML      = video.controller.volume.toFixed(2);
>>       }
>>
>> So, I think a combined readyState makes sense in the way you described.
>
> That example doesn't use readyState at all. Is there a use case for
> readyState specifically?


We actually discussed in the last call whether we only need the events
or also readyState. Eric had an example where you would raise an
event, but only do something if the element is in a particular
readyState at the time of processing. I don't remember exactly what it
was. My position is that we really need the events, but I could live
without having a combined readyState.
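
For reference, a combined readyState in the "lowest value of all the
slaves" sense, together with the kind of check Eric's example seemed to
need, could be sketched as follows. The numeric constants match the
HTMLMediaElement readyState values; the function names and the plain
slave objects are only for illustration.

```javascript
// readyState values as defined on HTMLMediaElement.
const HAVE_NOTHING = 0;
const HAVE_METADATA = 1;
const HAVE_CURRENT_DATA = 2;
const HAVE_FUTURE_DATA = 3;
const HAVE_ENOUGH_DATA = 4;

// Lowest readyState across the group; an empty group has nothing loaded.
function combinedReadyState(slaves) {
  if (slaves.length === 0) return HAVE_NOTHING;
  return Math.min(...slaves.map((s) => s.readyState));
}

// Event handler body that only acts when the whole group is far enough
// along at the time the event is processed. Returns whether it acted.
function runWhenReady(slaves, minimumState, action) {
  if (combinedReadyState(slaves) < minimumState) return false;
  action();
  return true;
}
```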


>> > | (this one is particularly important for onmetadatavailable events)
>> >
>> > The events are independent of the attributes. What events would you
>> > want on a MediaController, and why? Again, sample code would really
>> > help clarify the use cases you have in mind.
>>
>> Maybe an onmetadatavailable event is more useful than a readyState then?
>
> I've updated the spec to fire a number of events on MediaController,
> including 'metadataavailable' and 'playing'/'waiting'.

Ah excellent. That's great.


>> I am not aware of many scripts that use the readyState values directly
>> for anything, even on the media elements themselves.
>
> One example of readyState usage on a media element would be a 'waiting'
> event handler that checks whether readyState is HAVE_CURRENT_DATA or
> HAVE_METADATA and uses that information to decide whether to display a
> poster frame overlay or not.
>
>
>> > | TimeRanges played
>> >
>> > Would this return the union or the intersection of the slaves'?
>>
>> That would probably be the union, because those parts of the timeline
>> are what the user has viewed, so he/she would expect them to be marked
>> in manually created controls.
>
> Ok, added.
>
>
>> > | ended
>> >
>> > Since tracks can vary in length, this doesn't make much sense at the
>> > media controller level. You can tell if you're at the end by looking
>> > at currentTime and duration, but with infinite streams and no
>> > buffering the underlying slaves might keep moving things along
>> > (actively playing) with currentTime and duration both equal to zero
>> > the whole time. So I'm not sure how to really expose 'ended' on the
>> > media controller.
>>
>> "ended" on the individual elements (in the absence of loop) returns true when
>>
>> Either:
>>
>>     The current playback position is the end of the media resource, and
>>     The direction of playback is forwards.
>>
>> Or:
>>
>>     The current playback position is the earliest possible position, and
>>     The direction of playback is backwards.
>
> No, 'ended' only fires when going forwards.


I quoted from the spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#ended-playback


>> So, in analogy, for the composed resource: it would return the union of
>> the ended result on all individual elements, namely "ended" only when
>> all of them are in ended state.
>
> But what's the use case?

If I reach the end, I want to present something different, such as a
post-roll ad or an overlay with links to other videos that are
related. It is much easier to wait on an onended event on the combined
resource than having to register an event handler with each slave and
then try to combine the results.
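
To show what script has to do without such an event, here is a sketch of
the per-slave bookkeeping, with plain objects standing in for the media
elements and made-up function names. The function returned by
watchGroupEnded would have to be registered as the "ended" handler on
every slave.

```javascript
// The composite counts as ended only when every slave has ended.
function combinedEnded(slaves) {
  return slaves.length > 0 && slaves.every((s) => s.ended);
}

// Returns a handler to attach to each slave's "ended" event. It calls
// onAllEnded once, the first time all slaves report ended.
function watchGroupEnded(slaves, onAllEnded) {
  let fired = false;
  return function slaveEnded() {
    if (!fired && combinedEnded(slaves)) {
      fired = true;
      onAllEnded();
    }
  };
}
```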


>> > | and autoplay.
>> >
>> > How would this work? Autoplay doesn't really make sense as an IDL
>> > attribute, it's the content attribute that matters. And we already have
>> > that set up to work with media controllers.
>>
>> As with @loop, it would be possible to say that when one media element
>> in the union has @autoplay set, then the combined resource is in
>> autoplay state.
>
> I don't understand the use case for exposing this as an IDL attribute on
> the controller.

The same use case as for any other media element, plus consistency:
we could then use the same code to deal with grouped media elements
as with in-band multitrack elements. For @autoplay there is even an
accessibility use case: a UA or plugin can provide settings to stop
autoplay on media elements with the @autoplay IDL attribute. It is
not easily possible to stop hand-coded play() calls, which would be
the only way to start grouped multitrack media in the way in which
it is currently specified.
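
As a sketch of that accessibility case, assuming the "one @autoplay in
the group is enough" rule suggested for @loop and a hypothetical user
setting that a UA or plugin might offer:

```javascript
// Plain objects stand in for grouped media elements. The user setting
// is hypothetical; it models a UA or plugin preference.
function groupAutoplays(slaves, userAllowsAutoplay) {
  // Assumption: one slave with autoplay puts the whole group in autoplay.
  const requested = slaves.some((s) => s.autoplay);
  return requested && userAllowsAutoplay;
}
```

A hand-coded play() call gives such a setting nothing to inspect, which
is why a declarative state on the group would help.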



>> One more question turned up today: is there any means in which we could
>> possibly create @controls (with track menu and all) for the combined
>> resource? Maybe they could be the same controls on all the elements that
>> have a @controls active, but would actually be driven by the
>> controller's state rather than the element's? Maybe the first video
>> element that has a @controls attribute would get the full controller's
>> state represented in the controls? Could there be any way to make
>> @controls work?
>
> The UA is responsible for this, but the spec requires that the UI
> displayed for a control that has a controller control the controller.

That's good. Question of clarification: Does that mean that for all
elements in a group that display controls these controls actually
control the controller? Or do only the controls of the media element
that created the controller control the controller?
(hope that makes sense to you ;-)


Cheers,
Silvia.
Received on Wednesday, 20 April 2011 02:20:06 UTC