- From: Bob Lund <B.Lund@CableLabs.com>
- Date: Wed, 20 Apr 2011 08:51:49 -0600
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, Mark Watson <watsonm@netflix.com>
- CC: Ian Hickson <ian@hixie.ch>, Philip Jägenstedt <philipj@opera.com>, "public-html-a11y@w3.org" <public-html-a11y@w3.org>
> -----Original Message-----
> From: public-html-a11y-request@w3.org [mailto:public-html-a11y-
> request@w3.org] On Behalf Of Silvia Pfeiffer
> Sent: Tuesday, April 19, 2011 10:39 PM
> To: Mark Watson
> Cc: Ian Hickson; Philip Jägenstedt; public-html-a11y@w3.org
> Subject: Re: [media] issue-152: documents for further discussion
>
> Mark,
>
> what is your list of track kinds for in-band tracks?
>
> I have thus far come up with the following:
>
> video:
> * sign language video (in different sign languages)
> * captions (as in: burnt-in video that may just be overlays)
> * different camera angle
> * associated video track (which might be a generalization of different
>   camera angle). One use case is video mosaic.
>
> audio:
> * audio descriptions
> * language dub
>
> Cheers,
> Silvia.
>
>
> On Wed, Apr 20, 2011 at 2:26 PM, Mark Watson <watsonm@netflix.com> wrote:
> > I'd like to second the requirement for an enumerated 'kind' for
> > in-band tracks. Considering adaptive streaming approaches, the
> > information about track kinds is much more likely to be available
> > in-band than from external metadata systems.
> >
> > Whilst the external metadata might contain information about what is
> > available for presentation to the user (e.g. a list of languages), it
> > doesn't make sense to require that layer to include information about
> > how this is mapped to any particular container or manifest format,
> > when multiple such versions of the content might be available (and
> > might be added or removed without changes to the user-visible
> > metadata).
> >
> > A clean separation between UI and transport implies there is more
> > than just natural languages at the media transport layer: in fact it
> > would make more sense to have *only* the enumerated kind at the media
> > transport layer and leave the natural-language aspects to the
> > presentation layer.
> >
> > Furthermore, enumerated kinds are necessary to make initial choices
> > based on user preferences - you have to be able to understand *what*
> > the tracks are. Also, presentation of such tracks to the user might
> > not be a single menu of choices: there is structure, such as language
> > choices for main audio and audio descriptions which mirror the
> > language choices for the corresponding subtitle tracks, and this
> > structure needs to be exposed for a sensible UI that properly
> > considers accessibility.
> >
> > ...Mark
> >
> > Sent from my iPhone
> >
> > On Apr 19, 2011, at 7:22 PM, "Silvia Pfeiffer"
> > <silviapfeiffer1@gmail.com> wrote:
> >
> >> On Wed, Apr 20, 2011 at 10:51 AM, Ian Hickson <ian@hixie.ch> wrote:
> >>> On Tue, 12 Apr 2011, Silvia Pfeiffer wrote:
> >>>>>> the TrackList only includes name and language attributes - in
> >>>>>> analogy to TextTrack it should probably rather include (name,
> >>>>>> label, language, kind)
> >>>>>
> >>>>> I'm fine with exposing more data, but I don't know what data
> >>>>> in-band tracks typically have. What do in-band tracks in popular
> >>>>> video formats expose? Is there any documentation on this?
> >>>>
> >>>> There is a discussion on the main list about metadata right now
> >>>> and I have posted a link there to what the W3C Media Annotations
> >>>> WG's analysis of media formats found as typically used metadata
> >>>> on audio and video. If you want to understand what is generally
> >>>> available, that is a good starting point, see
> >>>> http://www.w3.org/TR/mediaont-10/ .
> >>>
> >>> Woah, that's a lot of data. I guess a better approach for this will
> >>> be to look at use cases and figure out what needs exposing.
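Mark's point about preference-driven initial choices is easy to make
concrete. A minimal sketch of such a selection, assuming the in-band
audio track list exposes kind, language and an enabled flag as argued
for in this thread (the property names are illustrative, not settled
spec text):

    var video = document.querySelector("video");
    // Enable the audio-description track matching the user's preferred
    // language; disable the other audio alternatives.
    for (var i = 0; i < video.audioTracks.length; i++) {
      var track = video.audioTracks[i];
      track.enabled = (track.kind === "descriptions" &&
                       track.language === "fr");
    }

Without an enumerated kind, a script like this would have to guess from
free-form label strings, which is exactly the fragility Mark describes.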
> >>
> >> Yeah, I agree. And by no means am I suggesting to adopt all of
> >> them, or even to adopt the complex structure that the WG came up
> >> with. I look at it as an interesting analysis of what is available.
> >>
> >>
> >>>> I would, however, regard these two attributes that we discussed
> >>>> here as a separate issue, because if somebody wants to create
> >>>> custom controls and e.g. provide all the alternative video
> >>>> descriptions in one menu, they would want all the text
> >>>> descriptions and audio descriptions listed - similarly, if they
> >>>> want all the alternative captions in one menu, they would want
> >>>> all the text track captions as well as all the videos that are
> >>>> created from bitmaps as overlay captions as well as all the
> >>>> alternative video tracks with burnt-in captions. So, providing a
> >>>> label (for use in the menu) and a kind (for classification) is
> >>>> very useful. These can all be mapped from fields within video
> >>>> formats.
> >>>
> >>> I assume you're talking primarily about "kind" here. "name" and
> >>> "label" are the same thing (actually I've renamed "name" to
> >>> "label" to improve consistency with other parts of the platform).
> >>
> >> Yes, I don't mind if we call it "name" or "label" - I do prefer
> >> label, to be consistent with the text track.
> >>
> >> I do think we need an additional "id" or similar, which is unique
> >> and can be used for fragment addressing. (See the other thread.)
> >>
> >>
> >>> Looking at the metadata list cited above, I don't see anything in
> >>> either Ogg, MP4, or WebM that maps to "kind", so I don't see much
> >>> point exposing that on the audio/video track lists, though I agree
> >>> that in principle it would be a good idea.
> >>
> >> Let's be clear: the media annotations list is not complete for each
> >> one of the formats. It is trying to identify a subset that will
> >> work across many formats.
> >>
> >> Also, they actually have a "role" attribute on the "fragment" which
> >> they suggest using for identifying the "kind" of a "track":
> >> http://www.w3.org/TR/mediaont-10/#example4 . So, it is indeed
> >> there. Hmm, given this, I should probably change what is written
> >> for "OGG" under "fragment", because certainly Ogg has fields that
> >> provide the kind of a track. WebM and MP4 have them, too.
> >>
> >>
> >>> Realistically though, for in-band tracks it's more likely that
> >>> that data will be provided to the script out-of-band so that it
> >>> can construct the UI before the movie loads, and for out-of-band
> >>> tracks the information can be made available in the markup (e.g.
> >>> using data-* attributes). For UA-driven menus, the title is
> >>> probably sufficient for most purposes, and that can already be
> >>> made available.
> >>
> >> The biggest issue with this approach is discoverability. A Web
> >> developer who has to deal with multiple resources, without knowing
> >> a priori what kinds of tracks they have available in-band, has no
> >> chance to find this out through script if there is no interface
> >> that exposes this information. It would need to be done
> >> server-side.
> >>
> >>
> >>>>> Note that for media-element-level data, you can already use
> >>>>> data-* attributes to get anything you want, so the out-of-band
> >>>>> case is already fully handled as far as I can tell.
> >>>>
> >>>> Interesting. In the audio description case, would a label, kind,
> >>>> and language be added to the menu of the related video element?
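Silvia's combined captions menu gives a concrete test of what the
interface would need to expose. A sketch of such a script-built menu,
assuming kind, label and language are available on text and video track
lists alike (a kind of "captions" on a video track is hypothetical here
- it is what the thread argues should be surfaced):

    function buildCaptionsMenu(video, menuElement) {
      var entries = [];
      // Text track captions (in-band or out-of-band).
      for (var i = 0; i < video.textTracks.length; i++) {
        if (video.textTracks[i].kind === "captions") {
          entries.push(video.textTracks[i]);
        }
      }
      // Video tracks carrying bitmap overlay or burnt-in captions.
      for (var j = 0; j < video.videoTracks.length; j++) {
        if (video.videoTracks[j].kind === "captions") {
          entries.push(video.videoTracks[j]);
        }
      }
      entries.forEach(function (track) {
        var item = document.createElement("li");
        item.textContent = track.label + " (" + track.language + ")";
        menuElement.appendChild(item);
      });
    }

None of this is possible from script alone if the in-band kind and
label never surface in the API - the discoverability problem Silvia
raises above.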
> >>>
> >>> For scripted UIs, that's up to the script.
> >>>
> >>> For UA UIs, it depends on whether we are talking about multiple
> >>> video tracks or multiple audio tracks. Multiple video tracks
> >>> aren't handled, because there's no sane way to have the UA turn
> >>> the video tracks on and off. For the audio case, I don't really
> >>> see much reason to expose more than a title. A kind could be used,
> >>> but it's going to be used so rarely that in practice the UA will
> >>> want to handle the case of only having a title anyway, and once
> >>> you support that, it's not clear what a kind would really do to
> >>> make things better.
> >>>
> >>> It's something we can always provide in the future though, if it
> >>> turns out to be more common than one would guess from looking at
> >>> content today.
> >>
> >> I think it will be a problem with the first implementation of this,
> >> since we would want to add the information to the menu for audio
> >> tracks just like for text tracks, and the text tracks have this
> >> information (kind, label, language).
> >>
> >> I guess we can wait till then, though, since it doesn't change
> >> anything substantial about the way things work.
> >>
> >>
> >>
> >>>>> | a group should be able to loop over the full multitrack rather
> >>>>> | than a single slave
> >>>>>
> >>>>> Not sure what this means.
> >>>>
> >>>> We discussed the looping behaviour. To make it symmetrical with
> >>>> in-band multitrack resources, it would make sense to be able to
> >>>> loop over composed multitrack resources, too. The expected
> >>>> looping behaviour is that a loop on the composed resource loops
> >>>> over the composite as a whole. So, the question is then how to
> >>>> turn such looping on.
> >>>>
> >>>> The proposal is that when one media element in the group has a
> >>>> @loop attribute, that would turn looping on for the composite
> >>>> resource. This means that when the loop is set and the end of the
> >>>> composite resource is reached (its duration), the currentTime
> >>>> would be reset to its beginning and playback of the composite
> >>>> resource would start again. Looping on individual elements is
> >>>> turned off and only the composite resource can loop.
> >>>
> >>> What's the use case?
> >>
> >> The same as for the loop attribute on an audio or video element.
> >> It's a media resource and should work consistently with how other
> >> media resources are handled.
> >>
> >> E.g. if I have a plugin that likes to turn all media elements to
> >> looping for whatever reason (entertain the kids? ;-), I can do that
> >> for normal media elements and for in-band multitrack consistently
> >> with the loop attribute, but I have to make an exception for
> >> composed multitrack, because it doesn't allow for the handling of a
> >> loop attribute. (Stupid example, I know: so pick something with
> >> music.)
> >>
> >>
> >>
> >>>>> | some attributes of HTMLMediaElement are missing in the
> >>>>> | MediaController that might make sense to collect state from
> >>>>> | the slaves: error,
> >>>>>
> >>>>> Errors only occur as part of loading, which is a
> >>>>> per-media-element issue, so I don't really know what it would
> >>>>> mean for the controller to have it.
> >>>>
> >>>> The MediaController is generally regarded as the state keeper for
> >>>> the composite resource.
> >>>
> >>> It is? That's certainly not how it's defined. It's just a central
> >>> controller; it doesn't keep any of the state for the resources.
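Even with the controller defined as stateless, a script can derive
combined state from the slaved elements it finds in the markup. A
minimal sketch of the lowest-readyState idea discussed further down,
assuming the group is declared with a mediagroup attribute (the value
"movie" is illustrative; the controller itself exposes no list of its
slaves):

    // Collect the slaved elements of one media group from the markup.
    var slaves = document.querySelectorAll(
        'video[mediagroup="movie"], audio[mediagroup="movie"]');

    // Combined readyState: the minimum across all slaved elements, so
    // it only reports HAVE_METADATA once every slave has metadata.
    function combinedReadyState() {
      var state = HTMLMediaElement.HAVE_ENOUGH_DATA; // highest value (4)
      for (var i = 0; i < slaves.length; i++) {
        state = Math.min(state, slaves[i].readyState);
      }
      return state;
    }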
> >>
> >> Not for the individual ones, but for the combined construct. E.g.
> >> you can ask it what the currentTime of the combined construct is,
> >> etc.
> >>
> >>
> >>>> So, what happens when a single slave goes into error state? Does
> >>>> the full composite resource go into error state? Or does it
> >>>> ignore the slave - turn it off, and continue?
> >>>
> >>> Media elements don't really have an error state. They have a
> >>> networkState and a readyState, which affect the MediaController,
> >>> but the 'error' attribute is just for exposing the last error for
> >>> events; it's not part of the state machine.
> >>
> >> That still doesn't answer the question: what happens if one of the
> >> slaves happens to have a network error and cannot continue playing
> >> because it runs out of data? Does the combined resource stall?
> >> Forever? Is there a way for script to identify this and remove the
> >> stalling slave from the group? Maybe we need an onerror event on
> >> the MediaController, which would be raised if one of the slaves has
> >> an error fetching the media data. Then the script developer can go
> >> through the list of slaves in the one callback and remove the
> >> offending slave.
> >>
> >>
> >>>>> | readyState
> >>>>>
> >>>>> I could expose a readyState that returns the lowest value of all
> >>>>> the readyState values of the slaved media elements; would that
> >>>>> be useful? It would be helpful to see a sample script that would
> >>>>> make use of this; I don't really understand why someone would
> >>>>> care about doing this at the controller level rather than the
> >>>>> individual track level.
> >>>>
> >>>> I think it makes sense, in particular when script is waiting for
> >>>> all elements to go to HAVE_METADATA state, which is often the
> >>>> case when you are trying to do something on the media resource
> >>>> but have to wait until it's actually available.
> >>>>
> >>>> An example JS would be where you are running your own controls
> >>>> for the combined resource and want to determine the combined
> >>>> duration and volume for visual display, e.g.
> >>>>
> >>>>   video.controller.addEventListener("loadedmetadata", init, false);
> >>>>   function init(evt) {
> >>>>     duration.innerHTML = video.controller.duration.toFixed(2);
> >>>>     vol.innerHTML = video.controller.volume.toFixed(2);
> >>>>   }
> >>>>
> >>>> So, I think a combined readyState makes sense in the way you
> >>>> described.
> >>>
> >>> That example doesn't use readyState at all. Is there a use case
> >>> for readyState specifically?
> >>
> >>
> >> We actually discussed in the last call whether we only need the
> >> events or also readyState. Eric had an example where you would
> >> raise an event, but only do something if the element is in a
> >> particular readyState at the time of processing. I don't remember
> >> exactly what it was. My position is that we really need the events,
> >> but I could live without having a combined readyState.
> >>
> >>
> >>>>> | (this one is particularly important for onmetadatavailable
> >>>>> | events)
> >>>>>
> >>>>> The events are independent of the attributes. What events would
> >>>>> you want on a MediaController, and why? Again, sample code would
> >>>>> really help clarify the use cases you have in mind.
> >>>>
> >>>> Maybe an onmetadatavailable event is more useful than a
> >>>> readyState then?
> >>>
> >>> I've updated the spec to fire a number of events on
> >>> MediaController, including 'metadataavailable' and
> >>> 'playing'/'waiting'.
> >>
> >> Ah excellent.
> >> That's great.
> >>
> >>
> >>>> I am not aware of many scripts that use the readyState values
> >>>> directly for anything, even on the media elements themselves.
> >>>
> >>> One example of readyState usage on a media element would be a
> >>> 'waiting' event handler that checks whether readyState is
> >>> HAVE_CURRENT_DATA or HAVE_METADATA and uses that information to
> >>> decide whether to display a poster frame overlay or not.
> >>>
> >>>
> >>>>> | TimeRanges played
> >>>>>
> >>>>> Would this return the union or the intersection of the slaves'?
> >>>>
> >>>> That would probably be the union, because those parts of the
> >>>> timeline are what the user has viewed, so he/she would expect
> >>>> them to be marked in manually created controls.
> >>>
> >>> Ok, added.
> >>>
> >>>
> >>>>> | ended
> >>>>>
> >>>>> Since tracks can vary in length, this doesn't make much sense at
> >>>>> the media controller level. You can tell if you're at the end by
> >>>>> looking at currentTime and duration, but with infinite streams
> >>>>> and no buffering the underlying slaves might keep moving things
> >>>>> along (actively playing) with currentTime and duration both
> >>>>> equal to zero the whole time. So I'm not sure how to really
> >>>>> expose 'ended' on the media controller.
> >>>>
> >>>> "ended" on the individual elements (in the absence of loop)
> >>>> returns true when
> >>>>
> >>>> Either:
> >>>>   the current playback position is the end of the media resource,
> >>>>   and the direction of playback is forwards;
> >>>>
> >>>> Or:
> >>>>   the current playback position is the earliest possible
> >>>>   position, and the direction of playback is backwards.
> >>>
> >>> No, 'ended' only fires when going forwards.
> >>
> >>
> >> I quoted from the spec:
> >> http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#ended-playback
> >>
> >>
> >>>> So, by analogy, for the composed resource: it would combine the
> >>>> ended results of all individual elements, reporting "ended" only
> >>>> when all of them are in ended state.
> >>>
> >>> But what's the use case?
> >>
> >> If I reach the end, I want to present something different, such as
> >> a post-roll ad or an overlay with links to other related videos. It
> >> is much easier to wait for an onended event on the combined
> >> resource than having to register an event handler with each slave
> >> and then try to combine the results.
> >>
> >>
> >>>>> | and autoplay.
> >>>>>
> >>>>> How would this work? Autoplay doesn't really make sense as an
> >>>>> IDL attribute; it's the content attribute that matters. And we
> >>>>> already have that set up to work with media controllers.
> >>>>
> >>>> As with @loop, it would be possible to say that when one media
> >>>> element in the union has @autoplay set, then the combined
> >>>> resource is in autoplay state.
> >>>
> >>> I don't understand the use case for exposing this as an IDL
> >>> attribute on the controller.
> >>
> >> Same use case as for any other media element - and then to have it
> >> consistent, so that we can use the same code to deal with grouped
> >> media elements as with in-band multitrack elements. For @autoplay
> >> it even has an accessibility use case: it's possible for a UA or
> >> plugin to provide settings to stop autoplay on media elements with
> >> the @autoplay IDL attribute.
> >> It's not easily possible to stop hand-coded play() calls, which
> >> would be the only way for grouped multitrack media in the way in
> >> which it is currently specified.
> >>
> >>
> >>
> >>>> One more question turned up today: is there any means by which we
> >>>> could create @controls (with track menu and all) for the combined
> >>>> resource? Maybe they could be the same controls on all the
> >>>> elements that have @controls active, but would actually be driven
> >>>> by the controller's state rather than the element's? Maybe the
> >>>> first video element that has a @controls attribute would get the
> >>>> full controller's state represented in the controls? Could there
> >>>> be any way to make @controls work?
> >>>
> >>> The UA is responsible for this, but the spec requires that the UI
> >>> displayed for a control that has a controller control the
> >>> controller.
> >>
> >> That's good. Question of clarification: does that mean that for all
> >> elements in a group that display controls, these controls actually
> >> control the controller? Or do only the controls of the media
> >> element that created the controller control the controller?
> >> (Hope that makes sense to you ;-)
> >>
> >>
> >> Cheers,
> >> Silvia.
> >>
> >>
> >
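Silvia's post-roll use case from earlier in the thread shows why an
'ended' notification on the controller simplifies things: one handler
on the composite resource replaces per-slave bookkeeping. A sketch,
assuming an 'ended' event is among those now fired on MediaController
(showPostRollOverlay is a hypothetical application function):

    video.controller.addEventListener("ended", function () {
      // One callback for the composite resource, instead of one per
      // slave whose ended states must be combined by hand.
      showPostRollOverlay();
    }, false);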
Received on Wednesday, 20 April 2011 14:52:52 UTC