- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 20 Apr 2011 23:01:53 +0000 (UTC)
- To: HTML Accessibility Task Force <public-html-a11y@w3.org>
- Message-ID: <Pine.LNX.4.64.1104202008520.19153@ps20323.dreamhostps.com>
On Wed, 13 Apr 2011, Sean Hayes wrote:
> > Use <video> element for the commentary audio, with the appropriate
> > captions specified, and layer it on top of the main <video>.
>
> Interesting approach. It's a somewhat misnamed use of a video element to
> supply audio and captions but not video.

I could argue that captions are video, as others have done, but no, I
agree, it's misnamed. So's the rest of HTML, though. Even "HTML" is a
misnomer at this point. :-) So I wouldn't worry about that.

> Audio files don't work today in two of the major browsers; and it's
> also far from clear to me that the spec currently allows it, but if you
> say as editor it does, then I guess it could be made to work.

Nothing in the spec distinguishes <video> from <audio> except that
<video> has a visual rendering area.

> I think we will need to change the spec to indicate more clearly that
> the <video> element is supposed to work if there is no video data
> supplied.
>
> For example, change:
>   "A video element is used for playing videos or movies."
> to:
>   "A video element is used for playing videos or movies or audio".

That's non-normative text, but I've updated it and other parts of the
spec to clarify that both elements can be used for either audio or
video.

> And replacing:
>   "The video element is a media element whose media data is ostensibly
>   video data, possibly with associated audio data"
> with:
>   "The video element is a media element whose media data is ostensibly
>   video data, audio data or possibly video data with associated audio
>   data."

The word "ostensibly" was very carefully chosen there. It is
_ostensibly_ video data, that's why the element is called "video", after
all.

> I'm also not clear if the section on "Media elements", which is
> indicated to 'apply equally to video and audio', means that if I supply
> a video to an audio element, that is supposed to play the audio data
> from it?

Yes.

> Can I create a display rectangle with CSS for an audio element to
> display video data?

No.

> One interesting side effect of this approach is that it gives the
> ability to make free-standing timed text elements, since I can just do
> <video src=24hrsOfSilence.mp3> <track src=captions.cues ></video>.

Yes.

> Would be nicer to have a shorthand <cues> element for that, but I guess
> not much point trying to make HTML logical at this stage.

Indeed.
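(For concreteness, a sketch of that approach in markup. The file names
are invented, the mediagroup="" attribute is the declarative way to tie
the two timelines to one MediaController, and, as noted above,
audio-only resources in <video> don't yet work in all browsers:)

  <div style="position: relative">
    <video src="movie.webm" mediagroup="feature" controls></video>
    <!-- audio-only resource in a <video> so its captions have a
         rendering area, positioned over the main video -->
    <video src="commentary.ogg" mediagroup="feature"
           style="position: absolute; bottom: 0; left: 0; width: 100%">
      <track kind="captions" src="commentary.vtt" srclang="en" default>
    </video>
  </div>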
On Wed, 13 Apr 2011, Eric Carlson wrote:
> > For example, change:
> >   "A video element is used for playing videos or movies."
>
> What is the difference between a "video" and a "movie"?

A movie is generally the kind of thing you see in a cinema. A video is a
vaguer term that includes movies by definition but is more often used to
refer to content such as that seen on YouTube or Vimeo.

On Mon, 18 Apr 2011, Silvia Pfeiffer wrote:
>
> (1) videoTracks should be MultipleTrackList, too:
>
> The current HTMLMediaElement has the following IDL to expose in-band
> media tracks:
>   readonly attribute MultipleTrackList audioTracks;
>   readonly attribute ExclusiveTrackList videoTracks;
>
> The objection is to the use of ExclusiveTrackList on videoTracks. It
> should be allowed to have multiple in-band video tracks activated at
> the same time. In particular it seems that MP4 files have a means of
> specifying how multiple video tracks should be displayed on screen, and
> Safari is already able to display such files.
>
> In contrast, proposal 4 requires that only one in-band video track can
> be active and displayed in the video viewport at one time. If more than
> one video track is to be displayed, it needs to be specified with a
> media fragment URI in a separate video element and connected through a
> controller.
>
> Some questions here are: what do other browsers want to do with
> multiple in-band video tracks? Does it make sense to restrict the
> display to a single video track? Or should it be left to the browser
> what to do - in which case a MultipleTrackList approach to videoTracks
> would be sensible? If MultipleTrackList is sensible for audio and
> video, maybe it could further be harmonized with TextTrack.

On Mon, 18 Apr 2011, Philip Jägenstedt wrote:
>
> I think that videoTracks should be an ExclusiveTrackList, to avoid
> having to invent a layout system for displaying multiple video streams
> in a single <video>. Even if WebM supported positioning video tracks
> similar to what MPEG-4 presumably can, this layout will likely not be
> good enough in most cases anyway, as you can't do fancy borders, and
> likely couldn't do things like move the videos around by mouse dragging
> in JavaScript.

Indeed. It's also unclear to me what it would mean to enable multiple
in-band video tracks and then disable them in the opposite order, if the
last one enabled had a different geometry than the first one. What would
the rendering be? I don't mind the UA exposing this to the user, but
from the perspective of the script it seems like a rather complicated
problem.

On Mon, 18 Apr 2011, Silvia Pfeiffer wrote:
>
> (2) interface on TrackList:
>
> The current interface of TrackList is:
>   readonly attribute unsigned long length;
>   DOMString getName(in unsigned long index);
>   DOMString getLanguage(in unsigned long index);
>   attribute Function onchange;
>
> The proposal is that, in addition to exposing name and language
> attributes, it should - in analogy to TextTrack - also expose a label
> and a kind.
>
> The label is necessary to include the track in menus for track
> activation/deactivation.

Name and label are the same.

> The kind is necessary to classify the track correctly in menus, e.g.
> as sign language, audio description, or even a transparent caption
> track.

I'm fine with exposing kind; is there any documentation on what video
formats expose for this?
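(For the menu use case, a sketch using only the TrackList interface
quoted above; addMenuItem() is an invented helper standing in for
whatever builds the page's menu:)

  var video = document.querySelector('video');
  var tracks = video.audioTracks;
  for (var i = 0; i < tracks.length; i += 1) {
    // getName() doubles as the label; getLanguage() classifies the entry
    addMenuItem(tracks.getName(i) || 'Track ' + (i + 1),
                tracks.getLanguage(i));
  }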
On Mon, 18 Apr 2011, Silvia Pfeiffer wrote:
>
> (3) looping should be possible on combined multitrack:
>
> In proposal 4 the loop attribute on individual media elements is
> disabled on multitrack created through a controller, because it is not
> clear what looping means for the individual element.
>
> However, looping on a multitrack resource with in-band tracks is well
> defined and goes over the complete resource.

It's not especially well-defined, since there's no concept of "ending"
with the controller, given how streaming is handled. But more
importantly, what are the use cases?

The use case for looping a single track is things like this:

   http://www.google.com/green/

...but I don't see why you would use a MediaController to do that kind
of thing. It's not like you'd want the multiple videos there in sync,
they're just background.

I'm also skeptical of introducing loop at the MediaController level even
in the simple case of finite resources, because it's not clear how to
make it work with looping subresources. Say you had two resources, both
set to loop, one of which was 5s and one 3s, and that you then further
said that the whole thing should loop. What should happen? We don't want
to define MediaController looping in a way that precludes that from
being possible, IMHO, at least not unless we have a strong use case.

> In analogy, it makes sense to interpret loop on a combined multitrack
> resource in the same way. Thus, the controller should also have a loop
> attribute which is activated when a single loop attribute on a slave
> media element is activated, and the effect should be to loop over the
> combined resource, i.e. when the duration of the controller is reached,
> all slave media elements' currentTime-s are reset to
> initialPlaybackPosition.

Why would an attribute on any one of the <video>s affect the
MediaController as a whole? Why would they jump back to
initialPlaybackTime? I don't think this makes sense.

> (4) autoplay should be possible on combined multitrack:
>
> Similar to looping, autoplay could also be defined on a combined
> multitrack resource as the union of all the autoplay settings of all
> the slaves: if one of them is on autoplay, the whole combined resource
> is.

Actually, currently autoplay is the only behaviour; MediaControllers
start off playing and just wait for any autoplaying resources to be
ready. If none of the resources are autoplaying, the controller just
advances without anything playing. This is probably suboptimal. I guess
we could say that if none of the resources have autoplay enabled it
doesn't play, but how would you handle dynamic changes to the set of
slaved media elements?

> (5) more events should be available for combined multitrack:
>
> The following events should be available in the controller:
>
> * onloadedmetadata: is raised when all slave media elements have
>   reached at minimum a readyState of HAVE_METADATA
>
> * onloadeddata: is raised when all slave media elements have reached
>   at minimum a readyState of HAVE_CURRENT_DATA
>
> * canplaythrough: is raised when all slave media elements have reached
>   at minimum a readyState of HAVE_FUTURE_DATA

These are supported now.

> * onended: is raised when all slave media elements are in ended state

This isn't supported; a media controller can't be "ended" currently.

> (6) controls on slaves control the combined multitrack:
>
> Proposal 4 does not provide any information on what happens with media
> elements when the @controls attribute is specified.

The user interface section covers this already.

On Mon, 18 Apr 2011, Philip Jägenstedt wrote:
>
> Apart from this, I think that TrackList really should be a list of
> Track objects or similar, so that getName/getLanguage are pushed onto
> that object as .name and .language, just like on TextTrack.

This would balloon the number of objects in the platform. Unless there's
a very good reason to have objects for each of these tracks, we should
avoid doing that.

> I have no strong opinion, but we should have consistency such that
> changing the paused attribute (e.g. by calling play()) has the exact
> same effect. It's not clear to me what the spec thinks should happen
> when play() is called on a media element with a controller.

Could you elaborate on what is unclear?

On Mon, 18 Apr 2011, Silvia Pfeiffer wrote:
>
> Onended is important to do something once the video or audio resource
> is finished playing, such as display related videos, or display a
> post-roll ad.

That seems like something you'd want to do on a per-<video> basis, not
for the whole MediaController, surely.
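(The per-element version of that, as a sketch; the element id and the
showRelatedVideos() helper are invented stand-ins for page-specific
code:)

  document.getElementById('main').addEventListener('ended',
      function (event) {
        // fires as soon as this element ends, regardless of what the
        // rest of the group is doing
        showRelatedVideos(event.target);
      }, false);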
On Tue, 19 Apr 2011, Silvia Pfeiffer wrote:
>
> As it actually turns out: getId() is required to discover the uniquely
> identifying name of a track through which we can create a track media
> fragment URI.
>
> The issue here is that sometimes a Web page author does not actually
> know what tracks are available in-band in a loaded multitrack media
> resource. Thus, they need to use script to discover the tracks and
> their functionality. For example, when they discover a sign language
> track, they would want to create a slave video element with the media
> fragment URI to that sign language track. The unique identifier of that
> track is given through an ID and therefore needs to be discoverable.

I've added getID().
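(A sketch of that discovery flow. Matching on the name is a stopgap
since "kind" isn't exposed yet, the '#track=' syntax is the Media
Fragments URI draft's track dimension, which only some formats support,
and the element names are invented:)

  var main = document.getElementById('main');
  var tracks = main.videoTracks;
  for (var i = 0; i < tracks.length; i += 1) {
    if (/sign/i.test(tracks.getName(i))) {
      var sign = document.createElement('video');
      // address just that track via a track media fragment URI
      sign.src = main.currentSrc + '#track=' + tracks.getID(i);
      // slave it to the same controller as the main video
      sign.setAttribute('mediagroup', main.getAttribute('mediagroup'));
      main.parentNode.appendChild(sign);
      break;
    }
  }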
> If you add through script an already playing media element to a media
> controller that is not yet playing, which one wins? Will the new
> combined resource be playing or will it be paused?

This is already defined.

> If the answer is paused, then the same could apply to autoplay: the
> element that creates the controller defines the autoplay state of the
> controller - any added element cannot override that, and their autoplay
> attributes are ignored.

Note that you can create a MediaController without an element. The
mechanism is designed so that none of the slaves are in any way special
compared to each other.

On Tue, 19 Apr 2011, Sean Hayes wrote:
>
> It occurs to me that what the author really needs is a
> 'getFragmentUrl()' function which returns a media fragment URL that
> addresses that track. Although getId() apparently provides enough
> information to construct one, it would seem more robust, and possibly
> more secure, if the UA provided this functionality directly.

That's not generally possible since not all formats support the "Media
Fragment URI" fragment identifier URL syntax. Only formats that are
defined to support it can use it.

I'm not sure I understand the robustness or security argument. Can you
elaborate on the algorithm you envisage this method using?

>   media2.controller = media1.controller;
>
> So if media1 is playing, then adding its controller to media2 will
> cause media2 to start following its timeline and thus play (unless
> apparently if media2 is paused, which then blocks the controller;
> although I don't find that very intuitive).

Being paused doesn't block the controller.

> Conversely if media1 is not playing then its timeline will not be
> advancing, and so neither will media2's.

Right.

> Thus similarly after the assignment, 'autoplay' and 'loop' on media1
> will have controlled indirectly the behavior of media2, regardless of
> what those attributes were on media2.

"loop" will have no effect when a media controller is present,
currently. "autoplay" will delay the media controller while the resource
is loading.

> I personally think that the controller mechanism has the possibility
> to simplify the model, if we were to restructure the chapter so that,
> rather than a bolt-on afterthought, a controller is always created -
> even for singleton media groups - and define that the functionality
> currently defined on a media element is actually a pass-through to its
> controller. All the media functionality can then be defined in terms
> of controllers, there would then be no need for an explicit
> constructor for a controller, and to slave two elements together the
> code above is all that would be needed.
>
> For example, consider:
>   media3.play();  // media 3 playing
>   media4.pause(); // media 4 not playing
>   media4.controller = media3.controller; // media 3 and 4 now playing
>   media4.pause(); // media 3 and media 4 now paused

I strongly disagree that we should have methods on an element that
affect another element in the same way as the equivalent method on the
other element. That's terrible API design. An object should affect
itself, and may have side-effects on another object, but simply having
multiple redundant APIs that all do exactly the same thing is quite
confusing.

> In the controller model, line 4 above would be a pass-through to its
> controller, which now happens to be the controller created for media3;
> and so the group as a whole stops playing. That seems simple and
> intuitive to me.

It seems quite the opposite to me!

> That would however require something of a re-write of the media
> chapter, which may not be feasible before LC.

That's a non-issue. What matters is what is implemented. The only reason
we're rushing here is because this was escalated to an issue; if we
simply retract all our proposals, the chairs will declare the issue
closed without prejudice and we'll be able to do things at whatever pace
we need. Even if they don't, if the implementors all want to do
something different from what the chairs decide, what they decide
doesn't much matter.

On Tue, 19 Apr 2011, Sean Hayes wrote:
>
> Actually if you need to implement both a controller and media playback
> then it's fairly likely you'd use one codebase for the common parts of
> both; this would just pass that design on into the JS API, so it
> actually removes a layer of complexity.
>
> There would be only controller state, whether it's controlling one or
> multiple elements. The combined state is the state of the controller
> and only the state of the controller, there is nothing else; you don't
> need to be concerned about the states of the individual elements
> because they wouldn't have individual states (except for their network
> state, which I also suggested last week should be moved out into a
> separate and sharable object to handle tracks coming from the same
> network resource), thus there is less to reconcile and it's simpler
> all round.

I agree that if we were designing this from scratch again, it would make
sense to split out the network side and the playback side. Unfortunately
with the Web we only get to redesign things until they are implemented,
and this stuff has been implemented and shipped multiple times now.
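(The current model in miniature, using the assignment form from Sean's
example above; the element names are illustrative, and this assumes a
MediaController can equally be created standalone, as noted earlier:)

  var controller = new MediaController();
  media1.controller = controller;
  media2.controller = controller;
  controller.play(); // plays the group as a whole
  media2.pause();    // pauses only media2; the controller's timeline
                     // keeps advancing, as does media1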
On Tue, 19 Apr 2011, Philip Jägenstedt wrote:
>
> If an aggregate readyState is added, we must make sure that we don't
> add yet more race conditions to the spec. Specifically, when an
> aggregate loadedmetadata event is fired, it should be guaranteed that
> all of the slave elements have already fired loadedmetadata

That's what's specced.

> and are still in HAVE_METADATA, not allowing their readyState to
> reflect the real state of the decoding that is going on asynchronously.
> Otherwise, the readyState of the slaves will not only be unpredictable,
> it might not even be the same from slave to slave.

I'll make sure to keep the MediaController side of this consistent with
whatever we decide for readyState on elements as a result of the bugs
you mentioned.

On Mon, 18 Apr 2011, Sean Hayes wrote:
>
> If the clarifications I proposed last week are included (i.e. to
> indicate expressly that the <video> element operates on audio-only
> data, and has a display rectangle to render captions into, even if
> there is no video data supplied), then I would formally withdraw my CP
> in favor of Proposal 4 as amended.

I've made the change. Let me know if there's anything else you think
needs clarifying on this front.

On Wed, 20 Apr 2011, Silvia Pfeiffer wrote:
> > Looking at the metadata list cited above, I don't see anything in
> > either ogg, mp4, or webm that maps to "kind", so I don't see much
> > point exposing that on the audio/video track lists, though I agree
> > that in principle it would be a good idea.
>
> Also, they actually have a "role" attribute on the "fragment" which
> they suggest using for identifying the "kind" of a "track":
> http://www.w3.org/TR/mediaont-10/#example4 . So, it is indeed there.
>
> Hmm, given this, I should probably change what is written for "OGG"
> under "fragment", because certainly Ogg has fields that provide the
> kind of a track. WebM and MP4 have them, too.

What I'm concerned about is what the fields are in the video, not what
the RDF ontology exposes. I see nothing in the tables about what media
elements expose that references the kind of track. Is there
documentation about how these formats expose this track classification?

> > Realistically though, for in-band tracks it's more likely that that
> > data will be provided to the script out-of-band so that it can
> > construct the UI before the movie loads, and for out-of-band tracks
> > the information can be made available in the markup (e.g. using
> > data-* attributes). For UA-driven menus, the title is probably
> > sufficient for most purposes, and that can already be made available.
>
> The biggest issue with this approach is discoverability. A Web
> developer who has to deal with multiple resources, and who doesn't
> know a priori what kinds of tracks they have available in-band, has no
> chance of finding this out through script if there is no interface
> that exposes this information. It would need to be done server-side.

I don't see much problem with doing it server-side, but I have no
problem exposing it to scripts too if we can find documentation I can
use to work out how to expose it.
> > > > Note that for media-element-level data, you can already use
> > > > data-* attributes to get anything you want, so the out-of-band
> > > > case is already fully handled as far as I can tell.
> > >
> > > Interesting. In the audio description case, would a label, kind,
> > > and language be added to the menu of the related video element?
> >
> > For scripted UIs, that's up to the script.
> >
> > For UA UIs, it depends if we are talking about multiple video tracks
> > or multiple audio tracks. Multiple video tracks aren't handled,
> > because there's no sane way to have the UA turn the video tracks on
> > and off. For the audio case, I don't really see much reason to
> > expose more than a title. A kind could be used, but it's going to be
> > used so rarely that in practice the UA will want to handle the case
> > of only having a title anyway, and once you support that, it's not
> > clear what a kind would really do to make things better.
> >
> > It's something we can always provide in the future though, if it
> > turns out to be more common than one would guess from looking at
> > content today.
>
> I think it will be a problem with the first implementation of this,
> since we would want to add the information to the menu for audio
> tracks just like for text tracks, and the text tracks have this
> information (kind, label, language).

For text tracks, there is a use for the information: metadata, text
audio descriptions, and captions have very different handling. The
distinction between "caption" and "subtitle" is not one that we need; it
could have been done entirely in the label -- I only exposed it because
we already had to have a kind for other reasons. So I don't think there
is a parallel here.

> > > We discussed the looping behaviour. To make it symmetrical with
> > > in-band multitrack resources, it would make sense to be able to
> > > loop over composed multitrack resources, too. The expected looping
> > > behaviour is that a loop on the composed resource loops over the
> > > composite as a whole. So, the question is then how to turn such
> > > looping on.
> > >
> > > The proposal is that when one media element in the group has a
> > > @loop attribute, that would turn the looping on the composite
> > > resource on. This means that when the loop is set and the end of
> > > the composite resource is reached (its duration), the currentTime
> > > would be reset to its beginning and playback of the composite
> > > resource would start again. Looping on individual elements is
> > > turned off and only the composite resource can loop.
> >
> > What's the use case?
>
> The same as for the loop attributes on an audio or video element.

I don't see how MediaController applies to those use cases. For example,
for audio, a use case is background music in a game. No controller. For
video, a use case is the effect seen on the aforementioned Google page.
Again, no controller.

> E.g. if I have a plugin that likes to turn all media elements to
> looping for whatever reason (entertain the kids? ;-)

For what reason? The reason is the question. Without a concrete reason,
there's no use case. I don't really see how a MediaController looping
will entertain kids.

> > > > > some attributes of HTMLMediaElement are missing in the
> > > > > MediaController that might make sense to collect state from
> > > > > the slaves: error,
> > > >
> > > > Errors only occur as part of loading, which is a
> > > > per-media-element issue, so I don't really know what it would
> > > > mean for the controller to have it.
> > >
> > > The MediaController is generally regarded as the state keeper for
> > > the composite resource.
> >
> > It is? That's certainly not how it's defined. It's just a central
> > controller, it doesn't keep any of the state for the resources.
>
> Not for the individual ones, but for the combined construct. E.g. you
> can ask it what the currentTime of the combined construct is, etc.

You can now, yeah. It does now have state, though I personally prefer
the original stateless design. :-)

> > > So, what happens when a single slave goes into error state? Does
> > > the full composite resource go into error state? Or does it ignore
> > > the slave - turn it off, and continue?
> >
> > Media elements don't really have an error state. They have a
> > networkState and a readyState, which affect the MediaController, but
> > the 'error' attribute is just for exposing the last error for
> > events; it's not part of the state machine.
>
> That still doesn't answer the question: what happens if one of the
> slaves happens to have a network error and cannot continue playing,
> because it runs out of data? Does the combined resource stall? Forever?

Yes.

> Is there a way for script to identify this and remove the stalling
> slave from the group?

It can look at the states of all the slaves, sure.
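(A sketch of doing that by hand. The script has to keep its own list of
slaves - the controller doesn't expose one - and unhooking an element by
clearing .controller is an assumption, not something the spec spells
out:)

  [media1, media2, media3].forEach(function (el) {
    // a slave that is starving for data while the network is still
    // nominally loading is a candidate for removal
    if (el.readyState < el.HAVE_FUTURE_DATA &&
        el.networkState === el.NETWORK_LOADING) {
      el.controller = null; // let the rest of the group continue
    }
  });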
> Maybe we need an onerror event on the MediaController, which will be
> raised if one of the slaves has an error fetching the media data.

Stalling isn't an error, so that wouldn't help with this example.

> Then the script developer can go through the list of slaves in the one
> callback and remove the contender.

That's not a likely scenario, IMHO. What track is unimportant enough
that the user will be fine with just playing without it? More likely
they'll reload the page and try again.

> > > "ended" on the individual elements (in the absence of loop)
> > > returns true when
> > >
> > > Either:
> > >
> > >   The current playback position is the end of the media resource,
> > >   and the direction of playback is forwards.
> > >
> > > Or:
> > >
> > >   The current playback position is the earliest possible position,
> > >   and the direction of playback is backwards.
> >
> > No, 'ended' only fires when going forwards.
>
> I quoted from the spec:
> http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#ended-playback

Sorry, I thought we were talking about the event, not the attribute.
Given the way the MediaController is defined, you can just compare the
current playback position with its duration to determine if it's at the
end of the media resources.

> > > So, in analogy, for the composed resource: it would return the
> > > union of the ended result on all individual elements, namely
> > > "ended" only when all of them are in ended state.
> >
> > But what's the use case?
>
> If I reach the end, I want to present something different, such as a
> post-roll ad or an overlay with links to other videos that are
> related. It is much easier to wait on an onended event on the combined
> resource than having to register an event handler with each slave and
> then try and combine the results.

I'm confused. Are we talking about the event or the attribute? You seem
to be arguing for the attribute but giving use cases for the event.

If you want to display an overlay at the end of a video, it seems like
you'd want to do that as soon as the video ended; you wouldn't want to
wait until the end of the entire resource, no? So you'd want onended on
the <video> element, not the controller.
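(The comparison mentioned above, as a sketch. It assumes the controller
fires 'timeupdate' as its position advances, per the current draft, and
showPostRoll() is an invented page-specific hook:)

  controller.addEventListener('timeupdate', function () {
    // there's no "ended" state on a MediaController, but its position
    // carries the same information
    if (controller.currentTime >= controller.duration)
      showPostRoll();
  }, false);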
> > > > > | and autoplay.
> > > >
> > > > How would this work? Autoplay doesn't really make sense as an
> > > > IDL attribute; it's the content attribute that matters. And we
> > > > already have that set up to work with media controllers.
> > >
> > > As with @loop, it would be possible to say that when one media
> > > element in the union has @autoplay set, then the combined resource
> > > is in autoplay state.
> >
> > I don't understand the use case for exposing this as an IDL
> > attribute on the controller.
>
> Same use case as for any other media element - and then to have it
> consistent, so that we can use the same code to deal with grouped
> media elements as with in-band multitrack elements.

But you wouldn't use IDL to deal with autoplay in the in-band case, so
it wouldn't be consistent at all.

> For @autoplay it even has an accessibility use case: it's possible for
> a UA or plugin to provide settings to stop autoplay on media elements
> with the @autoplay IDL attribute.

Now I'm really confused. Are you talking about the IDL attribute or the
content attribute? The @foo syntax usually implies the content
attribute, and the UA autoplay-stopping behaviour is typically going to
be based on the content attribute, but you say the IDL attribute.

> It's not easily possible to stop hand-coded play() calls, which would
> be the only way for grouped multitrack media in the way in which it is
> currently specified.

As currently specified, grouped tracks always autoplay any tracks that
are labeled autoplay, and those that are not simply don't play (while
the controller's position advances regardless).

> > > One more question turned up today: is there any means in which we
> > > could possibly create @controls (with track menu and all) for the
> > > combined resource? Maybe they could be the same controls on all
> > > the elements that have a @controls active, but would actually be
> > > driven by the controller's state rather than the element's? Maybe
> > > the first video element that has a @controls attribute would get
> > > the full controller's state represented in the controls? Could
> > > there be any way to make @controls work?
> >
> > The UA is responsible for this, but the spec requires that the UI
> > displayed for a control that has a controller control the
> > controller.
>
> That's good. Question of clarification: does that mean that for all
> elements in a group that display controls, these controls actually
> control the controller?

That's up to the UA.

> Or do only the controls of the media element that created the
> controller control the controller? (hope that makes sense to you ;-)

No media element is special.

On Wed, 20 Apr 2011, Silvia Pfeiffer wrote:
>
> I have thus far come up with the following:
>
> video:
>  * sign language video (in different sign languages)
>  * captions (as in: burnt-in video that may just be overlays)
>  * different camera angle
>
> audio:
>  * audio descriptions
>  * language dub

We should derive these from the kinds that are exposed in media formats;
it doesn't make sense for us to come up with them.

On Wed, 20 Apr 2011, Philip Jägenstedt wrote:
> > I've updated the spec to fire a number of events on MediaController,
> > including 'metadataavailable' and 'playing'/'waiting'.
>
> This seems to make the race conditions in HTMLMediaElement.readyState
> slightly worse than they already were. At the point when the
> loadedmetadata event is fired on the MediaController, the readyState
> of the slaved media elements could be just about anything, and might
> not even be the same on all of them. These open bugs are about race
> conditions with readyState and networkState:
>
>   http://www.w3.org/Bugs/Public/show_bug.cgi?id=11981
>   http://www.w3.org/Bugs/Public/show_bug.cgi?id=12175
>   http://www.w3.org/Bugs/Public/show_bug.cgi?id=12267
>
> These open bugs are about other readyState issues:
>
>   http://www.w3.org/Bugs/Public/show_bug.cgi?id=12195
>   http://www.w3.org/Bugs/Public/show_bug.cgi?id=12529
>
> I'd very much like to see at least the overall issue of race
> conditions resolved. For these aggregate events on MediaController, we
> would have to make sure that they are fired in the same task as the
> last media element changes its readyState and fires the corresponding
> event.

I'll make sure to keep the MediaController spec consistent with whatever
happens in those bugs.
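(Until those are resolved, a script that cares can do the aggregation
itself rather than trusting the slaves' readyState at the moment the
controller-level event fires - a sketch, with allMetadataLoaded() as an
invented hook:)

  var slaves = [media1, media2]; // the script's own element list
  var pending = slaves.length;
  slaves.forEach(function (el) {
    el.addEventListener('loadedmetadata', function () {
      // runs once every slave has fired its own loadedmetadata
      if (--pending === 0)
        allMetadataLoaded();
    }, false);
  });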
> > The use case Silvia suggests seems reasonable (marking on the
> > timeline what has been played), why is it not good?
>
> I've boycotted HTMLMediaElement.played by not implementing it, and so
> far I've never heard a single request for it. I've also never seen
> controls that expose what has already been played, only what is
> currently buffered. I know this has been discussed before, but I can't
> find it in the archives. Then the use case was something like showing
> or not showing ads depending on what had been watched, I think. IMO,
> in the absence of compelling use cases it should be removed from both
> HTMLMediaElement and MediaController. One thing speaking against that
> is that it's already implemented in WebKit, of course.

I don't feel particularly strongly about this one way or the other. It
seems useful to me (to highlight the played parts on a timeline, to keep
track of which segments should have ads shown in them and which should
not because the user is just going back to a previously watched part
again), and if scripts can do it then presumably it's just as easy for
UAs to do it; but also, if scripts can do it then it's not really needed
in the API. I'm happy to follow implementations on this.
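(The timeline-highlighting version of that, as a sketch; paintRange()
stands in for whatever draws the page's timeline:)

  var video = document.querySelector('video');
  var ranges = video.played; // a TimeRanges object
  for (var i = 0; i < ranges.length; i += 1) {
    // normalise each watched segment to a 0..1 fraction of the timeline
    paintRange(ranges.start(i) / video.duration,
               ranges.end(i) / video.duration);
  }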
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 20 April 2011 23:02:19 UTC