- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Sat, 6 Mar 2010 11:54:15 +1100
- To: Dick Bulterman <Dick.Bulterman@cwi.nl>
- Cc: Eric Carlson <eric.carlson@apple.com>, Michael Smith <mike@w3.org>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Hi Dick,

Thanks for checking back with the SMIL specification on this proposal. It is good to do this now for two reasons: firstly to check if SMIL already has a construct that satisfies the needs, and secondly to see if SMIL has some functionality that we have missed for the track proposal. So, let me give this a thorough analysis.

On Fri, Mar 5, 2010 at 11:10 PM, Dick Bulterman <Dick.Bulterman@cwi.nl> wrote:
> On the track proposal, just to make sure I'm not missing something:
> Is there an implied preference order in the statements:
>>
>> <trackgroup media="accessibility(captions:yes") >
>>>
>>> <track src="en.srt" lang="en" enabled >
>>> <track src="fr.srt" lang="fr" >
>>> <track src="de.srt" lang="de" >
>>> </trackgroup>
>
> (In other words, the implied preference order is English, French, German.)
>
> Compare this to the SMIL way of doing the same thing:
> <par>
>   <video src="..." />
>   <switch systemCaptions="on" allowReorder="yes">
>     <textstream src="en.xxx" systemLanguage="en" />
>     <textstream src="fr.xxx" systemLanguage="de" />
>     <textstream src="de.xxx" systemLanguage="de" />
>   </switch>
> </par>
> The default behavior is that the first candidate matching a set language
> preference is used. The 'allowReorder' attribute explicitly allows a user
> agent the reorder the order of options if the user (via the UA) has
> determined that he/she prefers German over French.

Yes, the DOM in HTML has indeed a given order and that is tree order. This is important for allowing scripting languages and things such as xpath to be able to work on a document in a predictable way. Reordering in HTML is not done using an attribute. If there is a need for reordering, one uses JavaScript. Thus, the @allowReorder attribute is not required for HTML.

> Note also that in this example, the entire <switch> is only evaluated if the
> user (agent) has determined that captions are required.
In the given proposal, the track elements will always be in the DOM and will always be parsed and evaluated by the UA. However, whether the external resource is loaded or not is described by the resource selection algorithm (http://www.w3.org/WAI/PF/HTML/wiki/Media_TextAssociations#Resource_selection_algorithm). I do not think this is a fundamental difference, though - if we were to adopt the SMIL markup (and I am not suggesting we should - I am just hypothetically analysing this situation), it would need to work this way because that's how HTML works.

> Note finally that if
> the user has the language preference Dutch, no captions will play (since he
> presumably can't understand them anyway). Having a final statement in a
> <switch> without a predicate determines a result that will allows play if no
> earlier option (reodered or not) do not provide a preference match.

In our case, this is something that is up to the Web author and the UA's decisions for presentation. If the Web author decides to put a default "enable" on one of the tracks, then that track will be presented to anyone unless their browser preference settings indicate that they want the German or the French track. Also, there will be a menu through which users can activate other tracks. If the Web author "enables" none of the tracks, no subtitles will be shown unless you have a browser preference setting that indicates that you always want subtitles on when they are available in your language.

Those browser preference settings, incidentally, are just recommendations for the browser vendors on how to implement support for these elements. What browsers actually do is not something that is defined in HTML.

> Are the semantics of <trackgroup> similar? (If so, why invent something new;
> if not, are at least the SMIL semantics supported?)

This is a much more general question than the specific questions above. So, let me reply to it in detail.
If we were to introduce the SMIL approach, we would require the introduction of the following elements:
* par
* switch
* textstream
instead of introducing:
* track
* trackgroup

Let's start with the "easy" comparison: track vs textstream.

We had lengthy discussions about whether we should add an element that explicitly only links to external text streams or is able to also be applied to other types of content, such as external audio or video tracks. The consensus was that making it a general element would be a lot more appropriate. The @type attribute would in any case hint at what kind of resource is being linked to, so it doesn't need to be explicit in the element name. Thus, track is a lot more generic than textstream and it's not appropriate to adopt textstream for this use case.

Now the more complicated comparison: trackgroup vs switch.

The switch element is indeed used for a similar purpose as the trackgroup element here: allowing only one out of the list of elements inside it to be active. The switch element has an @id and a @title attribute. The trackgroup element has several more attributes that signify its purpose of grouping tracks that have something in common. It thus inherits most of the attributes from the track element. To give switch the same purpose, it would need to be extended beyond what the SMIL specification defines. Further: with the chosen name we reused a name that is already being used in MPEG for signifying alternate tracks. I don't think renaming trackgroup to switch will gain us much, but it would of course be an option.

Finally the extra element: par.

par is required in SMIL because SMIL is good for compositing media resources together. Thus, the video element and the textstream elements are composited together as parallel resources. seq and switch contribute to that functionality, too, and are really important for something as flexible as SMIL. In HTML5, a media element is not regarded as a composited resource.
There is a main resource and it is the important bit - everything else is just additional information on top of that. Or speaking concretely in our example: the external tracks make no sense without the video element. Therefore, there is a dependency between the track elements and the video element, which is expressed by having the video element be the parent element. This makes total sense in HTML5, but no sense in SMIL at all.

This is the reason why par is not necessary in HTML5: there are no parallel resources that enjoy equal rights. There is a main resource and it dominates all other linked resources, all external text associations and all other associated media. Its duration defines the timeline, defines the duration of the media element, defines the playback position, defines events etc. It's the master. There is no need for "par" in HTML, since the media element in itself is already the time master that "par" would be.

So, in summary as a reply to your question: No, the semantics of {par, switch, textstream} are not identical to the semantics of {track, trackgroup}. A renaming of trackgroup to switch would be possible, but it would have different attributes than the SMIL switch element and thus would not be semantically equal either. There are reasons for the design decisions for {track, trackgroup} and why {par, switch, textstream} didn't satisfy the requirements.

Best Regards,
Silvia.
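
The track selection behaviour described earlier in this message (an author-set default that a browser language preference can override, plus an "always show subtitles" setting for the case where nothing is enabled) can be modelled in a few lines. This is only a sketch in Python, not browser code. The Track class, the select_track function, and the preferred_langs and always_show parameters are all hypothetical names, and the way a preference overrides the author's default is one reading of the text above, since what browsers actually do is not defined in HTML.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Track:
    """Hypothetical stand-in for one <track> element inside a <trackgroup>."""
    src: str
    lang: str
    enabled: bool = False  # models the author's "enabled" attribute


def select_track(
    tracks: Sequence[Track],
    preferred_langs: Sequence[str] = (),
    always_show: bool = False,
) -> Optional[Track]:
    """Pick at most one track from a group (assumed behaviour, see lead-in)."""
    # The author's default: the first track in tree order marked as enabled.
    default = next((t for t in tracks if t.enabled), None)

    # The first track matching the user's languages, in the user's order.
    match = None
    for lang in preferred_langs:
        match = next((t for t in tracks if t.lang == lang), None)
        if match is not None:
            break

    if default is not None:
        # A matching language preference overrides the author's default.
        return match if match is not None else default
    # Nothing enabled by the author: show a track only if the user asked
    # for subtitles whenever they are available in their language.
    return match if always_show else None
```

With the en/fr/de tracks from the example and "en" enabled, a user with no preference or with a Dutch preference gets en.srt, and a user preferring German gets de.srt. With no track enabled, a French-preferring user gets fr.srt only when always_show is set.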
Received on Saturday, 6 March 2010 00:55:09 UTC