- From: Bob Lund <B.Lund@CableLabs.com>
- Date: Mon, 13 May 2013 20:06:33 +0000
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, Glenn Adams <glenn@skynav.com>
- CC: Simon Pieters <simonp@opera.com>, public-html <public-html@w3.org>, "Jerry Smith, (WINDOWS)" <jdsmith@microsoft.com>, "Mark Vickers @ Comcast" <mark_vickers@cable.comcast.com>
On 5/12/13 8:47 PM, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com> wrote:

>On Mon, May 13, 2013 at 5:00 AM, Glenn Adams <glenn@skynav.com> wrote:
>>
>> First, I'm talking about the Media Type of a text track resource here,
>> not a specific @kind (usage) of a text track resource. For example,
>> "text/vtt", "application/ttml+xml", "application/x-mpeg2-psi" [I just
>> made that up], etc.
>
>OK, this is taking the discussion into a completely different and
>unrelated direction, because we were discussing TextTrackCue and not
>TextTrack types. Also, the changes you are proposing below are not
>possible because <track> is an empty element and we are not going to
>break backwards compatibility on the markup. But I'll entertain the
>discussion of the use cases that they imply rather than the particular
>specification proposal.
>
>I'm still curious about the one question I had before: are you or
>anyone else aware of any implementations of the
>inBandMetadataTrackDispatchType attribute? Since it's not even used in
>http://www.cablelabs.com/specifications/CL-SP-HTML5-MAP-I02-120510.pdf
>but instead @label is used, I don't know if it's satisfying its use
>case.

This document was written before the definition of the dispatch type
attribute. The spec will be rev'd to make use of this new attribute.
>
>
>> Now, let me try to be more concrete regarding uses:
>>
>> Ideally, <track> would use <source> in the same fashion as <video> and
>> <audio>, in order to allow use of the resource selection algorithm for
>> alternate track resources:
>>
>> <video src="video.mp2t">
>>   <!-- in- or out-of-band captions, three alternative sources -->
>>   <track kind="captions">
>>     <!-- out-of-band VTT -->
>>     <source src="video.vtt" type="text/vtt">
>>     <!-- out-of-band TTML -->
>>     <source src="video.ttml" type="application/ttml+xml">
>>     <!-- in-band 708 -->
>>     <source src="video.mp2t" type="application/x-cea-708">
>>   </track>
>>   <!-- in-band MPEG-2 PSI, only one source -->
>>   <track kind="metadata" src="video.mp2t" type="application/x-mpeg2-psi" />
>>   <!-- out-of-band custom metadata, two alternative sources -->
>>   <track kind="metadata">
>>     <!-- out-of-band custom metadata, type 1 -->
>>     <source src="video.md1" type="application/x-metadata-1">
>>     <!-- out-of-band custom metadata, type 2, in case type 1 not supported -->
>>     <source src="video.md2" type="application/x-metadata-2">
>>   </track>
>> </video>
>
>This is overly complicated and not necessary to do in markup, because
>you get all of this in the JavaScript TextTrack API. Plus you do not
>have to deal with special cases like the inband text tracks - markup
>is the wrong approach for this. As for the suggestion of doing
><source> inside <track> - that is not necessary, because all supported
>track formats are exposed in a track list to JS or even the user -
>this is contrary to <video> where only a single @src is always active.
>You can, however, achieve all the use cases that you are trying to
>emulate in the complex markup above in JS right now.
>
>Here's how it's done with the current spec:
>
>Markup:
><video src="video.mp2t">
>  <track kind="captions" src="video.vtt">
>  <track kind="captions" src="video.ttml">
>  <track kind="metadata" src="video.md1">
>  <track kind="metadata" src="video.md2">
></video>
>
>JavaScript:
>
>Assuming the browser can parse the following file formats:
>* mp2t video file
>* VTT file
>* mp2t cea-708 inband track
>* mp2t mpeg2-psi inband track
>* md2 metadata file
>But is unable to parse:
>* TTML file
>* md1 metadata file
>
>The following objects are available in JavaScript:
>* for the WebVTT track:
>TextTrack(kind="captions", cues=TextTrackCueList,...)
>(the TextTrackCues in the TextTrackCueList are of type WebVTTCue)
>
>* for the TTML track (because there is no support for the format):
>TextTrack(kind="captions", cues=null,...)
>
>* for the mp2t cea-708 inband track:
>TextTrack(kind="captions", cues=TextTrackCueList,...)
>(the TextTrackCues in the TextTrackCueList are of type CEA708Cue)
>
>* for the mp2t mpeg2-psi inband track:
>TextTrack(kind="metadata", cues=TextTrackCueList,...)
>per spec with an inBandMetadataTrackDispatchType containing the
>stream_type and the descriptor bytes
>likely accompanied with a label="program description" or something
>similar that explains to the user what they will get when they choose
>this track
>(the TextTrackCues in the TextTrackCueList are generic so just
>TextTrackCue objects, but could also be more specific PSICue if the
>browser supports such)
>
>* for the md1 track (because there is no support for the format):
>TextTrack(kind="metadata", cues=null,...)
>
>* for the md2 track:
>TextTrack(kind="metadata", cues=TextTrackCueList,...)
>(the TextTrackCues in the TextTrackCueList are generic so just
>TextTrackCue objects)
>
>As a JS developer, you can now decide which of the tracks to expose to
>the user and could just loop through the video.textTracks list and
>remove those tracks that you don't want them to see. E.g. you can
>remove all those that have no cues, which still provides the users
>with a choice as to whether to see the 708 captions or the WebVTT
>captions. You would also parse the cues in the metadata according to
>what you know them to be.
>
>So, this is how it currently goes. In all cases, the JS developer does
>not need to know what file format the text track is provided in,
>because if the UA can parse it, it will expose it in JS with cues and
>if it can't, then it can't expose cues anyway. So, I am not concerned
>about the file formats in which text tracks are provided.
>
>What concerns me, though, is the format of the individual cues.
>
>
>> Using this mechanism, the UA fetches track resources according to what
>> track media types it supports and what resources are actually
>> resolvable.
>>
>> Once it has resolved a track's alternate source references to an actual
>> resource (whether out-of-band or in-band), the UA determines the actual
>> content type of the resource (when it sniffs/parses it).
>>
>> So, let's say that:
>>
>> (1) HTMLSourceElement.type (or HTMLTrackElement.type) returns the
>> advisory (hint) author supplied type (may or may not be the resolved
>> type); and
>
>The file format type? That's irrelevant as explained above.
>
>
>> (2) HTMLTrackElement.track.type returns the actual (sniffed/parsed)
>> type as determined by the UA and selected by the resource selection
>> algorithm;
>>
>> Why is this useful? Because it could help the client JS code to
>> determine things like:
>>
>> what possible interface types are supported by a cue instance that the
>> UA constructs for that type;
>>
>> what possible different formats may be returned from TextTrackCue.text;
>
>Now you are arguing for cue format types and not file format types. I
>agree with providing a hint for these, which is why I suggested making
>inBandMetadataTrackDispatchType more generic and calling it cueType
>and having browsers expose these where available.
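
For illustration, the track-filtering step described in the quoted text (keep only the tracks that ended up with cues) might be sketched as follows. This is a minimal sketch, not code from the thread: the helper name tracksWithCues, the stand-in objects, and their labels are hypothetical, and it assumes, as the example does, that an unparseable track shows up with cues set to null. One caveat: per the HTML spec, TextTrack.cues is also null while a track's mode is "disabled", so in a real page the check is only meaningful for tracks that are "hidden" or "showing".

```javascript
// Sketch only: keep the tracks whose cues the UA managed to parse.
// `textTracks` can be any array-like of TextTrack-shaped objects,
// e.g. video.textTracks in a browser.
function tracksWithCues(textTracks) {
  // `!= null` rejects both null and undefined.
  return Array.from(textTracks).filter((track) => track.cues != null);
}

// Stand-in objects mirroring the six tracks in the example above.
// These are plain objects with hypothetical labels, not real TextTracks.
const exampleTracks = [
  { kind: "captions", label: "WebVTT", cues: [] },
  { kind: "captions", label: "TTML", cues: null },
  { kind: "captions", label: "CEA-708", cues: [] },
  { kind: "metadata", label: "MPEG-2 PSI", cues: [] },
  { kind: "metadata", label: "md1", cues: null },
  { kind: "metadata", label: "md2", cues: [] },
];

// Labels of the tracks a custom menu could offer to the user.
const offeredLabels = tracksWithCues(exampleTracks).map((t) => t.label);
console.log(offeredLabels);
```

The list itself is left untouched here; the filtered array is what a script would use to build its own track menu.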
>
>
>> Now, for the case where client JS wants to construct a track, then
>> HTMLMediaElement.addTextTrack (possibly renamed to createTextTrack)
>> should support an optional type parameter which is used to initialize
>> TextTrack.type, and subsequently, TextTrack.type is used to constrain
>> the type(s) of cues constructed by a TextTrack.createCue method or
>> constrain the type(s) of cues that can be added via TextTrack.addCue.
>
>s/type/cueType/ and we are basically arguing for the same thing.
>Except, my proposal is to set the cueType by the browser to "generic
>text", which will be replaced with a more specific cue object (e.g.
>WebVTTCue or CEA708Cue) on the first addition of a cue of such type,
>after which only cues of that type are allowed to be added. The
>cueType is also a hint that the browser can set for metadata tracks if
>it knows some more about the content of the track but doesn't have an
>actual parser.
>
>Basically, what I'd like to make possible is:
>track = video.addTextTrack('metadata', 'myLabel', 'en');
>cue0 = new TextTrackCue(0, 5, '{cue: content}');
>track.addCue(cue0);
>
>and even
>
>track = video.addTextTrack('subtitles', 'myLabel', 'en');
>cue0 = new TextTrackCue(0, 5, 'this is a subtitle');
>track.addCue(cue0);
>
>This is not currently possible because the TextTrackCue constructor is
>gone, but I can see these as the use case to add it back.
>
>
>Cheers,
>Silvia.
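
As a rough model of the cueType behaviour proposed in the quoted text (start at "generic text", lock to a specific type on the first cue, reject other types afterwards), something like the following could serve as a sketch. Everything here is hypothetical: SketchCue and SketchTrack are plain classes, not browser APIs, the `type` string on each cue stands in for the cue's interface, and locking on the very first cue added is one reading of the proposal rather than specified behaviour.

```javascript
const GENERIC_TEXT = "generic text";

// Hypothetical stand-in for a cue; `type` names the cue interface it models.
class SketchCue {
  constructor(startTime, endTime, text, type = GENERIC_TEXT) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.text = text;
    this.type = type;
  }
}

// Hypothetical stand-in for a script-created text track.
class SketchTrack {
  constructor(kind, label, language) {
    this.kind = kind;
    this.label = label;
    this.language = language;
    this.cueType = GENERIC_TEXT; // initial value, as in the proposal
    this.cues = [];
  }

  addCue(cue) {
    if (this.cues.length === 0) {
      // Assumption: the first cue added fixes the track's cueType.
      this.cueType = cue.type;
    } else if (cue.type !== this.cueType) {
      throw new TypeError(
        `track holds "${this.cueType}" cues, cannot add "${cue.type}"`
      );
    }
    this.cues.push(cue);
    return cue;
  }
}

const sketchTrack = new SketchTrack("subtitles", "myLabel", "en");
sketchTrack.addCue(new SketchCue(0, 5, "this is a subtitle", "WebVTTCue"));
console.log(sketchTrack.cueType);
```

After the first addCue call the track only accepts further "WebVTTCue" cues; adding, say, a "CEA708Cue" would throw a TypeError in this model.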
Received on Monday, 13 May 2013 20:07:43 UTC