- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Mon, 13 May 2013 12:47:47 +1000
- To: Glenn Adams <glenn@skynav.com>
- Cc: Simon Pieters <simonp@opera.com>, Bob Lund <B.Lund@cablelabs.com>, public-html <public-html@w3.org>, "Jerry Smith, (WINDOWS)" <jdsmith@microsoft.com>, "Mark Vickers @ Comcast" <mark_vickers@cable.comcast.com>
On Mon, May 13, 2013 at 5:00 AM, Glenn Adams <glenn@skynav.com> wrote:
>
> First, I'm talking about the Media Type of a text track resource here, not a
> specific @kind (usage) of a text track resource. For example, "text/vtt",
> "application/ttml+xml", "application/x-mpeg2-psi" [I just made that up], etc.

OK, this is taking the discussion into a completely different and unrelated direction, because we were discussing TextTrackCue and not TextTrack types. Also, the changes you are proposing below are not possible because <track> is an empty element and we are not going to break backwards compatibility on the markup. But I'll entertain the discussion of the use cases that they imply rather than the particular specification proposal.

I'm still curious about the one question I had before: are you or anyone else aware of any implementations of the inBandMetadataTrackDispatchType attribute? Since it's not even used in http://www.cablelabs.com/specifications/CL-SP-HTML5-MAP-I02-120510.pdf but instead @label is used, I don't know if it's satisfying its use case.
> Now, let me try to be more concrete regarding uses:
>
> Ideally, <track> would use <source> in the same fashion as <video> and
> <audio>, in order to allow use of the resource selection algorithm for
> alternate track resources:
>
> <video src="video.mp2t">
>   <!-- in- or out-of-band captions, three alternative sources -->
>   <track kind="captions">
>     <!-- out-of-band VTT -->
>     <source src="video.vtt" type="text/vtt">
>     <!-- out-of-band TTML -->
>     <source src="video.ttml" type="application/ttml+xml">
>     <!-- in-band 708 -->
>     <source src="video.mp2t" type="application/x-cea-708">
>   </track>
>   <!-- in-band MPEG-2 PSI, only one source -->
>   <track kind="metadata" src="video.mp2t" type="application/x-mpeg2-psi" />
>   <!-- out-of-band custom metadata, two alternative sources -->
>   <track kind="metadata">
>     <!-- out-of-band custom metadata, type 1 -->
>     <source src="video.md1" type="application/x-metadata-1">
>     <!-- out-of-band custom metadata, type 2, in case type 1 not supported -->
>     <source src="video.md2" type="application/x-metadata-2">
>   </track>
> </video>

This is overly complicated and not necessary to do in markup, because you get all of this in the JavaScript TextTrack API. Plus you do not have to deal with special cases like the inband text tracks - markup is the wrong approach for this.

As for the suggestion of doing <source> inside <track> - that is not necessary, because all supported track formats are exposed in a track list to JS or even the user - this is contrary to <video> where only a single @src is always active.

You can, however, achieve all the use cases that you are trying to emulate in the complex markup above in JS right now.
Here's how it's done with the current spec:

Markup:

    <video src="video.mp2t">
      <track kind="captions" src="video.vtt">
      <track kind="captions" src="video.ttml">
      <track kind="metadata" src="video.md1">
      <track kind="metadata" src="video.md2">
    </video>

JavaScript:

Assuming the browser can parse the following file formats:
* mp2t video file
* VTT file
* mp2t cea-708 inband track
* mp2t mpeg2-psi inband track
* md2 metadata file

But is unable to parse:
* TTML file
* md1 metadata file

The following objects are available in JavaScript:

* for the WebVTT track:
  TextTrack(kind="captions", cues=TextTrackCueList,...)
  (the TextTrackCues in the TextTrackCueList are of type WebVTTCue)

* for the TTML track (because there is no support for the format):
  TextTrack(kind="captions", cues=null,...)

* for the mp2t cea-708 inband track:
  TextTrack(kind="captions", cues=TextTrackCueList,...)
  (the TextTrackCues in the TextTrackCueList are of type CEA708Cue)

* for the mp2t mpeg2-psi inband track:
  TextTrack(kind="metadata", cues=TextTrackCueList,...)
  per spec with an inBandMetadataTrackDispatchType containing the stream_type and the descriptor bytes, likely accompanied by a label="program description" or something similar that explains to the user what they will get when they choose this track
  (the TextTrackCues in the TextTrackCueList are generic so just TextTrackCue objects, but could also be more specific PSICue if the browser supports such)

* for the md1 track (because there is no support for the format):
  TextTrack(kind="metadata", cues=null,...)

* for the md2 track:
  TextTrack(kind="metadata", cues=TextTrackCueList,...)
  (the TextTrackCues in the TextTrackCueList are generic so just TextTrackCue objects)

As a JS developer, you can now decide which of the tracks to expose to the user and could just loop through the video.textTracks list and remove those tracks that you don't want them to see. E.g. you can remove all those that have no cues, which still provides the users with a choice as to whether to see the 708 captions or the WebVTT captions. You would also parse the cues in the metadata according to what you know them to be.

So, this is how it currently goes. In all cases, the JS developer does not need to know what file format the text track is provided in, because if the UA can parse it, it will expose it in JS with cues and if it can't, then it can't expose cues anyway.

So, I am not concerned about the file formats in which text tracks are provided. What concerns me, though, is the format of the individual cues.

> Using this mechanism, the UA fetches track resources according to what track
> media types it supports and what resources are actually resolvable.
>
> Once it has resolved a track's alternate source references to an actual
> resource (whether out-of-band or in-band), the UA determines the actual
> content type of the resource (when it sniffs/parses it).
>
> So, let's say that:
>
> (1) HTMLSourceElement.type (or HTMLTrackElement.type) returns the advisory
> (hint) author supplied type (may or may not be the resolved type); and

The file format type? That's irrelevant as explained above.

> (2) HTMLTrackElement.track.type returns the actual (sniffed/parsed) type as
> determined by the UA and selected by the resource selection algorithm;
>
> Why is this useful? Because it could help the client JS code to determine
> things like:
>
> what possible interface types are supported by a cue instance that the UA
> constructs for that type;
>
> what possible different formats may be returned from TextTrackCue.text;

Now you are arguing for cue format types and not file format types. I agree with providing a hint for these, which is why I suggested making inBandMetadataTrackDispatchType more generic and calling it cueType and having browsers expose these where available.
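
To make the track filtering described above concrete (looping through the track list and dropping every track that has no cues), here is a minimal sketch. It runs on plain stand-in objects rather than real TextTrack instances, so the helper name tracksWithCues, the exampleTracks data, and the labels are all assumptions for illustration; in a page you would pass video.textTracks instead. Note also that in browsers a track's cues are null while its mode is "disabled", so the check only says something once the track has been enabled or hidden.

```javascript
// Hypothetical helper: keep only the tracks for which cues are exposed.
// `textTracks` is any array-like of objects shaped like TextTrack
// (kind, label, cues); in a page this would be video.textTracks.
function tracksWithCues(textTracks) {
  return Array.from(textTracks).filter(function (track) {
    return track.cues !== null && track.cues !== undefined;
  });
}

// Stand-in data mirroring the six tracks listed above (not real TextTrack
// objects; the labels are made up for this sketch).
const exampleTracks = [
  { kind: 'captions', label: 'WebVTT', cues: [{}] },
  { kind: 'captions', label: 'TTML', cues: null },
  { kind: 'captions', label: 'CEA-708', cues: [{}] },
  { kind: 'metadata', label: 'program description', cues: [{}] },
  { kind: 'metadata', label: 'md1', cues: null },
  { kind: 'metadata', label: 'md2', cues: [{}] },
];

// The caption tracks that remain are the ones to offer in a menu.
const captionChoices = tracksWithCues(exampleTracks).filter(function (track) {
  return track.kind === 'captions';
});
console.log(captionChoices.map(function (track) { return track.label; }));
```

With the stand-in data, the unparseable TTML and md1 tracks drop out, and the user is left choosing between the WebVTT and CEA-708 captions, without the script ever looking at a file format.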
> Now, for the case where client JS wants to construct a track, then
> HTMLMediaElement.addTextTrack (possibly renamed to createTextTrack) should
> support an optional type parameter which is used to initialize
> TextTrack.type, and subsequently, TextTrack.type is used to constrain the
> type(s) of cues constructed by a TextTrack.createCue method or constrain the
> type(s) of cues that can be added via TextTrack.addCue.

s/type/cueType/ and we are basically arguing for the same thing.

Except, my proposal is to have the browser set the cueType to "generic text", which will be replaced with a more specific cue type (e.g. WebVTTCue or CEA708Cue) on the first addition of a cue of such type, after which only cues of that type are allowed to be added. The cueType is also a hint that the browser can set for metadata tracks if it knows some more about the content of the track but doesn't have an actual parser.

Basically, what I'd like to make possible is:

    track = video.addTextTrack('metadata', myLabel, 'en');
    cue0 = new TextTrackCue(0, 5, '{cue: content}');
    track.addCue(cue0);

and even

    track = video.addTextTrack('subtitles', myLabel, 'en');
    cue0 = new TextTrackCue(0, 5, 'this is a subtitle');
    track.addCue(cue0);

This is not currently possible because the TextTrackCue constructor is gone, but I can see these as the use cases for adding it back.

Cheers,
Silvia.
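
For readers who want to see the proposed cueType behaviour spelled out, the following is a rough sketch of one possible reading of it, not a specification or an existing browser API. SketchTrack, GenericCue, SketchVTTCue, and the 'webvtt' type string are all hypothetical names. The assumption made here is that generic cues may be added freely while the track's cueType is still "generic text", and that the first more specific cue replaces the cueType and locks it.

```javascript
// Stand-in for a plain TextTrackCue (hypothetical, not a browser class).
class GenericCue {
  constructor(startTime, endTime, text) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.text = text;
  }
  get cueType() { return 'generic text'; }
}

// Stand-in for a format-specific cue such as WebVTTCue (hypothetical).
class SketchVTTCue extends GenericCue {
  get cueType() { return 'webvtt'; }
}

// Stand-in for a script-created TextTrack carrying the proposed cueType.
class SketchTrack {
  constructor(kind, label, language) {
    this.kind = kind;
    this.label = label;
    this.language = language;
    this.cueType = 'generic text'; // initial value, as set by the "browser"
    this.cues = [];
  }

  addCue(cue) {
    if (this.cueType === 'generic text') {
      // Not yet locked: adopt the type of the incoming cue (it may stay generic).
      this.cueType = cue.cueType;
    } else if (cue.cueType !== this.cueType) {
      // Locked to a specific type: reject anything else.
      throw new TypeError(
        'track accepts only "' + this.cueType + '" cues, got "' + cue.cueType + '"');
    }
    this.cues.push(cue);
  }
}

// Mirrors the metadata example above: a generic cue on a fresh track.
const metadataTrack = new SketchTrack('metadata', 'myLabel', 'en');
metadataTrack.addCue(new GenericCue(0, 5, '{cue: content}'));

// A specific cue locks the subtitles track to that cue type.
const subtitleTrack = new SketchTrack('subtitles', 'myLabel', 'en');
subtitleTrack.addCue(new SketchVTTCue(0, 5, 'this is a subtitle'));
console.log(metadataTrack.cueType, '/', subtitleTrack.cueType);
```

Whether generic cues added before the first specific cue should be kept, converted, or rejected is left open by the proposal; the sketch simply keeps them.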
Received on Monday, 13 May 2013 02:48:35 UTC