- From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Date: Wed, 04 Sep 2013 17:03:14 +0200
- To: public-html@w3.org
Hi Silvia,

It is a bit hard to follow this long discussion, spread across this list, the blink-dev list, the bug tracker, ... I'll give my understanding in the hope that it helps and does not add more confusion.

My understanding is that we should distinguish the process that generates cues from the process that consumes them, and draft the interface(s) with both processes in mind.

There are 2 ways to generate cue objects:

A. Created by some JS code
The content of the cue may be generated client-side or received from XHR. The format of the cue content may be anything: plain text, XML, binary data, base64 encoded or not. The data has at least a start time (possibly an end time) and should have an associated MIME type. Then you have 2 sub-cases:

A.1 The browser is capable of creating specific objects from the cue content following the MIME type (e.g. WebVTT Node objects, TTML objects, ...). In that case, there should be a way (for instance a dedicated interface) for a JS app to have the cue content parsed and the objects created by the browser: i.e. if the content type of the cue I want to generate is text/CueFormatX, I will check whether the browser supports parsing CueFormatX, call the parser (via a constructor or another method) to get a specialized object, and then access CueFormatX.propertyY if needed.

A.2 The browser is not capable of creating specific objects from the cue content (e.g. proprietary binary data), or the MIME type is unknown. In that case, the JS can use a generic constructor or method to store the timed cue content for later use.

B. Created by the browser
The content of the cues is generated and received, outside of any JS processing, from resources in a format that is understood by the browser (e.g. plain WebVTT files, TTML files, MP4 files, MPEG-2 TS, WebM, ...). As above, the browser will generate cue objects, ideally as specialized as possible: i.e. if the resource is of type text/vtt, it should create VTTCue; and similarly for text/CueFormatX.

Then, there are 2 ways to consume the cue objects:

C. The browser is capable of producing a renderable representation of the cue content (e.g. ideally there is a method (or equivalent) isRenderableTextTrack(mime) which returns true), then:

C.1 If the rendering is left to the browser natively, the track kind is set to subtitles or captions.

C.2 If the rendering needs to be altered by the JS, the track kind is set to metadata, the JS code calls getCueAsHTML when needed, and the result is modified and displayed.

D. The browser is not capable of producing a renderable representation of the cue content
The JS code should handle the rendering of the cue content from the given cue objects (specialized or not).

Of course, you could mix how the cues are received with how they are rendered and have:
- B+C (e.g. the browser supports parsing of WebVTT into cue nodes and the rendering)
- or B+D (receiving an unknown track from an MP4 file (e.g. 3GPP Timed Text) and converting it to WebVTT cues in JS),
- or A.1+C
- or A.1+D
- or A.2+D

I don't see use cases for A.2+C: if a browser is not capable of creating specialized objects for a format, it is probably not capable of rendering the cue.

I don't have a clear opinion on which design is the best (new cue interfaces with/without constructor, methods on the texttrack interface, ...), but I would like all use cases to be possible. Is that the case with the W3C approach? With the WHATWG approach? Could we compare code examples?

HTH,
Cyril

On 31/08/2013 09:26, Silvia Pfeiffer wrote:
> Hi all,
>
> Recent changes to the TextTrackCue interface had led to a fork with
> the WHATWG spec [1] when resolving bug 21851 [2].
>
> This caused extensive discussion on blink-dev [3] when an intent to
> implement was proposed.
>
> In the W3C WG we recognize the need for a generic cue interface type
> with a constructor and a text attribute. It allows browsers to expose
> cues in text tracks of video or audio files for which browsers don't
> intend to implement parsers. It also allows JavaScript developers to
> create time-synchronized data for media elements in any format they
> require.
>
> The discussion on blink-dev exposed that the currently specified
> solution of bug 21851 [2] in the HTML5 spec is flawed in several ways:
>
> (1) TextTrackCue objects that are not fully abstract create hard to
> debug issues of backwards compatibility due to existing code that
> assumes "new TextTrackCue()" constructs a cue with VTT semantics;
> (2) in order to transition old TextTrackCue interface usage to "new
> VTTCue()", it is better to remove the existing TextTrackCue
> constructor causing hard failure (easily recognizable) instead of soft
> failure (more difficult to recognize);
> (3) the abstract TextTrackCue interface of the WHATWG is desirable for
> extensibility to non-text-based cue interfaces of the future;
> (4) the interface fork between the WHATWG and W3C spec should be removed.
>
> An alternative resolution to bug 21851 [2] has previously been
> proposed and discussed: create a new interface that has the text
> attribute and the constructor and inherits from the abstract
> interface.
>
> This will result in the following interfaces:
>
> interface TextTrackCue : EventTarget {
>   readonly attribute TextTrack? track;
>
>   attribute DOMString id;
>   attribute double startTime;
>   attribute double endTime;
>   attribute boolean pauseOnExit;
>
>   attribute EventHandler onenter;
>   attribute EventHandler onexit;
> };
>
> [Constructor(double startTime, double endTime, DOMString text)]
> interface GenericCue : TextTrackCue {
>   attribute DOMString text;
> };
>
> Whether VTTCue will inherit from GenericCue or from TextTrackCue will
> be resolved in the TextTrack CG once this change has been applied to
> the HTML5 spec.
>
> It is my understanding that this proposed change resolves all the
> above listed issues. I will therefore apply these changes next week
> unless there are any further concerns.
>
> Regards,
> Silvia (as HTML spec editor).
>
> [1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=22903
> [2] https://www.w3.org/Bugs/Public/show_bug.cgi?id=21851
> [3] https://groups.google.com/a/chromium.org/d/msg/blink-dev/-VHGnuNNUxM/Yibbv2TgDoYJ
>

-- 
Cyril Concolato
Maître de Conférences/Associate Professor
Groupe Multimedia/Multimedia Group
Telecom ParisTech
46 rue Barrault
75 013 Paris, France
http://concolato.wp.mines-telecom.fr/
Received on Wednesday, 4 September 2013 15:03:28 UTC
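
The generation cases (A.1, A.2) and consumption cases (C.1, C.2, D) described in the message can be modelled roughly as follows. This is only a sketch to make the decision points concrete: it does not use any browser API, and every name in it (TimedCue, ParsedCue, CueFactory, chooseRenderPath, the MIME types) is hypothetical and appears in neither the W3C nor the WHATWG draft.

```typescript
// Hypothetical model of the cases in the message; not a real browser API.

// A.2: a generic timed cue that only stores the raw payload.
interface TimedCue {
  startTime: number;
  endTime: number;
  mimeType: string;
  data: string;
}

// A.1: a specialized cue that also carries the parsed representation.
interface ParsedCue<T> extends TimedCue {
  parsed: T;
}

type CueParser<T> = (data: string) => T;

class CueFactory {
  private parsers: Record<string, CueParser<unknown>> = {};

  // Stands in for "the browser supports parsing of this format".
  register(mimeType: string, parser: CueParser<unknown>): void {
    this.parsers[mimeType] = parser;
  }

  supports(mimeType: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.parsers, mimeType);
  }

  // Returns a specialized cue when a parser is known (A.1),
  // otherwise a generic cue holding the raw content (A.2).
  create(startTime: number, endTime: number, mimeType: string, data: string): TimedCue | ParsedCue<unknown> {
    const base: TimedCue = { startTime, endTime, mimeType, data };
    if (this.supports(mimeType)) {
      const parser = this.parsers[mimeType];
      if (parser !== undefined) {
        return { ...base, parsed: parser(data) };
      }
    }
    return base;
  }
}

function isParsed(cue: TimedCue): cue is ParsedCue<unknown> {
  return "parsed" in cue;
}

type RenderPath = "browser-native" | "js-post-processed" | "js-only";

// C.1: browser renders natively; C.2: JS alters the browser's output;
// D: browser cannot render, so JS does everything.
function chooseRenderPath(browserCanRender: boolean, appWantsToAlter: boolean): RenderPath {
  if (!browserCanRender) {
    return "js-only";
  }
  return appWantsToAlter ? "js-post-processed" : "browser-native";
}

// Usage: one format with a registered parser, one without.
const factory = new CueFactory();
factory.register("text/x-lines", (data) => data.split("\n"));
const a1 = factory.create(0, 2, "text/x-lines", "Hello\nworld");
const a2 = factory.create(2, 4, "application/x-proprietary", "AAEC");
```

In this model the A.2+C combination never arises naturally, which matches the remark in the message: a format without a registered parser yields only a generic cue, and the application would be expected to pick the "js-only" path for it.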