- From: Ingar Mæhlum Arntzen <ingar.arntzen@gmail.com>
- Date: Tue, 13 Mar 2018 19:52:15 +0100
- To: Marisa DeMeglio <marisa.demeglio@gmail.com>
- Cc: Daniel Weck <daniel.weck@gmail.com>, public-sync-media-pub@w3.org, public-webtiming@w3.org
- Message-ID: <CAOFBLLoPWBPAzr+R1as767zgD1yKavVhVmuMhHf+=RsYGoCj1Q@mail.gmail.com>
Hi Marisa

2018-03-12 19:53 GMT+01:00 Marisa DeMeglio <marisa.demeglio@gmail.com>:

> Thanks for reaching out and for the links to your work!
>
> In addition to these excellent questions from Daniel, I am wondering about
> web browser support (or anticipated browser support) — what do you expect?

Short answer: this already works. The timingsrc programming model [1] already provides the most vital tools, and using it does not depend on standardization.

Longer answer: there are still some issues when the requirements for precision are very strict, say echoless audio playback from a group of smartphones. Humans are sensitive to sync errors down to about 6-7 milliseconds, and the synchronization precision you currently get when synchronizing HTML5 audio/video is also about 6-7 milliseconds. This means that we typically get echoless playback on good devices, but a slight echo on cheaper/older ones. Standardization would mean that browser vendors could at least fix some of the obvious weaknesses of media elements with regard to synchronization. So, universal support for echoless playback depends on standardization.

> What is the relationship, if any, between Multi-device timing and TTML?
> Are the APIs overlapping or complementary (or “it’s complicated”)?

If you mean TTML as a data format, there is no overlap. The solutions we are advocating in the Multi-device Timing CG are concerned with mechanisms for timing, synchronization, and media control/playback. Timing concepts have traditionally been mixed with data formats and delivery methods (e.g. SMIL). In contrast, one of our principal design goals has been to maintain a clear separation between timing and data (media content). The benefit is that timing solutions can be used across very different data formats, delivery methods, and application domains. This flexibility may also reduce our dependence on standardized formats such as TTML.

If you mean the TTML API [2], there is much overlap.
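To give a feel for the model: the timing object is essentially a deterministic clock described by a small state vector, so any agent holding the same vector (and a synchronized system clock) computes the same position. The sketch below is a minimal illustration of that idea only; the function and property names are mine, not the timingsrc API.

```javascript
// Illustrative sketch of the deterministic clock model behind the timing
// object (names are illustrative, not the timingsrc API). A timing object
// is described by a vector (position, velocity, acceleration, timestamp);
// position at any later clock reading follows deterministically.

function computePosition(vector, now) {
  // "now" and vector.timestamp are readings of a shared clock, in seconds.
  const d = now - vector.timestamp;
  return vector.position + vector.velocity * d + 0.5 * vector.acceleration * d * d;
}

// Playback started from position 10 at clock time 100, at normal rate.
const vector = { position: 10, velocity: 1, acceleration: 0, timestamp: 100 };

// Any two devices sharing this vector agree on the position at clock time 103.
console.log(computePosition(vector, 103)); // 13
```

This is why synchronization reduces to distributing a tiny vector rather than streaming clock ticks: a media element is then continuously nudged toward the computed position.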
The TTML API seems to be an integration between the TTML data format and the texttrack mechanism of HTML5 media elements. The timing object model supports the same capability through the Sequencer [3], which is analogous to the texttrack mechanism of HTML5 media elements. There are some important differences:

- the sequencer improves on a number of weaknesses of the texttrack mechanism
- the sequencer may be used with any data format for timed cues, not only TTML
- the sequencer may be used without necessarily requiring a media element
- the sequencer does not do any rendering of the cues; this is entirely up to the application
- the sequencer is open for multi-device synchronization (via the timing object)

> In EPUB3, we use SMIL to represent media synchronization, which gives us
> a declarative syntax, but no API. Ideally for web publications, we’d have
> both.

As the timing object model does not dictate any changes to data formats or delivery mechanisms, it is typically easy to integrate with other frameworks. As I mentioned in my response to Daniel, integrating SMIL with the timing object model should not be difficult.

[1] https://webtiming.github.io/timingsrc/
[2] https://dvcs.w3.org/hg/ttml/raw-file/default/ttml2-api/Overview.html
[3] https://webtiming.github.io/timingsrc/doc/background_sequencer.html

Hope this was helpful :)

Best regards,

Ingar

Marisa
>
> On Mar 12, 2018, at 11:03 AM, Daniel Weck <daniel.weck@gmail.com> wrote:
>
> Thank you for your input Ingar (I assume this is your firstname?)
>
> The "timing object" certainly looks like a useful and powerful API.
> If I am not mistaken this proposal focuses mainly on programmatic usage?
> (Javascript)
>
> If so, do you envision some kind of declarative syntax that would allow
> content creators (web and digital publishing) to encode a "static" /
> persistent representation of synchronized multi-media streams?
> For example EPUB3 "read aloud" / "talking books" are currently authored
> using the Media Overlays flavour of SMIL (XML), and long-form synchronized
> text+audio content is typically generated via some kind of semi-automated
> production process.
>
> I am thinking specifically about: (1) an HTML document, (2) a separate
> audio file representing the pre-recorded human narration of the HTML
> document, and (3) some kind of meta-structure / declarative syntax that
> would define the synchronization "points" between HTML elements and audio
> time ranges.
> Note that most existing "talking book" implementations render such
> combined text/audio streams by "highlighting" / emphasizing individual HTML
> fragments as they are being narrated (using CSS styles), but the same
> declarative expression could be rendered with a karaoke-like layout, etc.
> Of course, there are also other important use-cases such as video+text,
> video+audio, etc., but I just wanted to pick your brain about a concrete
> use-case in digital publishing / EPUB3 e-books :)
>
> Cheers, and thanks!
> Daniel
>
>
> On 11 March 2018 at 21:34, Ingar Mæhlum Arntzen <ingar.arntzen@gmail.com>
> wrote:
>
>> Hi Marisa
>>
>> Chris Needham of the Media & Entertainment IG made me aware of the CG
>> you're setting up.
>>
>> This is a welcome initiative, and it is great to see more people
>> expressing the need for better sync support on the Web!
>>
>> I'm the chair of the Multi-device Timing CG [2], so I thought I'd say a
>> few words about that, as it seems we have similar objectives. Basically,
>> the scope of the Multi-device Timing CG is a broad one: synchronization of
>> anything with anything on the Web, whether it is text synced with A/V
>> within a single document, or across multiple devices. We have also proposed
>> a full solution to this problem for standardization, with the timing object
>> [3] being the central concept.
>> I did have a look at the requirements document [4] you linked to, and it
>> seems to me the timing object (and the other tools we have made available
>> [5]) should be a good basis for addressing your challenges. For instance, a
>> karaoke-style text presentation synchronized with audio should be quite
>> easy to put together using these tools.
>>
>> If you have some questions about the model we are proposing, and how it
>> may apply to your use cases, please send them our way :)
>>
>> Best regards,
>>
>> Ingar Arntzen
>>
>> [1] https://lists.w3.org/Archives/Public/public-sync-media-pub/2018Feb/0000.html
>> [2] https://www.w3.org/community/webtiming/
>> [3] http://webtiming.github.io/timingobject/
>> [4] https://github.com/w3c/publ-wg/wiki/Requirements-and-design-options-for-synchronized-multimedia
>> [5] https://webtiming.github.io/timingsrc/
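To make the Sequencer discussion earlier in this message concrete: a cue ties an interval on the timing object's axis to arbitrary application data, and a cue is active while the position lies inside its interval. The sketch below illustrates that concept only; the names and shapes are invented for the example and are not the timingsrc Sequencer API, which additionally emits enter/exit events as the position moves.

```javascript
// Toy illustration of the cue concept behind the Sequencer: each cue ties
// an interval [low, high) on the timing axis to arbitrary data (here,
// fragment selectors for karaoke-style highlighting). A real sequencer
// emits enter/exit events; this sketch only queries the active set.
// Illustrative names, not the timingsrc API.

function activeCues(cues, position) {
  return cues.filter(cue => position >= cue.low && position < cue.high);
}

const cues = [
  { key: "sentence-1", low: 0, high: 4,  data: { fragment: "#s1" } },
  { key: "sentence-2", low: 4, high: 9,  data: { fragment: "#s2" } },
  { key: "sentence-3", low: 9, high: 15, data: { fragment: "#s3" } },
];

// At position 5, only sentence-2 would be highlighted.
console.log(activeCues(cues, 5).map(cue => cue.key)); // [ 'sentence-2' ]
```

Because the cue data is opaque to the sequencing mechanism, the same machinery works for TTML cues, SMIL-derived sync points, or any other timed-data format.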
Received on Tuesday, 13 March 2018 18:52:40 UTC