- From: Ingar Mæhlum Arntzen <ingar.arntzen@gmail.com>
- Date: Thu, 18 Oct 2018 21:39:17 +0200
- To: rdeltour@gmail.com
- Cc: Marisa DeMeglio <marisa.demeglio@gmail.com>, public-sync-media-pub@w3.org
- Message-ID: <CAOFBLLpqNY0+_t=baANWrMO-7a+EOh9cf4oWbGcFTqrU+Ho_hA@mail.gmail.com>
Hi Marisa and all.

I looked through the requirements again, and I still maintain that the timingsrc [1] lib is exactly what you guys need as an engine for playback and sync of both audio/video and text progression/navigation. True, it does not provide any declarative support, but that's where you come in... Timingsrc makes it really easy to define custom data formats and then build custom viewers/players with custom navigation primitives etc., and it does all the heavy lifting with the timing stuff.

Though primitive in appearance, this demo page [2] for the sequencer already solves a core part of your challenge: ensuring that the right DOM element is activated at the right time, relative to playback through the text.

If you were to send me an audio file and a timed transcript to go with it (e.g. JSON with start and end timestamps for each word), then putting up a rudimentary demo would likely be really quick.

Best,
Ingar Arntzen

[1] https://webtiming.github.io/timingsrc/
[2] https://webtiming.github.io/timingsrc/doc/online_sequencer.html

On Thu, 18 Oct 2018 at 21:01, Romain <rdeltour@gmail.com> wrote:

> > On 18 Oct 2018, at 19:37, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:
> >
> >> 1. The use cases document says:
> >>> Text chunks are highlighted in sync with the audio playback, at the authored granularity
> >>
> >> This implies that the granularity _is_ authored. Sometimes, the sync could be generated on the fly, with sentence and/or word detection. Do we want to cover this use case too?
> >
> > So in this use case, a reading system gets a publication with some coarse level of synchronization (e.g. paragraph), and it provides, on the fly, finer granularities (word or sentence)?
>
> Yes, some kind of hybrid approach like that.
>
> > Are there tools that do this now? Not necessarily with audio ebooks but with any similar-enough types of content?
>
> Sentence/word detection applied to textual content is fairly common with TTS narration, but I don't know of any tool that does this with narrated (or pre-recorded) audio, no.
> But I could see that being useful, if a reading system with enough processing power implemented it :-)
>
> >> How would you define/describe testability in our context?
>
> I don't know… I think the details depend on the actual technical solution. Ideally a) tests should be runnable in an automated manner, b) results should be comparable to reference results in an automated manner.
>
> > To me, validation is a separate concern — whatever format we produce to represent sync media should be validate-able. Not saying what the validation result should be used for, just that it should be possible to validate.
>
> OK!
>
> > To put in context, I’ve gotten several suggestions over the year(s) of “why don’t you just use javascript” to create sync media books, and the answer always is that we want a declarative syntax, one of the reasons why being that it can be validated and migrated forward.
>
> Right, I understand.
> My take is that even a javascript-based approach would need some kind of declaration of a list or structure of audio pointers anyways, so if we standardize that beast with a simple-enough js-friendly format, we can make both worlds happy :-)
>
> Romain.
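[Editor's note: as a concrete illustration of the "timed transcript" Ingar asks for, a minimal sketch is shown below. The JSON field names and the `activeCue` helper are hypothetical, for illustration only; they are not part of the timingsrc API or any draft spec.]

```javascript
// Hypothetical timed-transcript shape: one cue per word, with
// start/end offsets in seconds (the field names are an assumption).
const transcript = [
  { text: "Text",        start: 0.0, end: 0.4 },
  { text: "chunks",      start: 0.4, end: 0.9 },
  { text: "highlighted", start: 1.1, end: 1.8 },
];

// Return the index of the cue active at playback time t, or -1
// when t falls in a gap between cues (e.g. a pause in narration).
function activeCue(cues, t) {
  for (let i = 0; i < cues.length; i++) {
    if (t >= cues[i].start && t < cues[i].end) return i;
  }
  return -1;
}
```

In a player, a lookup like this would typically run on the audio element's `timeupdate` event (or be replaced entirely by the sequencer's cue activation events when using timingsrc), toggling a highlight class on the DOM element that corresponds to the returned cue.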
Received on Thursday, 18 October 2018 19:39:52 UTC