- From: Ingar Mæhlum Arntzen <ingar.arntzen@gmail.com>
- Date: Thu, 18 Oct 2018 22:42:09 +0200
- To: Marisa DeMeglio <marisa.demeglio@gmail.com>
- Cc: rdeltour@gmail.com, public-sync-media-pub@w3.org
- Message-ID: <CAOFBLLqq1up2nGYh9qEmf5Sz_ippR87MD1OK0Am--xr+LjORoA@mail.gmail.com>
Thanks Marisa.

I agree that draft syntax with little or no regard for tech is a great
place to start. Starting from a tech perspective quite often shifts the
focus to limitations. I guess my point is that with timingsrc there won't
be many limitations :D As long as you can somehow extract an id, a start
timestamp, and an end timestamp for the DOM elements you want to play back
(chapters, paragraphs, words), all is good.

Quick prototyping with timingsrc should also support incremental
development, where you go back and forth a bit between syntax definition
and evaluation through implementation -- testing alternative designs etc.

Best regards,
Ingar

On Thu, 18 Oct 2018 at 21:55, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:

> Hi Ingar,
>
> It sounds like your work could be very useful to implementors. What we
> are discussing here is not the “how” of processing/playback, but rather
> the “what” - the actual declarative format. So that’s why I keep going
> on about being declarative. I would be curious, when we have some draft
> syntax and examples, how it could map to your playback engine. Looking
> forward to some experimenting!
>
> Thanks
> Marisa
>
> On Oct 18, 2018, at 12:39 PM, Ingar Mæhlum Arntzen
> <ingar.arntzen@gmail.com> wrote:
>
> Hi Marisa and all.
>
> I looked through the requirements again, and I still maintain that the
> timingsrc [1] library is exactly what you need as an engine for playback
> and sync of both audio/video and text progression/navigation. True, it
> does not provide any declarative support, but that's where you come
> in... Timingsrc makes it really easy to define custom data formats and
> then build custom viewers/players with custom navigation primitives
> etc., and it does all the heavy lifting with the timing. Though
> primitive in appearance, this demo page [2] for the sequencer already
> solves a core part of your challenge: ensuring that the right DOM
> element is activated at the right time, relative to playback position in
> the text.
>
> If you were to send me an audio file and a timed transcript to go with
> it (e.g. JSON with start and end timestamps for each word), then putting
> up a rudimentary demo would likely be really quick.
>
> Best,
> Ingar Arntzen
>
> [1] https://webtiming.github.io/timingsrc/
> [2] https://webtiming.github.io/timingsrc/doc/online_sequencer.html
>
> On Thu, 18 Oct 2018 at 21:01, Romain <rdeltour@gmail.com> wrote:
>
>> > On 18 Oct 2018, at 19:37, Marisa DeMeglio <marisa.demeglio@gmail.com>
>> > wrote:
>> >
>> >> 1. The use cases document says:
>> >>
>> >>> Text chunks are highlighted in sync with the audio playback, at
>> >>> the authored granularity
>> >>
>> >> This implies that the granularity _is_ authored. Sometimes, the sync
>> >> could be generated on the fly, with sentence and/or word detection.
>> >> Do we want to cover this use case too?
>> >
>> > So in this use case, a reading system gets a publication with some
>> > coarse level of synchronization (e.g. paragraph), and it provides, on
>> > the fly, finer granularities (word or sentence)?
>>
>> Yes, some kind of hybrid approach like that.
>>
>> > Are there tools that do this now? Not necessarily with audio ebooks
>> > but with any similar-enough types of content?
>>
>> Sentence/word detection applied to textual content is fairly common
>> with TTS narration, but I don't know of any tool that does this with
>> narrated (or pre-recorded) audio, no.
>> But I could see that being useful, if a reading system with enough
>> processing power implemented it :-)
>>
>> >> How would you define/describe testability in our context?
>>
>> I don't know… I think the details depend on the actual technical
>> solution. Ideally, a) tests should be runnable in an automated manner,
>> and b) results should be comparable to reference results in an
>> automated manner.
>>
>> > To me, validation is a separate concern — whatever format we produce
>> > to represent sync media should be validate-able. Not saying what the
>> > validation result should be used for, just that it should be possible
>> > to validate.
>>
>> OK!
>>
>> > To put it in context, I’ve gotten several suggestions over the
>> > year(s) of “why don’t you just use javascript” to create sync media
>> > books, and the answer always is that we want a declarative syntax,
>> > one of the reasons being that it can be validated and migrated
>> > forward.
>>
>> Right, I understand.
>> My take is that even a javascript-based approach would need some kind
>> of declaration of a list or structure of audio pointers anyway, so if
>> we standardize that beast with a simple-enough, js-friendly format, we
>> can make both worlds happy :-)
>>
>> Romain.
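[Editorial note: a minimal sketch of the timed-transcript idea discussed
above — a flat list of word cues with an id and start/end timestamps, plus
a helper that reports which cues are active at a given playback position.
The JSON shape, field names, and timestamps are illustrative assumptions,
not part of timingsrc or of any draft format; in a real player the
returned ids would drive a DOM highlight, e.g. from a timingsrc sequencer
or an audio element's timeupdate events.]

```javascript
// Hypothetical timed transcript: one cue per word, with an id that maps
// to a DOM element and [start, end) timestamps in seconds.
const transcript = [
  { id: "w1", start: 0.0, end: 0.4, text: "Hi" },
  { id: "w2", start: 0.4, end: 0.9, text: "Marisa" },
  { id: "w3", start: 0.9, end: 1.5, text: "and" },
  { id: "w4", start: 1.5, end: 2.1, text: "all" },
];

// Return the ids of all cues whose [start, end) interval covers time t.
// A sequencer would do this incrementally with enter/exit events instead
// of scanning the list on every tick; a linear scan keeps the sketch short.
function activeCues(cues, t) {
  return cues.filter((c) => t >= c.start && t < c.end).map((c) => c.id);
}

// In a browser, each returned id could be used to toggle a highlight,
// e.g. document.getElementById(id).classList.add("active").
console.log(activeCues(transcript, 1.0)); // the word active at t = 1.0s
```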
Received on Thursday, 18 October 2018 20:42:44 UTC