- From: Marisa DeMeglio <marisa.demeglio@gmail.com>
- Date: Thu, 11 Nov 2021 10:17:43 -0800
- To: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
- Cc: W3C Synchronized Multimedia for Publications CG <public-sync-media-pub@w3.org>
- Message-Id: <63182343-6F1B-4C83-A0AD-992970C54D54@gmail.com>
Hi Makoto,
Thanks for having a look!
So idx.json and data.json aren’t related to synchronized playback - they’re for full-book searching via fuse.js [1]. They could be generated on the fly, but that can be slow, so I generate them ahead of time. The search index is basically a list of text nodes’ contents and their corresponding selectors, but it doesn’t consider phrases that cross node boundaries.
So while this markup would match either “Hello” or “how are you?”:
<span>Hello</span><span>how are you?</span>
It would not match “Hello how are you?”.
However, this is just a consequence of the simple approach used in this prototype. I think one could create a more sophisticated index that takes this into account.
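For reference, here’s a rough sketch of how an index like this can be built (the identifiers are mine, not what the prototype actually uses): walk the document’s text nodes, record each node’s content plus a selector pointing back at its parent, and hand the list to fuse.js.

function buildSearchIndex(root) {
  const entries = [];
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = walker.nextNode())) {
    const text = node.textContent.trim();
    const id = node.parentElement.id;
    if (text && id) {
      // one entry per text node: its content plus a selector for its parent
      entries.push({ text, selector: '#' + id });
    }
  }
  return entries; // serialized ahead of time as idx.json / data.json
}

const fuse = new Fuse(buildSearchIndex(document.body), { keys: ['text'] });
fuse.search('Hello');              // matches the first span's entry
fuse.search('Hello how are you?'); // no single entry contains the full phrase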
Marisa
1. https://fusejs.io/
> On Nov 10, 2021, at 17:29, MURATA Makoto <eb2m-mrt@asahi-net.or.jp> wrote:
>
> Marisa,
>
> I tried your document with interest. How are idx.json and data.json
> used? Will they be created from the HTML source on the fly?
> I am wondering if your approach works for documents containing
> ruby or BIDI.
>
> Regards,
> Makoto
>
> On Thu, 11 Nov 2021 at 6:04, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:
> Hi all,
>
> I’ve been experimenting with WebVTT instead of SMIL as a synchronization format for a book with HTML text and audio narration.
>
> Here is a link to a recent prototype I made, showing a book that has been transformed via a custom conversion script from EPUB into plain HTML/CSS/JS:
> https://daisy.github.io/accessible-books-on-the-web/demos/moby-dick/chapter_001.html
>
> In it, there’s a WebVTT track attached to an audio element:
>
> <audio src="audio/chapter_001.mp3" controls="" id="abotw-audio">
> <track default="" kind="metadata" src="vtt/chapter_001.vtt">
> </audio>
>
> And because this is a metadata track, the VTT file’s contents aren’t displayed as captions, just delivered as payload to the cue event handlers. One example of a cue in the VTT file is:
>
> 1
> 00:00:00.000 --> 00:00:04.833
> {
>   "action": {
>     "name": "addCssClass",
>     "data": "sync-highlight"
>   },
>   "selector": {
>     "type": "FragmentSelector",
>     "value": "c01h01"
>   }
> }
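>
> A rough sketch of the kind of cuechange handler this enables (illustrative only; the prototype’s actual code may differ):
>
> const audio = document.getElementById('abotw-audio');
> const track = audio.querySelector('track').track;
> track.mode = 'hidden'; // cues still fire; nothing is rendered
>
> track.addEventListener('cuechange', () => {
>   for (const cue of track.activeCues) {
>     const { action, selector } = JSON.parse(cue.text);
>     if (action.name === 'addCssClass' && selector.type === 'FragmentSelector') {
>       // the FragmentSelector value is an element id in this document
>       const el = document.getElementById(selector.value);
>       if (el) el.classList.add(action.data);
>     }
>   }
> });
>
> (A matching handler on each cue’s 'exit' event could remove the class again.)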
>
>
> Comparing this approach to what we’ve been considering already (which is to extend SMIL [1]), I notice the following:
>
> - Requirements on the audio files become stricter with WebVTT. There’s no way to say (without a chunk of custom scripting) that you want to play 10s from audio-1.mp3, then 20s from audio-2.mp3, and then return to audio-1. You just play a file from start to end (or from one media fragment offset to another); there’s a sketch of this after the list.
>
> - There are no structuring options for WebVTT, so any structural navigation (e.g. “escapability”, i.e. exiting the narration of a complex structure and returning to the main content flow) becomes entirely DOM-based, with no parallel conveniences in the audio narration layer. I don’t think this is necessarily a negative thing.
>
> - Implementing WebVTT-based highlighting with the TextTrack API is very easy compared to SMIL.
>
> - Unlike SyncMedia, WebVTT is not a drop-in replacement for Media Overlays. At least not without some packaging rules.
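>
> To illustrate the first point, a sketch of the kind of custom scripting a multi-file sequence would need (file names and clip times here are made up):
>
> // play 10s of audio-1, then 20s of audio-2, then audio-1 again
> const clips = [
>   { src: 'audio-1.mp3', start: 0, end: 10 },
>   { src: 'audio-2.mp3', start: 0, end: 20 },
>   { src: 'audio-1.mp3', start: 10, end: 30 }
> ];
> let i = 0;
> function playNext(audio) {
>   if (i >= clips.length) return;
>   const clip = clips[i++];
>   audio.src = clip.src;
>   audio.onloadedmetadata = () => {
>     audio.currentTime = clip.start; // seek once the new file is ready
>     audio.play();
>   };
>   audio.ontimeupdate = () => {
>     if (audio.currentTime >= clip.end) {
>       audio.pause();
>       playNext(audio);
>     }
>   };
> }
> playNext(document.getElementById('abotw-audio'));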
>
> Anyway, just wanted to share. Discussion welcome!
>
> Marisa
>
> 1. https://w3c.github.io/sync-media-pub/sync-media.html
>
> --
> Regards,
> Makoto
Received on Thursday, 11 November 2021 18:18:00 UTC