- From: Marisa DeMeglio <marisa.demeglio@gmail.com>
- Date: Thu, 11 Nov 2021 10:17:43 -0800
- To: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
- Cc: W3C Synchronized Multimedia for Publications CG <public-sync-media-pub@w3.org>
- Message-Id: <63182343-6F1B-4C83-A0AD-992970C54D54@gmail.com>
Hi Makoto,

Thanks for having a look!

So idx.json and data.json aren’t related to synchronized playback - they’re for full-book searching via fuse.js [1]. They could be generated on the fly, but that can be slow, so I do it ahead of time.

The search index is basically a list of text nodes’ content and corresponding selectors, but it doesn’t consider text phrases that cross node boundaries. So while this would match either “Hello” or “How are you”:

<span>Hello</span><span>how are you?</span>

it would not match “Hello how are you?”. However, this is just a consequence of the simple approach used in this prototype. I think one could create a more sophisticated index that takes this into account (a rough sketch of the simple approach appears after the quoted message below).

Marisa

1. https://fusejs.io/

> On Nov 10, 2021, at 17:29, MURATA Makoto <eb2m-mrt@asahi-net.or.jp> wrote:
>
> Marisa,
>
> I tried your document with interest. How are idx.json and data.json
> used? Will they be created from the HTML source on the fly?
> I am wondering if your approach works for documents containing
> ruby or BIDI.
>
> Regards,
> Makoto
>
> On Thu, Nov 11, 2021 at 6:04, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:
> Hi all,
>
> I’ve been experimenting with WebVTT instead of SMIL as a synchronization format for a book with HTML text and audio narration.
>
> Here is a link to a recent prototype I made, showing a book that has been transformed via a custom conversion script from EPUB into plain HTML/CSS/JS:
> https://daisy.github.io/accessible-books-on-the-web/demos/moby-dick/chapter_001.html
>
> In it, there’s a WebVTT track attached to an audio element:
>
> <audio src="audio/chapter_001.mp3" controls="" id="abotw-audio">
>   <track default="" kind="metadata" src="vtt/chapter_001.vtt">
> </audio>
>
> And because this is a metadata track, the VTT file’s contents aren’t displayed as captions, just delivered as payload to the cue event handlers. One example of a cue in the VTT file is:
>
> 1
> 00:00:00.000 --> 00:00:04.833
> {
>   "action": {
>     "name": "addCssClass",
>     "data": "sync-highlight"
>   },
>   "selector": {
>     "type": "FragmentSelector",
>     "value": "c01h01"
>   }
> }
>
> Comparing this approach to what we’ve been considering already (which is to extend SMIL [1]), I notice the following:
>
> - Requirements on the audio files become more strict with WebVTT. There’s no way to say (without a chunk of custom scripting) that you want to play 10s from audio-1.mp3, then 20s from audio-2.mp3, and then go back to audio-1. You just play a file, start to end (or media fragment offset to media fragment offset).
>
> - There are no structuring options for WebVTT, so any structural navigation (e.g. “escapability”, which is exiting narration of complex structures and returning to the main content flow) becomes entirely DOM-based, with no parallel conveniences in the audio narration layer. I don’t think this is necessarily a negative thing.
>
> - Implementing WebVTT-based highlighting with the TextTrack API is very easy compared to SMIL.
>
> - Unlike SyncMedia, WebVTT is not a drop-in replacement for Media Overlays. At least not without some packaging rules.
>
> Anyway, just wanted to share. Discussion welcome!
>
> Marisa
>
> 1. https://w3c.github.io/sync-media-pub/sync-media.html
>
> --
> Regards,
> Makoto
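For anyone curious about the indexing step described above: here is a minimal sketch of how such a pre-built index could be produced, assuming a Node build script using jsdom and Fuse.js. The file names (idx.json, data.json) match the prototype’s, but the tree walk and the selector strategy here are hypothetical illustrations, not the prototype’s actual script.

// Build-time sketch (Node + jsdom + Fuse.js): collect each text node's
// content plus a selector for its nearest identifiable ancestor, then
// serialize a pre-built Fuse.js index. Hypothetical illustration only.
const fs = require("fs");
const Fuse = require("fuse.js");
const { JSDOM } = require("jsdom");

const dom = new JSDOM(fs.readFileSync("chapter_001.html", "utf-8"));
const { document, NodeFilter } = dom.window;

const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
const records = [];
let node;
while ((node = walker.nextNode())) {
  const text = node.textContent.trim();
  if (!text) continue;
  // One record per text node - this is why a phrase spanning two
  // spans ("Hello how are you?") can never match.
  const host = node.parentElement.closest("[id]");
  records.push({ text, selector: host ? "#" + host.id : null });
}

// Pre-building the index avoids the slow on-the-fly step at page load.
const index = Fuse.createIndex(["text"], records);
fs.writeFileSync("idx.json", JSON.stringify(index.toJSON()));
fs.writeFileSync("data.json", JSON.stringify(records));

At runtime the page can then construct the searcher with new Fuse(data, { keys: ["text"] }, Fuse.parseIndex(idx)) and query immediately, without rebuilding the index.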
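And since the quoted message notes how easy TextTrack-based highlighting is, here is a minimal sketch of a cuechange handler for cue payloads shaped like the example quoted above. The clear-then-apply highlight logic is an assumption about how the highlighting might work, not the prototype’s actual implementation.

// Runtime sketch: apply the action carried in each active metadata cue.
// Assumes cue payloads shaped like the example quoted above; the
// clear-then-apply highlight logic is illustrative only.
const audio = document.getElementById("abotw-audio");
const track = audio.textTracks[0]; // the kind="metadata" track
track.mode = "hidden"; // cues fire events but render nothing

track.addEventListener("cuechange", () => {
  // Remove the previous highlight (hardcoded to the one class this
  // demo-style cue format uses) before applying the next one
  document.querySelectorAll(".sync-highlight")
    .forEach((el) => el.classList.remove("sync-highlight"));

  for (const cue of Array.from(track.activeCues)) {
    const payload = JSON.parse(cue.text);
    if (payload.action.name === "addCssClass" &&
        payload.selector.type === "FragmentSelector") {
      // The FragmentSelector value is an element id in the chapter HTML
      const el = document.getElementById(payload.selector.value);
      if (el) el.classList.add(payload.action.data);
    }
  }
});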
Received on Thursday, 11 November 2021 18:18:00 UTC