- From: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
- Date: Thu, 11 Nov 2021 10:29:13 +0900
- To: W3C Synchronized Multimedia for Publications CG <public-sync-media-pub@w3.org>
- Message-ID: <CALvn5EBLXT3nXGKJ2ajsP80P3FaWY8qRU=ooyi3vR-=hefYTNQ@mail.gmail.com>
Marisa,

I tried your document with interest. How are idx.json and data.json used?
Will they be created from the HTML source on the fly?

I am wondering if your approach works for documents containing ruby or BIDI.

Regards,
Makoto

On Thu, 11 Nov 2021 at 6:04, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:

> Hi all,
>
> I’ve been experimenting with WebVTT instead of SMIL as a synchronization
> format for a book with HTML text and audio narration.
>
> Here is a link to a recent prototype I made, showing a book that has been
> transformed via custom conversion script from EPUB into plain HTML/CSS/JS:
>
> https://daisy.github.io/accessible-books-on-the-web/demos/moby-dick/chapter_001.html
>
> In it, there’s a WebVTT track attached to an audio element:
>
> <audio src="audio/chapter_001.mp3" controls="" id="abotw-audio">
>   <track default="" kind="metadata" src="vtt/chapter_001.vtt">
> </audio>
>
> And because this is a metadata track, the VTT file’s contents aren’t
> displayed as captions, just delivered as payload to the cue event handlers.
> One example of a cue in the VTT file is:
>
> 1
> 00:00:00.000 --> 00:00:04.833
> {
>   "action": {
>     "name": "addCssClass",
>     "data": "sync-highlight"
>   },
>   "selector": {
>     "type": "FragmentSelector",
>     "value": "c01h01"
>   }
> }
>
>
> Comparing this approach to what we’ve been considering already (which is
> to extend SMIL [1]), I notice the following:
>
> - Requirements on the audio files become more strict with WebVTT. There’s
> no way to say (without a chunk of custom scripting) that you want to play
> 10s from audio-1.mp3 and then 20s from audio-2.mp3 and then back to
> audio-1. You just play a file, start to end (or media fragment offset to
> media fragment offset).
>
> - There are no structuring options for WebVTT, so any structural
> navigation (e.g. “escapability”, which is exiting narration of complex
> structures and returning to the main content flow) becomes entirely
> DOM-based with no parallel conveniences in the audio narration layer. I
> don’t think this is necessarily a negative thing.
>
> - Implementation of WebVTT-based highlighting using the TextTrack API is
> very easy, vs SMIL.
>
> - Unlike SyncMedia, WebVTT is not a drop-in replacement for Media
> Overlays. At least not without some packaging rules.
>
> Anyway, just wanted to share. Discussion welcome!
>
> Marisa
>
> 1. https://w3c.github.io/sync-media-pub/sync-media.html
>
>
>

--
Regards,
Makoto
Received on Thursday, 11 November 2021 01:30:05 UTC
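
The quoted message describes metadata cues whose JSON payload is handed to cue event handlers. A minimal sketch of how such a payload might be interpreted follows. It is not taken from the prototype: the helper names `parseCuePayload` and `applyCue` are hypothetical, it assumes the FragmentSelector value is the id of an element in the page, and it assumes the class should be removed again when the cue ends. A small `ClassTarget` interface stands in for a DOM element so the sketch also runs outside a browser.

```typescript
// Shape of the JSON payload shown in the quoted cue example.
interface CuePayload {
  action: { name: string; data: string };
  selector: { type: string; value: string };
}

// Minimal stand-in for a DOM element (assumption: only classList is needed).
interface ClassTarget {
  classList: { add(name: string): void; remove(name: string): void };
}

// Hypothetical helper: parse cue text, returning null if it is not
// valid JSON or lacks the fields used in the quoted example.
function parseCuePayload(text: string): CuePayload | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch (_err) {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const p = raw as Partial<CuePayload>;
  if (typeof p.action?.name !== "string" || typeof p.action?.data !== "string") return null;
  if (typeof p.selector?.type !== "string" || typeof p.selector?.value !== "string") return null;
  return p as CuePayload;
}

// Hypothetical helper: add the class while the cue is active, remove it
// otherwise. Returns true only if a target was found and updated.
function applyCue(
  payload: CuePayload,
  lookup: (id: string) => ClassTarget | null,
  active: boolean
): boolean {
  if (payload.action.name !== "addCssClass") return false;
  if (payload.selector.type !== "FragmentSelector") return false;
  // Assumption: the fragment value is an element id.
  const target = lookup(payload.selector.value);
  if (target === null) return false;
  if (active) {
    target.classList.add(payload.action.data);
  } else {
    target.classList.remove(payload.action.data);
  }
  return true;
}
```

In a browser, `lookup` could be `(id) => document.getElementById(id)`, with `applyCue` called from each cue's `enter` and `exit` events (or from a `cuechange` listener on the TextTrack), passing the cue's `text` to `parseCuePayload` first.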