- From: Marisa DeMeglio <marisa.demeglio@gmail.com>
- Date: Wed, 10 Nov 2021 13:04:39 -0800
- To: W3C Synchronized Multimedia for Publications CG <public-sync-media-pub@w3.org>
- Message-Id: <486C04C0-DFCF-48BF-B919-6F0E2416A4E8@gmail.com>
Hi all,
I’ve been experimenting with WebVTT instead of SMIL as a synchronization format for a book with HTML text and audio narration.
Here is a link to a recent prototype I made, showing a book that has been transformed from EPUB into plain HTML/CSS/JS via a custom conversion script:
https://daisy.github.io/accessible-books-on-the-web/demos/moby-dick/chapter_001.html
In it, there’s a WebVTT track attached to an audio element:
<audio src="audio/chapter_001.mp3" controls="" id="abotw-audio">
<track default="" kind="metadata" src="vtt/chapter_001.vtt">
</audio>
Because this is a metadata track, the VTT file’s contents aren’t displayed as captions; each cue’s text is just delivered as a payload to the cue event handlers. One example of a cue in the VTT file is:
1
00:00:00.000 --> 00:00:04.833
{
  "action": {
    "name": "addCssClass",
    "data": "sync-highlight"
  },
  "selector": {
    "type": "FragmentSelector",
    "value": "c01h01"
  }
}
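To illustrate, here is a minimal sketch of how a payload like this might be consumed. It assumes the action/selector shape shown above and that a FragmentSelector value is an element id. The names applyCueAction and attachSyncHighlight are hypothetical and not taken from the prototype. Only addCssClass is handled, and the class is removed again when the cue stops being active.

```javascript
// Hypothetical helper: apply (active = true) or undo (active = false) the
// action described by one cue payload. `doc` is anything that provides
// getElementById, i.e. the page's `document` in a browser.
function applyCueAction(cueText, doc, active) {
  const { action, selector } = JSON.parse(cueText);
  // Assumption: this sketch only understands FragmentSelector + addCssClass.
  if (selector.type !== "FragmentSelector" || action.name !== "addCssClass") {
    return null;
  }
  const element = doc.getElementById(selector.value);
  if (!element) {
    return null;
  }
  if (active) {
    element.classList.add(action.data);
  } else {
    element.classList.remove(action.data);
  }
  return element;
}

// Browser-only wiring: follow the cues of the audio element's first text track.
function attachSyncHighlight(audio, doc) {
  const track = audio.textTracks[0];
  track.mode = "hidden"; // cues still fire events, nothing is rendered
  let previous = [];
  track.addEventListener("cuechange", () => {
    const current = Array.from(track.activeCues || []);
    // Undo cues that just ended, then apply cues that just started.
    previous
      .filter((cue) => !current.includes(cue))
      .forEach((cue) => applyCueAction(cue.text, doc, false));
    current
      .filter((cue) => !previous.includes(cue))
      .forEach((cue) => applyCueAction(cue.text, doc, true));
    previous = current;
  });
}
```

In the page this could be started with attachSyncHighlight(document.getElementById("abotw-audio"), document).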
Comparing this approach to what we’ve been considering already (which is to extend SMIL [1]), I notice the following:
- Requirements on the audio files become stricter with WebVTT. There’s no way to say (without a chunk of custom scripting) that you want to play 10s from audio-1.mp3, then 20s from audio-2.mp3, and then go back to audio-1. You just play a file, start to end (or media fragment offset to media fragment offset).
- There are no structuring options for WebVTT, so any structural navigation (e.g. “escapability”, which is exiting narration of complex structures and returning to the main content flow) becomes entirely DOM-based with no parallel conveniences in the audio narration layer. I don’t think this is necessarily a negative thing.
- Implementing WebVTT-based highlighting with the TextTrack API is very easy compared with SMIL.
- Unlike SyncMedia, WebVTT is not a drop-in replacement for Media Overlays, at least not without some packaging rules.
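For a sense of what the “chunk of custom scripting” in the first point might involve, here is a rough sketch that sequences clips from several files on one audio element. The clip list, its begin/end field names, and the functions locateClip and playClips are all hypothetical. The third clip’s times are made up for illustration.

```javascript
// Hypothetical clip list: 10s of audio-1, then 20s of audio-2, then back to audio-1.
const clips = [
  { src: "audio-1.mp3", begin: 0, end: 10 },
  { src: "audio-2.mp3", begin: 0, end: 20 },
  { src: "audio-1.mp3", begin: 10, end: 25 }, // assumed times
];

// Map a position on the combined timeline (in seconds) to a clip and an
// offset inside that clip's file. Returns null past the end of the list.
function locateClip(clipList, position) {
  let elapsed = 0;
  for (let index = 0; index < clipList.length; index++) {
    const clip = clipList[index];
    const duration = clip.end - clip.begin;
    if (position < elapsed + duration) {
      return { index, src: clip.src, offset: clip.begin + (position - elapsed) };
    }
    elapsed += duration;
  }
  return null;
}

// Browser-only wiring: play each clip in order on a single audio element.
// timeupdate fires only a few times per second, so clip ends are approximate.
function playClips(audio, clipList) {
  let index = 0;
  const startClip = () => {
    audio.src = clipList[index].src;
    audio.currentTime = clipList[index].begin;
    audio.play();
  };
  audio.addEventListener("timeupdate", () => {
    if (audio.currentTime < clipList[index].end) {
      return;
    }
    if (index + 1 < clipList.length) {
      index += 1;
      startClip();
    } else {
      audio.pause();
    }
  });
  startClip();
}
```

For example, locateClip(clips, 15) lands 5 seconds into audio-2.mp3. SMIL expresses the same sequence declaratively, which is the convenience that is lost here.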
Anyway, just wanted to share. Discussion welcome!
Marisa
1. https://w3c.github.io/sync-media-pub/sync-media.html
Received on Wednesday, 10 November 2021 21:05:55 UTC