Re: Synchronized narration work from the Sync Media Pub CG

Hi,

This is interesting work, particularly in the context of the work we’re doing in the Audio Description Community Group, which is related but different.

Perhaps in answer to Chaals’s question: there’s a technical difference here that feels like something we ought to be able to work around, but it is actually fundamental to the way that formats like WebVTT and TTML work, and it is what makes them a poor fit for this use case.

Whereas timed text formats are predicated on media times, the requirement for this application, if I’ve understood it correctly, is to link the audio rendition to element content. That is why clicking on some text plays its audio directly: the text isn’t acting as a play-head-position link, i.e. “on click, move the play head to 13.2s”. There is also a mode of operation where the consecutive elements’ linked audio is played back to back, as if it were continuous media, with each separate snippet of text highlighted in turn as its audio plays.
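To make the distinction concrete, here is a minimal sketch of the element-anchored model (my own illustration; the entry shape and names are not the CG’s actual schema):

    // Each entry pairs an element fragment with an audio media fragment.
    const narration = [
      { text: "#para-001", audio: "chapter1.mp3#t=0.0,5.2" },
      { text: "#para-002", audio: "chapter1.mp3#t=5.2,9.8" }
    ];
    // Clicking an element plays just its clip; no global play head moves.
    function playFor(elementId) {
      const entry = narration.find(e => e.text === "#" + elementId);
      if (!entry) return;
      const [src, range] = entry.audio.split("#t=");
      const [start, end] = range.split(",").map(Number);
      const audio = new Audio(src);
      audio.addEventListener("loadedmetadata", () => {
        audio.currentTime = start;
        audio.play();
      });
      audio.addEventListener("timeupdate", () => {
        if (audio.currentTime >= end) audio.pause();
      });
    }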

I’ve seen plenty of examples of timed transcripts of videos where the text at the play head time is highlighted and the user can click anywhere in the transcript to make the play head jump there, but I think there’s a semantic mismatch between that experience and this one: here the text content is the primary thing, not the video or audio.
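The usual wiring for that transcript pattern treats the media timeline as primary; a rough sketch (assuming a <track> whose cue ids match the transcript’s span ids):

    const video = document.querySelector("video");
    const track = video.textTracks[0];
    track.mode = "hidden"; // we render the highlight ourselves
    track.addEventListener("cuechange", () => {
      document.querySelectorAll(".speaking")
        .forEach(el => el.classList.remove("speaking"));
      for (let i = 0; i < track.activeCues.length; i++) {
        const span = document.getElementById(track.activeCues[i].id);
        if (span) span.classList.add("speaking");
      }
    });
    // Clicking transcript text only moves the play head.
    document.querySelectorAll("#transcript [data-start]").forEach(span => {
      span.addEventListener("click", () => {
        video.currentTime = Number(span.dataset.start);
      });
    });

Everything hangs off the play head; the text is just a remote control for it.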

I also recall a use case discussed previously where, in a paged view of the text, a sound effect is played each time there’s a page turn. This approach is amenable to that, assuming the same audio fragment can be linked from different elements in the source document.
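A sketch of what a player might do on each turn (the "pageturn" event name and the asset name are hypothetical):

    const turnSfx = new Audio("page-turn.mp3");
    document.addEventListener("pageturn", () => {
      turnSfx.currentTime = 0; // restart if a turn interrupts the clip
      turnSfx.play();
    });

The point is that one clip can serve every page boundary.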

I’d love to see a single approach that could make both use cases work, but I’m not sure what it would look like. SMIL probably got very close with its event-based model, in which events can be generated by a position on a media playback timeline or fired by some other API; but I sense that the willingness to implement all of the complexity involved in SMIL may be dwindling, and in any case it doesn’t by itself resolve the problem of how to express both timing and element-content event triggering in a timed text document format.

Kind regards,

Nigel


From: "Charles 'chaals' (McCathie) Nevile" <chaals@yandex.ru<mailto:chaals@yandex.ru>>
Date: Wednesday, 12 June 2019 at 09:58
To: "public-web-and-tv@w3.org<mailto:public-web-and-tv@w3.org>" <public-web-and-tv@w3.org<mailto:public-web-and-tv@w3.org>>, "public-audio-description@w3.org<mailto:public-audio-description@w3.org>" <public-audio-description@w3.org<mailto:public-audio-description@w3.org>>, Marisa DeMeglio <marisa.demeglio@gmail.com<mailto:marisa.demeglio@gmail.com>>
Cc: Daniel Weck <daniel.weck@gmail.com<mailto:daniel.weck@gmail.com>>
Subject: Re: Synchronized narration work from the Sync Media Pub CG
Resent-From: <public-audio-description@w3.org<mailto:public-audio-description@w3.org>>
Resent-Date: Wednesday, 12 June 2019 at 09:59

Hi,

Thanks for the pointer.

I'm curious why you wouldn't use e.g. WebVTT or another existing markup that carries the associations between text and audio renderings.

cheers

Chaals

On Wed, 12 Jun 2019 02:12:26 +0200, Marisa DeMeglio <marisa.demeglio@gmail.com> wrote:

Hi all,

Chris Needham suggested that I share with you what we’ve been working on in the Synchronized Media for Publications CG - it’s a lightweight JSON format for representing pre-recorded narration synchronized with HTML, to provide an accessible reading experience. The primary use case is in web publications (we are involved in the Publishing WG), but it has been designed to live as a standalone “overlay” for HTML documents. Below are some links to the latest drafts.
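To give a flavour of the format, an overlay looks roughly like this (an abridged illustration with made-up ids and clip times; see the narration draft linked in the forwarded message for the authoritative shape):

    {
      "role": "document",
      "narration": [
        { "text": "#paragraph-001", "audio": "chapter1.mp3#t=0.0,5.2" },
        { "text": "#paragraph-002", "audio": "chapter1.mp3#t=5.2,9.8" }
      ]
    }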

And, just to give you an idea of the basic user experience, here are two slightly different proof-of-concept demos for the sample in our repository:

- https://raw.githack.com/w3c/sync-media-pub/master/samples/single-document/index.html
- https://raw.githack.com/w3c/sync-media-pub/feature/custom-read-aloud-player/samples/single-document/index.html

Interested in hearing your thoughts!

Marisa DeMeglio
DAISY Consortium


Begin forwarded message:

From: Marisa DeMeglio <marisa.demeglio@gmail.com>
Subject: Drafts and a sample
Date: June 8, 2019 at 6:58:13 PM PDT
To: W3C Synchronized Multimedia for Publications CG <public-sync-media-pub@w3.org>

Hi all,

As we discussed at the web publications F2F last month, we have some drafts up for review:
https://w3c.github.io/sync-media-pub/

Have a look specifically at the proposed synchronization format:
https://w3c.github.io/sync-media-pub/narration.html

And how to include it with web publications:
https://w3c.github.io/sync-media-pub/packaging.html

I’ve extracted the issues from our previous drafts and discussions, and put them in the tracker:
https://github.com/w3c/sync-media-pub/issues

I also started putting together a sample and playing around with some ideas for a simple proof-of-concept for playback:
https://github.com/w3c/sync-media-pub/tree/master/samples/single-document

(For anyone really interested and wanting to dig in: it needs to be more clever about how it uses the audio API; the granularity of the browser’s timeupdate event isn’t very good.)
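A common workaround would be to poll audio.currentTime from requestAnimationFrame rather than waiting for timeupdate, which most browsers fire only every 250ms or so; a minimal sketch, with an illustrative list of clip start times:

    // `starts` is a sorted array of clip start times in seconds.
    function watchHighlights(audio, starts, onEnter) {
      let current = -1;
      (function tick() {
        const t = audio.currentTime;
        let i = -1;
        while (i + 1 < starts.length && starts[i + 1] <= t) i++;
        if (i !== current) { current = i; onEnter(i); } // entered a new clip
        // A real player would re-arm this loop on the audio's "play" event.
        if (!audio.paused && !audio.ended) requestAnimationFrame(tick);
      })();
    }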

Please feel free to comment, propose solutions, and otherwise share your thoughts.

Thanks
Marisa




--
Using Opera's mail client: http://www.opera.com/mail/
