Re: Synchronized narration work from the Sync Media Pub CG

Hi Nigel

Hm. Yes, paging makes it more of a dynamic problem. You don't know the page
boundaries until you are almost there. However, at some point you will know
or will be able to calculate the page boundaries, and then the sequencer
could be handy after all. The Sequencer is specifically made to support
dynamic datasets, so if you have to recalculate your cues based on page
layout changes or similar events, that would be a perfect job for it.
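
To make that concrete, here is a very rough sketch in plain JavaScript. The
API names (TimingObject, Sequencer, Interval, addCue/removeCue, "enter"
events) are approximations of our timingsrc API and may not match the
current library exactly, and totalLength, computePageBoundaries and
playPageTurnSound are just stand-ins for whatever the reading system
provides:

    // Sketch only. The position axis is a content offset (e.g. character
    // offset into the publication), not time.
    const to = new TimingObject({ range: [0, totalLength] });
    const sequencer = new Sequencer(to);

    // (Re)build one cue per page from the current pagination.
    // Call this again whenever the layout changes and the boundaries move.
    function rebuildPageCues(pageBoundaries) {    // e.g. [0, 812, 1630, ...]
      for (const cue of sequencer.cues()) {
        sequencer.removeCue(cue.key);
      }
      pageBoundaries.forEach((start, i) => {
        const end = (i + 1 < pageBoundaries.length) ? pageBoundaries[i + 1]
                                                    : totalLength;
        sequencer.addCue("page-" + i, new Interval(start, end), { page: i });
      });
    }

    // The same handler fires on every boundary crossing, whichever page it is.
    sequencer.on("enter", (cue) => playPageTurnSound(cue.data.page));

    // Recalculate after reflow; the sequencer copes with the dynamic update.
    window.addEventListener("resize", () =>
      rebuildPageCues(computePageBoundaries()));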

Timing objects aren't just for linear, fixed-rate movement. We use them for
discrete movements (e.g. next page/chapter). We've also used them in
circumstances where movement happens at an irregular rate (e.g. scroll
movements or mouse pointers), by letting the timing object represent a
linear approximation of the irregular movement, or, in the worst case, by
simply representing it as high-frequency discrete steps.
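
Again just a sketch (same caveats as above; update(vector) is how a timing
object's position/velocity is set in our model, and pageBoundaries is the
array used in the previous snippet):

    // Discrete movement: jump the timing object to the start of the next
    // page or chapter, with velocity 0 so it stays there.
    function goToPage(pageIndex) {
      to.update({ position: pageBoundaries[pageIndex], velocity: 0 });
    }

    // Irregular movement (e.g. scrolling): refit a linear approximation
    // every 250 ms. The axis here is scroll offset, so in practice this
    // drives a separate timing object from the page-offset one above; in
    // the worst case one can simply step the position at high frequency
    // instead of estimating velocity.
    const scrollTiming = new TimingObject();
    let lastPos = window.scrollY;
    setInterval(() => {
      const pos = window.scrollY;
      scrollTiming.update({ position: pos, velocity: (pos - lastPos) / 0.25 });
      lastPos = pos;
    }, 250);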

I don't want to make any claims regarding the appropriateness of the
timing model in the content domain; you are quite right that it is devised
primarily for timed media.

On the other hand, we have fairly practical solutions for use cases that
appear quite similar to use cases brought up as challenging in this domain,
so I thought I should mention that.

Cheers,

Ingar


On Wed, 12 Jun 2019 at 15:48, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:

> Hi Ingar,
>
> Thanks for the suggestion. Right now, it doesn’t fit my world view:
> whereas sequencers and timing objects seem well suited to a linear
> progression, the publishing, content-based domain does not seem to follow one.
>
> My example use case here is that one about paged presentation. At
> authoring time you don’t necessarily know where the page boundaries will
> fall. But you might want to trigger an event (the same one every time)
> based on traversal of a page boundary. That doesn’t look like returning to
> the same point on the sequence/time line each time, it looks like inserting
> events on the sequence based on the content and the rendering system, and
> potentially changing those every time the layout changes.
>
> Also, the progression through the media isn’t at a constant rate.
>
> To my mind (which always likes to think it is open to changing!) the
> mismatch between the timing/sequence model and the content model is just
> too big to justify coercing them to be the same. It could be that there’s a
> useful abstraction that can be specialised to either one case or the other.
> I just haven’t seen it yet.
>
> Nigel
>
> From: Ingar Mæhlum Arntzen <ingar.arntzen@gmail.com>
> Date: Wednesday, 12 June 2019 at 14:06
> To: Nigel Megitt <nigel.megitt@bbc.co.uk>
> Cc: "Charles 'chaals' (McCathie) Nevile" <chaals@yandex.ru>, "
> public-web-and-tv@w3.org" <public-web-and-tv@w3.org>, "
> public-audio-description@w3.org" <public-audio-description@w3.org>,
> Marisa DeMeglio <marisa.demeglio@gmail.com>, Daniel Weck <
> daniel.weck@gmail.com>
> Subject: Re: Synchronized narration work from the Sync Media Pub CG
>
> Hi Nigel.
>
> If I understand you correctly, the two modes you are referring to are
> playback of items in the time domain and playback in some other domain,
> e.g. a content-ordering domain.
>
> Using timing objects and sequencers it is pretty straightforward to
> support both modes (as well as shifting dynamically between them). For
> instance, we use independent timing objects for the different domains where
> we need playback, and then we register the data with sequencers driven by
> the different timing objects. The UI can then dispatch interaction events
> to the correct playback controls (i.e. the timing object(s)).
>
> I don't think there are any particular requirements on the data
> representation to make this work. If you can deduce how a particular data
> item relates to a given axis (time, ordering or whatever), then you
> register sequencer cues for the item, and have enter/exit events delivered
> at the correct time.
>
> In many applications it would then be a matter of opinion which counts as
> "primary" content. For instance, with a slide show presentation synced to a
> video (of the slide-show presenter) you could navigate the session by video
> controls whenever that makes sense, or slide-show controls in the
> interactive slide viewer (on a different device). One would also want to
> support dynamic switching as you leave a "primary-mode-reading" and enter
> "primary-mode-listen", perhaps because your circumstances change
> mid-session.
>
> I guess a key source for this flexibility in the timing object model is
> the insight that no media should be "primary" in a technical sense,
> implying that media control must be separate from media.
>
> Ingar Arntzen
> Multi-device Timing Community Group
>
>
>
>
>
>
>
>
> On Wed, 12 Jun 2019 at 13:59, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
>
>> Hi,
>>
>> This is interesting work, and I’m particularly interested in the context
>> of work we’re doing in the Audio Description Community Group, which is
>> related but different.
>>
>> Perhaps in answer to Chaals’s question, there’s a technical difference
>> here that feels like one we ought to be able to work around, but it is
>> actually quite fundamental to the way that WebVTT and TTML work, for
>> example, though not to the way this use case works.
>>
>> Whereas timed text formats are predicated on media *times*, the
>> requirement for this application, if I’ve understood it correctly, is to
>> link the audio rendition to element content. This is why it works that you
>> can click on some text and hear the audio for that directly. The text isn’t
>> acting as a play-head-position link, i.e. “on click, move play head to
>> 13.2s”, but there is a mode of operation where you hear the consecutive
>> elements’ linked audio being played consecutively, as if it is continuous
>> media, with a link back to the highlighting of each separate snippet of
>> text.
>>
>> I’ve seen plenty of examples of timed transcripts of videos where the
>> text at the play head time is highlighted and the user can click on any
>> random place in the transcript to make the play head jump there, but I
>> think there’s a semantic mismatch between that experience and this one –
>> here the text content is the primary thing, not the video/audio.
>>
>> I recall a use case being discussed previously as well where, on a paged
>> view of the text, a sound effect can be played each time there’s a page
>> turn. This approach is amenable to that, assuming that the same fragment id
>> can be reused on different elements in the source document.
>>
>> I’d love to see a single approach that could make both use cases work,
>> but I’m not sure what it would look like. SMIL probably got somewhere very
>> close, using the event based model, where events could be generated by time
>> on a media playback timeline or via some other API that fires them, but I
>> sense that the willingness to implement all of the complexity involved in
>> SMIL may be dwindling, and in any case, it doesn’t by itself resolve the
>> problem of how to express timing *and* element content event triggering
>> in a timed text document format.
>>
>> Kind regards,
>>
>> Nigel
>>
>>
>> From: "Charles 'chaals' (McCathie) Nevile" <chaals@yandex.ru>
>> Date: Wednesday, 12 June 2019 at 09:58
>> To: "public-web-and-tv@w3.org" <public-web-and-tv@w3.org>, "
>> public-audio-description@w3.org" <public-audio-description@w3.org>,
>> Marisa DeMeglio <marisa.demeglio@gmail.com>
>> Cc: Daniel Weck <daniel.weck@gmail.com>
>> Subject: Re: Synchronized narration work from the Sync Media Pub CG
>> Resent-From: <public-audio-description@w3.org>
>> Resent-Date: Wednesday, 12 June 2019 at 09:59
>>
>> Hi,
>>
>> Thanks for the pointer.
>>
>> I'm curious why you wouldn't use e.g. WebVTT or another existing markup
>> that carries the associations between text and audio renderings.
>>
>> cheers
>>
>> Chaals
>>
>> On Wed, 12 Jun 2019 02:12:26 +0200, Marisa DeMeglio <
>> marisa.demeglio@gmail.com> wrote:
>>
>> Hi all,
>>
>> Chris Needham suggested that I share with you what we’ve been working on
>> in the Synchronized Media for Publications CG - it’s a lightweight JSON
>> format for representing pre-recorded narration synchronized with HTML, to
>> provide an accessible reading experience. The primary use case is in web
>> publications (we are involved in the Publishing WG), but it has been
>> designed to live as a standalone “overlay” for HTML documents. Below are
>> some links to the latest drafts.
>>
>> And, just to give you an idea of the basic user experience, here are two
>> slightly different proof of concept demos for the sample in our repository:
>>
>> -
>> https://raw.githack.com/w3c/sync-media-pub/master/samples/single-document/index.html
>> -
>> https://raw.githack.com/w3c/sync-media-pub/feature/custom-read-aloud-player/samples/single-document/index.html
>>
>>
>> Interested in hearing your thoughts!
>>
>> Marisa DeMeglio
>> DAISY Consortium
>>
>>
>> Begin forwarded message:
>>
>> *From: *Marisa DeMeglio <marisa.demeglio@gmail.com>
>> *Subject: **Drafts and a sample*
>> *Date: *June 8, 2019 at 6:58:13 PM PDT
>> *To: *W3C Synchronized Multimedia for Publications CG <
>> public-sync-media-pub@w3.org>
>>
>> Hi all,
>>
>> As we discussed at the web publications F2F last month, we have some
>> drafts up for review:
>> https://w3c.github.io/sync-media-pub/
>>
>> Have a look specifically at the proposed synchronization format:
>> https://w3c.github.io/sync-media-pub/narration.html
>>
>> And how to include it with web publications:
>> https://w3c.github.io/sync-media-pub/packaging.html
>>
>> I’ve extracted the issues from our previous drafts and discussions, and
>> put them in the tracker:
>> https://github.com/w3c/sync-media-pub/issues
>>
>> I also started putting together a sample and playing around with some
>> ideas for a simple proof-of-concept for playback:
>> https://github.com/w3c/sync-media-pub/tree/master/samples/single-document
>>
>> (for anyone really interested and wanting to dig in: it needs to be more
>> clever about how it uses the audio API - the granularity of timeupdate in
>> the browser isn’t very good).
>>
>> Please feel free to comment, propose solutions, and otherwise share your
>> thoughts.
>>
>> Thanks
>> Marisa
>>
>>
>>
>>
>>
>> --
>> Using Opera's mail client: http://www.opera.com/mail/
>>
>>

Received on Wednesday, 12 June 2019 14:31:17 UTC