Re: Media Capture and Streams Last Call review; deadline May 15 ( LC-3013) from Dominique Hazael-Massieux on 2015-08-17 (public-media-capture@w3.org from August 2015)

From: Dominique Hazael-Massieux <dom@w3.org>
Date: Mon, 17 Aug 2015 14:50:08 +0200
To: Nigel Megitt <nigel.megitt@bbc.co.uk>, "public-media-capture@w3.org" <public-media-capture@w3.org>, "public-pfwg@w3.org" <public-pfwg@w3.org>
Message-ID: <55D1D880.3050505@w3.org>

Hi Nigel,

Any feedback on my reply below? I would like to close the loop on this
if possible. At the very least, it would be useful to know whether you
intend to formally object to our disposition of your comment, so that we
can work out what our next steps in progressing the specification need
to be.

Thanks,

Dom

On 21/07/2015 15:02, Dominique Hazael-Massieux wrote:
> Hi Nigel,
>
> On 10/07/2015 16:11, Nigel Megitt wrote:
>>
>> Thank you for your response to my comment. I agree that the current WD
>> does not deal with streams of data related to the media, so in that sense
>> you have provided an accurate answer. However I am far from certain that
>> this is acceptable. As far as I can see, this constraint prevents WebRTC
>> both from being augmented with accessibility data, for example
>> subtitles/captions, and from being augmented with other data-based
>> functionality such as the display of text or graphics not associated with
>> accessibility.
>
> The fact that this particular API doesn't provide the necessary hooks
> doesn't imply it's not doable with WebRTC.
>
> Indeed, for something like subtitling and captioning, you can already
> re-use the existing synchronization mechanisms provided by HTML media
> elements (e.g. ontimeupdate events) to display text synchronously with
> the content captured via getUserMedia.
>
> You could even use WebRTC data channels to transmit these captions if
> they are sourced from the same browser as the video/audio are.
>
> But the specific API we're talking about (Media Capture and Streams) is
> not specific to WebRTC; it strictly focuses on capturing media streams,
> and formalizing their synchronization semantics, not how they can be
> then transmitted or possibly synchronized with other out-of-band content.
>
>> I note that the Working Group Charter lists a dependency on WAI Protocols
>> and Formats Working Group: "Reviews from the WAI PF Working Group will be
>> required to ensure the APIs allow to create an accessible user
>> experience."
>
> We've solicited feedback from the WAI PFWG both directly and via the
> HTML Accessibility Task Force, but haven't heard back so far. I'm trying
> to get information as to whether we should expect any.
>
>> I am not a member of WAI PFWG but have copied in
>> public-pfwg@w3.org to this message to ensure they have visibility of my
>> comment: at present I believe that the APIs do not "allow to create an
>> accessible user experience."
>
> If you're talking specifically about synchronizing subtitles or
> captions, I think the APIs, taken with the rest of the platform, do
> allow to create an accessible user experience.
>
> If you're thinking of some other use cases, could you clarify which ones?
>
> If you don't think my assumptions about the possibility of using
> synchronization events for captions/subtitles for an accessible user
> experience hold, could you describe in more detail why they're not
> sufficient? This would go a long way toward understanding what we would
> need to change in the API.
>
>> I would suggest it should be a matter of priority for the Working Group
>> to consider adding this capability. You request a proposal for a specific
>> solution for this. One possible solution would be to extend the
>> MediaStreamTrack.kind attribute to permit the value "data" and to have a
>> further more specific type so that user agents can process data tracks
>> successfully.
>
> But why would they need to be put into a MediaStreamTrack object when
> they're not media content? What benefit is there in trying to fit them
> into that structure instead of keeping them as out-of-band data?
>
>> It may also be helpful or necessary to expose a common clock
>> with which such data may be synchronised - further design work to
>> establish the importance of this would be needed.
>
> I believe that for captioning, the clock provided by ontimeupdate
> provides sufficient accuracy; but again, I may be missing something
> here, so I would welcome your input as to why it would not.
>
>> An example of the usage scenario could be the provision of a sequence of
>> TTML or WebVTT documents which, on presentation, provide
>> subtitles/captions for the video or audio content. This could be achieved
>> by having a MediaStreamTrack of kind "data" and subtype "ttml+xml" in the
>> case of TTML.
>
> Clearly being able to play TTML or WebVTT documents along with playing a
> video or audio obtained from a MediaStream is useful; but why would they
> need to be provided in the same container as the media stream itself? As
> far as I know, for other video sources, these documents are provided out
> of band and synchronized by the client; this should apply to media
> streams obtained from getUserMedia as well, without having to force them
> into a MediaStream structure for which they're not suited.
>
> Thanks for working with us on this! If it would be helpful to have a
> call to make faster progress or discuss some ideas in more detail, let
> me know!
>
> Dom
>
>
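
To make the ontimeupdate suggestion in the quoted reply more concrete, here is a minimal sketch of how out-of-band captions might be displayed against a media element playing a captured stream. It is only an illustration under stated assumptions: the names Cue, TimedMedia, activeCues and attachCaptions are hypothetical and come from neither the thread nor any specification, and cue times are assumed to be expressed in seconds on the element's own playback timeline. TimedMedia is a stand-in for the two members of an HTML media element that the sketch relies on (currentTime and the ontimeupdate handler), so that the logic can be exercised without a browser.

```typescript
// Hypothetical cue shape: start/end are seconds on the media element's timeline.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Stand-in for the parts of an HTML media element used by this sketch.
interface TimedMedia {
  currentTime: number;
  ontimeupdate: (() => void) | null;
}

// Return the cues whose [start, end) interval contains the given time.
function activeCues(cues: readonly Cue[], time: number): Cue[] {
  return cues.filter((cue) => cue.start <= time && time < cue.end);
}

// On every timeupdate, hand the text of the currently active cues to `render`.
// `render` is supplied by the caller, e.g. to update a caption overlay.
function attachCaptions(
  media: TimedMedia,
  cues: readonly Cue[],
  render: (text: string) => void
): void {
  media.ontimeupdate = () => {
    const text = activeCues(cues, media.currentTime)
      .map((cue) => cue.text)
      .join("\n");
    render(text);
  };
}
```

In a page, the caller would pass an object exposing the element's currentTime and ontimeupdate, plus a render callback that writes into whatever overlay is used for captions. How often timeupdate fires is up to the user agent, so whether this granularity is sufficient for a given captioning use case is exactly the open question raised in the thread.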
Received on Monday, 17 August 2015 12:50:14 UTC