Re: Summary of "What is missing for building real services" thread

Text streams are really a subclass of data streams. They could be sent
as WebVTT, but in live captioning situations you will not find much
time to do any text formatting. The use of WebVTT in live situations
is more likely to be useful for streaming, where WebVTT is provided
either in-band or in chunks via MSE (or HTTP adaptive streaming
approaches).
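To illustrate the recording/streaming side, here is a minimal sketch of serializing timed caption cues into a WebVTT text chunk. The `toWebVTT` and `formatTime` helpers and the cue shape (`{start, end, text}`, times in seconds) are my own assumptions for illustration, not any standard API:

```javascript
// Hypothetical helper: format seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function formatTime(seconds) {
  const h = String(Math.floor(seconds / 3600)).padStart(2, "0");
  const m = String(Math.floor((seconds % 3600) / 60)).padStart(2, "0");
  const s = (seconds % 60).toFixed(3).padStart(6, "0");
  return `${h}:${m}:${s}`;
}

// Hypothetical helper: serialize an array of cues into one WebVTT chunk,
// e.g. for appending via MSE or writing out a recording.
function toWebVTT(cues) {
  const body = cues
    .map(c => `${formatTime(c.start)} --> ${formatTime(c.end)}\n${c.text}`)
    .join("\n\n");
  return `WEBVTT\n\n${body}\n`;
}

console.log(toWebVTT([
  { start: 0, end: 2.5, text: "Hello everyone" },
  { start: 2.5, end: 5, text: "Welcome to the session" },
]));
```

Note this produces a complete standalone WebVTT file; chunked delivery over MSE has additional segmentation requirements not shown here.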

In the case of video conferencing, captions would likely be
transmitted just via a data channel with individual letters or words
sent as soon as they are typed. WebVTT could come in for recording
here. I doubt it would be useful for transmission.
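A sketch of the per-word data-channel approach described above, with the send path factored out as a plain callback so the buffering logic stands alone. The `send` parameter stands in for `RTCDataChannel.send()`; `makeCaptionSender` and its word-boundary rule are illustrative assumptions, not part of WebRTC:

```javascript
// Hypothetical sketch: buffer typed characters and push each word to the
// remote side as soon as it is complete. `send` stands in for
// RTCDataChannel.send().
function makeCaptionSender(send) {
  let buffer = "";
  return {
    // Called for every character the captioner types.
    type(ch) {
      if (ch === " " || ch === "\n") {
        if (buffer) send(buffer);
        buffer = "";
      } else {
        buffer += ch;
      }
    },
    // Flush any partial word, e.g. at end of an utterance.
    flush() {
      if (buffer) send(buffer);
      buffer = "";
    },
  };
}

// Usage: collect the words a receiver would see.
const received = [];
const sender = makeCaptionSender(w => received.push(w));
for (const ch of "live captions here") sender.type(ch);
sender.flush();
console.log(received);
```

For letter-by-letter display you would instead call `send(ch)` directly per keystroke; the trade-off is message overhead versus latency.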


On Thu, Jan 16, 2014 at 11:19 AM, Rob Manson <> wrote:
> I definitely think there's a role for metadata tracks - and perhaps that
> relates more closely to VTT too (Silvia may like to comment on that).
> But I think that's a separate discussion - happy to have it added to the
> list though.
> roBman
> On 16/01/14 11:06 AM, wrote:
>>> At the moment the only streams that can be used are the ones generated by
>>> gUM. It would also be useful to be able to add post-processed streams (or
>>> probably more accurately MediaStreamTracks), e.g. use face tracking to
>>> mask a person's face, or use object detection to highlight a specific
>>> object, etc.
>>> At the moment the only way to send this data to the remote client is via
>>> another channel (e.g. DC or WS)... and syncing is definitely an issue
>>> there.
>> Now that you say this: TextStreams and (uni-directional) DataStreams,
>> for example on-wire video subtitles or metadata (identifiers,
>> coordinates...) of the detected objects, so that the info can be sent
>> synced to the other end. Theoretically, they should just be subclasses
>> of MediaStreamTracks, and we have already had problems at my job
>> trying to sync that info...

Received on Thursday, 16 January 2014 02:21:08 UTC