
Re: Streaming of WebVTT

From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
Date: Wed, 26 Sep 2012 17:42:14 +0200
Message-ID: <50632256.6040708@telecom-paristech.fr>
To: public-texttracks@w3.org
Hi Silvia,

On 9/19/2012 3:11 PM, Silvia Pfeiffer wrote:
> Hi Cyril,
> I only just noticed your post here - I saw your blog post earlier
> though and have already posted a reply there.
> Feel free to take the discussion here if you prefer, or continue it on your blog.
Thank you for your comments, and in particular for registering the Chrome
bug and for pointing me to the long (sometimes off-topic) discussion
on live WebVTT streaming.

> I actually think there are two cases that we have to regard separately 
> for streaming WebVTT: one is where we have the WebVTT file grow as an 
> individual resource independent of the video, and the other is where 
> WebVTT is provided in-band.
I agree with you. I called them live (non-broadcast) and broadcast (not 
necessarily live).

> I think the first case where the WebVTT file is just a text file that 
> grows is not too difficult to resolve. The video player would connect 
> to the video stream at a certain offset, get the time offset of that 
> time and then get the WebVTT file and display the cues from that time 
> offset onwards. Right now, this is possible in a Web browser when 
> writing the code for pulling a WebVTT file that continues to grow on 
> the server through XHR in JS. I don't think, though, that the browsers 
> will do the right thing with a track element yet. That's why we have a 
> bug at the W3C: https://www.w3.org/Bugs/Public/show_bug.cgi?id=14104.
This does seem simple, but it is not. Getting the offset from the video
and finding the corresponding cues assumes that there is an easy mapping
between the two. For instance, when you get a live video feed, the
timestamp of the first video frame that the player receives can be any
value. You need to map that timestamp to a presentation time whose
origin is shared with the VTT file. As indicated here
https://www.w3.org/Bugs/Public/show_bug.cgi?id=14104#c3, the currentTime
in HTML5 can't give you that. For instance, in an MPEG-2 broadcast, if
you join 30 seconds after the beginning of the session (and the client
is not aware of those 30 s), how do you map the PCR value X (!= 30) of
the first video frame to 30 s? Apparently, the HLS solution uses a new
attribute (http://tools.ietf.org/html/draft-pantos-http-live-streaming-09).
In MPEG DASH, the mapping is at the core of the spec: from the
presentation time you get the video segment, and from the same
presentation time you get a VTT segment.
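
To make the mapping problem concrete, here is a minimal sketch in Python.
It assumes that the client receives, out of band, one anchor pairing a
media timestamp with a presentation time on the timeline shared with the
VTT file. The TimelineAnchor type and to_presentation_time function are
hypothetical names used only for illustration; they are not taken from
HLS, DASH, or any browser API. The sketch also assumes 33-bit timestamps
counted at 90 kHz, as used for MPEG-2 PTS values and the PCR base.

```python
from dataclasses import dataclass

CLOCK_HZ = 90_000   # tick rate assumed for the media timestamps
WRAP = 1 << 33      # a 33-bit counter wraps after 2**33 ticks


@dataclass(frozen=True)
class TimelineAnchor:
    """Hypothetical out-of-band signal: media_ticks occurs at presentation_s."""
    media_ticks: int
    presentation_s: float


def to_presentation_time(media_ticks: int, anchor: TimelineAnchor) -> float:
    """Map a raw media timestamp onto the timeline shared with the VTT file."""
    # The modular difference tolerates one counter wrap after the anchor.
    delta = (media_ticks - anchor.media_ticks) % WRAP
    return anchor.presentation_s + delta / CLOCK_HZ


# The client joins 30 s into the session; the first frame carries an
# arbitrary timestamp (here, one second before the counter wraps).
anchor = TimelineAnchor(media_ticks=WRAP - CLOCK_HZ, presentation_s=30.0)
five_seconds_later = (anchor.media_ticks + 5 * CLOCK_HZ) % WRAP
print(to_presentation_time(five_seconds_later, anchor))
```

Without such an anchor the client only knows the raw timestamp, which is
exactly the missing piece described above.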

> If at a later stage somebody was to extract a WebVTT file again from a 
> multiplexed and recorded live stream, the repeated cues need to be 
> thrown away and would thus not pollute the text WebVTT file.
Regarding the repetition, I agree with you, but my point was to show
that if you split the overlapping cues (and order them properly), the
result can be: played without throwing anything away; random-accessed
easily; and played seamlessly (no gap between cues), identically to what
it was before splitting. The cost is an increase in file size, which
depends on the number of RAPs you want to have. It also works in
unmultiplexed streams.
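
As a rough illustration of the splitting idea, the Python sketch below
cuts every cue at each random access point (RAP) that falls strictly
inside it and then orders the pieces by start time. It is a simplified
sketch under my own assumptions, not necessarily the exact procedure used
in the blog post: the Cue type and split_at_boundaries function are
hypothetical names, and cue settings and identifiers are ignored.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


def split_at_boundaries(cues: List[Cue], boundaries: List[float]) -> List[Cue]:
    """Cut each cue at every boundary strictly inside it, then sort by start."""
    pieces = []
    for cue in cues:
        cuts = [b for b in sorted(boundaries) if cue.start < b < cue.end]
        edges = [cue.start, *cuts, cue.end]
        # Consecutive edges give abutting pieces, so no gap is introduced.
        for piece_start, piece_end in zip(edges, edges[1:]):
            pieces.append(Cue(piece_start, piece_end, cue.text))
    return sorted(pieces, key=lambda c: (c.start, c.end))


cues = [Cue(0.0, 12.0, "long cue"), Cue(3.0, 5.0, "short cue")]
for piece in split_at_boundaries(cues, boundaries=[10.0]):
    print(piece)
```

Each piece that crosses a RAP repeats the cue text, which is where the
file size increase comes from: more RAPs mean more repeated text.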


> Cheers,
> Silvia.
> On Wed, Sep 19, 2012 at 7:24 AM, Cyril Concolato
> <cyril.concolato@telecom-paristech.fr> wrote:
>> Hi all,
>> As a follow up of this thread, I wrote a blog post about some further
>> experiments I made towards the streaming of WebVTT:
>> http://concolato.wp.mines-telecom.fr/2012/09/12/webvtt-streaming/
>> In summary, it seems possible to generate WebVTT streams, with good random
>> access properties, that can be delivered in chunks and still be processed by
>> standard browsers.
>> Comments are welcome.
>> Cyril
>> --
>> Cyril Concolato
>> Maître de Conférences/Associate Professor
>> Groupe Multimedia/Multimedia Group
>> Telecom ParisTech
>> 46 rue Barrault
>> 75 013 Paris, France
>> http://concolato.wp.mines-telecom.fr/

Cyril Concolato
Maître de Conférences/Associate Professor
Groupe Multimedia/Multimedia Group
Telecom ParisTech
46 rue Barrault
75 013 Paris, France
Received on Wednesday, 26 September 2012 15:42:48 UTC
