- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Fri, 23 Oct 2015 21:07:20 +1100
- To: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Cc: public-texttracks@w3.org, Nigel Megitt <nigel.megitt@bbc.co.uk>
- Message-ID: <CAHp8n2=z_rOXWs3f4XyX2H=JJjA+azhsP7wWgAsH=cB9jqemVA@mail.gmail.com>
On 23 Oct 2015 7:50 pm, "Cyril Concolato" <cyril.concolato@telecom-paristech.fr> wrote:
>
> On 22/10/2015 23:33, Silvia Pfeiffer wrote:
>>
>>
>> Just so we are clear: this already exists and works well for WebVTT.
>>
> I don't know what you mean by 'exists'. There is a pull request ongoing
> but it is not (to my knowledge) implemented in any browser (is it?) or
> authoring tool, which means it can be removed if never implemented.
With "this" I was referring to the creation of a sequence of independent
documents, as requested by Nigel. I think you agree that Apple's use of
WebVTT with HLS works that way. That's all I was referring to.
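
For illustration only (a sketch, using the proposed STYLE block syntax, not
taken from any spec or shipping implementation): in that model each segment
is a complete, self-contained WebVTT file, so a shared style has to be
repeated in every segment:

Segment 1:

WEBVTT

STYLE
::cue(.narration) { color: yellow; }

00:00:00.000 --> 00:00:04.000
<c.narration>First segment's cue

Segment 2:

WEBVTT

STYLE
::cue(.narration) { color: yellow; }

00:00:04.000 --> 00:00:08.000
<c.narration>Second segment's cue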
>>
>> It has limitations though, such as having to repeat styles across the
>> segments, which is why we are discussing alternative approaches. HTH.
>>
> And an alternative approach might in the future replace or complement the
> current approach if the group feels like the limitations are too strong, no?
Yes. I was merely replying to Nigel to explain why this discussion is
happening.
Cheers,
Silvia.
> Cyril
>>
>>
>> Cheers,
>> Silvia.
>>
>> On 23 Oct 2015 12:52 am, "Nigel Megitt" <nigel.megitt@bbc.co.uk> wrote:
>>
>> It would also be possible to take the same approach with VTT as we have
>> taken with TTML, which is that you have a sequence of independent
>> documents, each of which contains the styling etc. needed to display
>> itself, for whatever time period applies. Then you have something
>> deliverable that will work, and you can separate out the problem of
>> creating a single long document that contains "all the previous
>> documents' content" into a different processing task. If you go down the
>> route of timed styles then you're almost at that point anyway.
>>
>> Nigel
>>
>>
>> On 22/10/2015 14:40, "Cyril Concolato" <cyril.concolato@telecom-paristech.fr> wrote:
>>
>> >On 22/10/2015 13:36, Philip Jägenstedt wrote:
>> >> On Thu, Oct 22, 2015 at 10:47 AM, Cyril Concolato
>> >> <cyril.concolato@telecom-paristech.fr> wrote:
>> >>> On 21/10/2015 15:39, Philip Jägenstedt wrote:
>> >>>> In the DASH/MP4/VTT software stack, is WebVTT the input or the
>> >>>> output, and is it a file or a stream? AFAICT, the only issue would
>> >>>> be with a WebVTT input stream (using the syntax in the spec, not any
>> >>>> other framing) with STYLE blocks at the end, but since streaming
>> >>>> standalone WebVTT doesn't exist yet I'm uncertain if that's really
>> >>>> what you mean.
>> >>> These are good questions. It is currently possible to have a
>> >>> never-ending WebVTT file being produced live, delivered over HTTP
>> >>> (e.g. using chunked transfer encoding). Such a WebVTT 'stream' cannot
>> >>> easily be consumed by a browser today because the Streams API is not
>> >>> there yet, but it will be available in the future. Other
>> >>> (non-browser) WebVTT implementations can already use that today. This
>> >>> might require careful creation of cues to ensure that each cue is a
>> >>> random access point, but that's possible today. Several services can
>> >>> be built on that: think of a WebRadio with subtitling. Regarding MP4
>> >>> packaging, an implementation could consume such a stream and produce
>> >>> MP4 segments on the fly, if needed.
>> >>>
>> >>> For those implementations, if a new untimed style header would
>> >>> arrive in the input WebVTT stream, and if such a style were defined
>> >>> to have effects on the whole 'file', i.e. including cues earlier in
>> >>> the 'file', then playing the live stream versus recording the stream
>> >>> and then playing a file would not have the same result. That would be
>> >>> problematic. That's why I think that styles should either be in the
>> >>> header (with the semantics that they are valid for the whole file,
>> >>> and without the ability to appear in between cues), or be a timed
>> >>> block with a well-defined time validity (like cues), or be settings
>> >>> of a cue. For the last two options, WebVTT would really start to look
>> >>> like a multiplex of two types of timed data (cues and styles); I'm
>> >>> not sure we should go in this direction, or whether a separate style
>> >>> file/stream wouldn't be better.
>> >> Do you have a pointer to such a never-ending WebVTT file deployed on
>> >> the public web?
>> >No I don't, but that does not mean that it does not exist, nor that we
>> >should break such a scenario.
>> >> I honestly didn't think they would exist yet.
>> >>
>> >> To be pedantic, the reason that never-ending WebVTT files don't work
>> >> in browsers isn't because of the Streams API, but because the media
>> >> element's readyState cannot reach HAVE_FUTURE_DATA until the text
>> >> tracks are ready:
>> >>
>> >> https://html.spec.whatwg.org/multipage/embedded-content.html#the-text-tracks-are-ready
>> >>
>> >> This is what the spec bug is about, some mechanism to unblock
>> >> readyState before text track parsing has finished:
>> >> https://www.w3.org/Bugs/Public/show_bug.cgi?id=18029
>> >Sorry, I wasn't clear. I know about that bug. I was already assuming
>> >that a web app would fetch the WebVTT (using XHR or fetch, retrieving
>> >the text content as a stream), parse it and produce the cues in JS, not
>> >at all using the native browser support, because of that exact bug.
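
A rough sketch of what that script-based pipeline could look like (just an
illustration; parseCueBlock() is a hypothetical helper standing in for a
real incremental WebVTT parser):

// Fetch the never-ending WebVTT resource, read it incrementally as text,
// and hand parsed cues to a script-created text track.
// (Run inside an async function or ES module for the awaits below.)
const video = document.querySelector('video');
const track = video.addTextTrack('subtitles', 'Live subtitles', 'en');
track.mode = 'showing';

const response = await fetch('live.vtt');  // e.g. chunked transfer encoding
const reader = response.body.getReader();
const decoder = new TextDecoder('utf-8');
let buffer = '';

for (;;) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // Cue blocks are separated by blank lines; keep the last, possibly
  // incomplete, block buffered until more data arrives. (Naive: assumes
  // LF line endings and no blank lines inside a block.)
  const blocks = buffer.split('\n\n');
  buffer = blocks.pop();
  for (const block of blocks) {
    const cue = parseCueBlock(block);      // hypothetical helper
    if (cue) track.addCue(new VTTCue(cue.start, cue.end, cue.text));
  }
}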
>> >>
>> >> Anyway, letting the parser discard style blocks after any cues until
>> >> we've figured out the live streaming issues is OK with me. However,
>> >> let's spell out the implications of keeping this restriction for live
>> >> streams:
>> >I agree that it's the right approach. We should be aware of the
>> >limitations of such an approach.
>> >> If you don't know all of the style up front,
>> >I agree that "if you don't know all of the style up front" you have a
>> >problem to solve. Nigel already pointed that out as being useful in
>> >broadcast, where you don't necessarily know all your styles in advance.
>> >To me, there are 2 main approaches: using timed styles or refreshing
>> >untimed styles.
>> >
>> >By timed styles, I imagine something like:
>> >
>> >00:01:00.000 --> 00:02:00.000 type:style
>> >.myRedClass {
>> > color: red;
>> >}
>> >.myGreenClass {
>> > color: green;
>> >}
>> >
>> >00:01:00.000 --> 00:01:30.000
>> ><v.myGreenClass>Some green text
>> >
>> >00:01:20.000 --> 00:02:00.000
>> ><v.myRedClass>Some red text
>> >
>> >A cue with a 'type' setting whose value is 'style' carries style
>> >content, not text content. This has the advantage of giving precise
>> >timing for the styles, and we can force styles to appear in start-time
>> >order (like cues) and before a cue that has the same start time. There
>> >are probably problems with the syntax (blank lines in CSS, I did not
>> >follow that part of the discussion). Also, if you want to have seekable
>> >streams you probably would have to split cues to remove overlap (nothing
>> >different from normal cues).
>> >
>> >Alternatively, I could also imagine something simpler like:
>> >00:01:00.000 --> 00:01:30.000 style:color:green;
>> >Some green text
>> >
>> >00:01:20.000 --> 00:02:00.000 style:color:red;
>> >Some red text
>> >
>> >Maybe this could be modified to import styles instead of inlining them;
>> >I didn't think about that. Also, as I pointed out in my previous email,
>> >such a VTT file starts to become a multiplex of styles and content. It
>> >may be more appropriate to define a style stream (maybe using the WebVTT
>> >syntax) and to link the style stream with the content stream, either
>> >from the WebVTT content file or from an additional <track> element.
>> >> your only
>> >> recourse is to add a new text track at the point where new style is
>> >> needed.
>> >Without defining timed styles (as above), adding a new text track is an
>> >option, but not the only one: you can use one text track and fill it
>> >with cues coming from different WebVTT files. In the HLS approach, every
>> >WebVTT segment would (re-)define its styles. That does not mean you have
>> >to maintain multiple tracks.
>> >> This will involve scripts, at which point handling multiple
>> >> WebVTT tracks will compare unfavorably with just using a WebSocket
>> >> connection to deliver cues and style using a custom syntax.
>> >Maybe in some cases the WebSocket approach can be useful, but there are
>> >other issues as well, like caching.
>> >
>> >--
>> >Cyril Concolato
>> >Multimedia Group / Telecom ParisTech
>> >http://concolato.wp.mines-telecom.fr/
>> >@cconcolato
>> >
>> >
>>
>
>
> --
> Cyril Concolato
> Multimedia Group / Telecom ParisTech
> http://concolato.wp.mines-telecom.fr/
> @cconcolato
>
Received on Friday, 23 October 2015 10:07:50 UTC