Re: chunked IMSC1 from Cyril Concolato on 2022-03-18 (public-tt@w3.org from March 2022)

From: Cyril Concolato <cconcolato@netflix.com>
Date: Fri, 18 Mar 2022 08:15:35 -0700
To: Michael Dolan <mike@dolan.tv>
Cc: Nigel Megitt <nigel.megitt@bbc.co.uk>, Glenn Adams <glenn@skynav.com>, "public-tt@w3.org" <public-tt@w3.org>
Message-ID: <CAMiyXwBiUqerq52xp36Y4_PvY32sQNtPy4JT2verzHHFbVXkPA@mail.gmail.com>
On Fri, Mar 18, 2022 at 7:32 AM Michael Dolan <mike@dolan.tv> wrote:

> Hi Nigel,
>
>
>
> Unlike TTML, video and audio do not need to wait for a full segment before
> packets are sent; the decoder can start to decode and present them long
> before the whole segment arrives.  Video and audio are “chunked” today and
> transmitted long before the entire segment has been encoded.
>
>
>
> TTML cannot have more than one sample/segment (14496-30).
>
ISO/IEC 14496-30 is about the overall carriage of TTML in MP4. It does not
impose any application-specific constraints; in particular, it does not
constrain segmented media. CMAF talks about segmented media, but I don't
recall any such restriction. Maybe DASH-IF has such a restriction?

That said, the general solutions for progressively downloading XML are well
known and applicable to TTML. In particular, if your TTML parser is a
SAX-based, progressive parser and the document is authored with proper
constraints (i.e. document order matches time order), you could use HTTP
chunked transfer (e.g. mapping one HTTP chunk to a <p> or a <div>) of TTML.
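
To make that concrete, here is a minimal sketch using the incremental
feed() interface of Python's xml.sax. It assumes each HTTP chunk has already
been unframed into a bytes object and that every <p> carries its own begin
and end attributes. CueHandler, parse_chunks and the demo data are
hypothetical names for illustration only, not part of any TTML or IMSC
tooling.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

TTML_NS = "http://www.w3.org/ns/ttml"


class CueHandler(ContentHandler):
    """Records each <p> once its end tag has been parsed."""

    def __init__(self):
        super().__init__()
        self.cues = []        # completed (begin, end, text) tuples
        self._current = None  # the <p> currently being read, if any

    def startElementNS(self, name, qname, attrs):
        if name == (TTML_NS, "p"):
            self._current = {
                "begin": attrs.get((None, "begin")),
                "end": attrs.get((None, "end")),
                "text": [],
            }

    def characters(self, content):
        if self._current is not None:
            self._current["text"].append(content)

    def endElementNS(self, name, qname):
        if name == (TTML_NS, "p") and self._current is not None:
            cue = self._current
            self.cues.append((cue["begin"], cue["end"], "".join(cue["text"])))
            self._current = None


def parse_chunks(chunks):
    """Feed chunks to a SAX parser and yield cues as they become available."""
    handler = CueHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    emitted = 0
    for chunk in chunks:
        parser.feed(chunk)
        while emitted < len(handler.cues):
            yield handler.cues[emitted]
            emitted += 1
    parser.close()  # raises SAXParseException if the document is incomplete
    yield from handler.cues[emitted:]


if __name__ == "__main__":
    demo_chunks = [
        b'<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en"><body><div>',
        b'<p begin="0s" end="1s">Hello</p>',
        b'<p begin="1s" end="2s">world</p>',
        b'</div></body></tt>',
    ]
    for begin, end, text in parse_chunks(demo_chunks):
        print(begin, end, text)
```

Aligning chunk boundaries with element boundaries, as in the demo data,
means no partially received tag is left sitting in the parser buffer between
chunks. The sketch only extracts timing and text; a real IMSC processor
would also need styles, regions and inherited timing.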

HTH,
Cyril


>
>
> The minimum CMAF Segment duration (960ms) applies to all codecs.  And
> changing it would increase the coding overhead. The goal is longer, more
> efficient segments that are chunked, not shorter, less efficient ones.
>
>
>
>               Mike
>
>
>
> *From:* Nigel Megitt <nigel.megitt@bbc.co.uk>
> *Sent:* Friday, March 18, 2022 7:19 AM
> *To:* Michael Dolan <mike@dolan.tv>; Glenn Adams <glenn@skynav.com>
> *Cc:* public-tt@w3.org
> *Subject:* Re: chunked IMSC1
>
>
>
> Hi Mike,
>
>
>
> In these low latency scenarios, what is the expected latency for encoding
> the video? Might it be possible to send the TTML for the whole segment
> before all the video has been encoded, and thus work around the “holding
> back” problem?
>
>
>
> We demonstrated with EBU-TT Live that, given an appropriate carriage
> mechanism, it is possible to send real time updates that are whole TTML
> documents. In the case of EBU-TT Live we were typically sending whole
> (short) documents at arbitrary times, corresponding to changes of
> presentation, but it would work equally well to send documents at
> predetermined intervals. If the issue is CMAF minimum segment durations,
> another solution might be to construct each CMAF TTML Segment out of
> multiple samples, where each sample is a whole document. Or change the CMAF
> minimum segment duration, of course.
>
>
>
> Glenn, when you referenced EXI, were you talking particularly about EXI
> *streaming*, rather than EXI as a mechanism for compression?
>
>
>
> Nigel
>
>
>
>
>
>
>
> *From: *Michael Dolan <mike@dolan.tv>
> *Date: *Friday, 18 March 2022 at 13:50
> *To: *Glenn Adams <glenn@skynav.com>, Nigel Megitt <nigel.megitt@bbc.co.uk>
> *Cc: *"public-tt@w3.org" <public-tt@w3.org>
> *Subject: *RE: chunked IMSC1
>
>
>
> Hi Nigel and Glenn,
>
>
>
> This use case is for the live, low latency (LLL) scenario.  There is no
> reason to do it in VoD scenarios that TTML was designed for.
>
>
>
> The problem is that, unlike video and audio, “normal TTML” cannot *begin*
> to be decoded until after encoding and reception of the entire
> segment/document.  This means that video and audio segments must be “held
> back” for the segment duration so that decoding and presentation
> remain in sync. Given that LLL applications expect on the order of 500ms
> delay at worst, this just doesn’t work, especially when segments (e.g. CMAF
> segments) are constrained to >960ms.  Decoding and presentation must
> necessarily begin with a partial segment, as they do for video and audio.
>
>
>
> Yes, a solution will likely require slightly special encoding and decoding
> processing and perhaps a constrained vocabulary (e.g. no <set>), although I
> would not postulate the complexity or issues at this time. That would be
> for further study.
>
>
>
> The alternative is frankly not to use TTML.
>
>
>
>               Mike
>
>
>
> *From:* Glenn Adams <glenn@skynav.com>
> *Sent:* Friday, March 18, 2022 6:33 AM
> *To:* Michael Dolan <mike@dolan.tv>
> *Cc:* public-tt@w3.org
> *Subject:* Re: chunked IMSC1
>
>
>
> Neither IMSC nor TTML requirements explicitly address this use case. Both
> operate on "document instances" as their input and require such instances
> to be well-formed (in an XML sense).
>
>
>
> To do what you suggest, it would be necessary to progressively reparse a
> dynamically updated concretely encoded document instance, and only proceed
> with subsequent processing (as an XML infoset) for parses that proved well
> formed. This might require an underlying buffering layer to append a
> temporary postfix to the buffer prior to each parse attempt in order to
> supply missing close tags.
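
The buffering layer described above could be sketched roughly as follows,
assuming the open-element skeleton is fixed and known in advance so that a
constant closing postfix is enough. ReparsingBuffer, cue_texts and
CLOSING_POSTFIX are hypothetical names, and a real implementation would
probably have to track the open elements to compute the postfix.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
# Assumption: every document is <tt><body><div> ... </div></body></tt>.
CLOSING_POSTFIX = b"</div></body></tt>"


class ReparsingBuffer:
    """Accumulates bytes and reparses the whole buffer on each append."""

    def __init__(self, postfix=CLOSING_POSTFIX):
        self._data = bytearray()
        self._postfix = postfix

    def append(self, data):
        """Add bytes; return a parsed root element, or None if the buffer
        is not well formed even with the temporary postfix."""
        self._data += data
        current = bytes(self._data)
        # Try the buffer as-is first (the document may already be complete),
        # then with the temporary postfix supplying the missing close tags.
        for candidate in (current, current + self._postfix):
            try:
                return ET.fromstring(candidate)
            except ET.ParseError:
                continue
        return None


def cue_texts(root):
    """Return the text of every <p> in document order."""
    return [p.text for p in root.iter("{%s}p" % TTML_NS)]
```

Only parses that succeed are handed on for further processing; a buffer
that ends in the middle of a <p> yields None until more bytes arrive. Note
that this reparses everything received so far on every append, so the cost
grows with document length.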
>
>
>
> If I were designing such a system, I would first evaluate the potential
> use of EXI <https://www.w3.org/TR/exi/>.
>
>
>
>
>
> On Thu, Mar 17, 2022 at 2:34 PM Michael Dolan <mike@dolan.tv> wrote:
>
> All,
>
>
>
> Has anyone thought about this lately?  In the TTML1 days we pondered it a
> bit.  By “chunked” I mean in the HTTP sense.  That is, an IMSC1 document
> could be broken into pieces and delivered a few bytes at a time, say every
> 500ms.  There are low latency use cases that need this sort of delivery
> where decode and presentation continues throughout a certain period. Today,
> when building ISO BMFF segments, one has to gather up all text over several
> seconds, create a well-formed document, and then deliver that well-formed
> document before presentation can begin. This inserts a delay relative to
> how 608 and Teletext work. And, it inserts the same delay into the video
> and audio as well – that is, the decoder cannot start decoding video and
> audio until the text is ready to go.
>
>
>
> Even if no one has pondered this lately, is there interest?
>
>
>
> Regards,
>
>               Mike
>
>
>
> *"keep calm and carry on"*
>
> -----------------------
>
> Michael DOLAN
>
> TBT Inc
>
> Del Mar, CA USA
>
> +1-858-882-7497 (mobile)
>
>
>
>
Received on Friday, 18 March 2022 15:17:00 UTC