{Minutes} TTWG Meeting 2021-06-24

Thanks all for attending today's TTWG meeting. Minutes can be found in HTML format at https://www.w3.org/2021/06/24-tt-minutes.html


In text format:

   [1]W3C

      [1] https://www.w3.org/


                Timed Text Working Group Teleconference

24 June 2021

   [2]Previous meeting. [3]Agenda. [4]IRC log.

      [2] https://www.w3.org/2021/06/10-tt-minutes.html

      [3] https://github.com/w3c/ttwg/issues/188

      [4] https://www.w3.org/2021/06/24-tt-irc


Attendees

   Present
          Andreas, Atsushi, Chris_Needham, Gary, Glenn, Nigel,
          Pierre

   Regrets
          Cyril

   Chair
          Nigel

   Scribe
          cpn, nigel

Contents

    1. [5]This meeting
    2. [6]How live delivery is handled in TTML/IMSC
    3. [7]Shear calculations and origin of coordinate system.
       w3c/ttml2#1199
    4. [8]Clarify if the first ISD must/may be constructed when
       empty w3c/ttml2#1232
    5. [9]Meeting close

Meeting minutes

  This meeting

   Nigel: Today, we have a topic Gary requested, about handling
   live delivery of TTML.
   … We also have 2 issues on TTML2, which perhaps we can make
   progress on.
   … I have kept the IMSC HRM issue about spans on the agenda in
   case there is anything to discuss.

   Pierre: On the HRM thing, I haven't made much progress but I
   think we should take 10 minutes to talk about strategy.
   … How do we propagate HRM changes through IMSC 1.0.1, 1.1 and
   1.2?
   … Rather than going through the issues themselves.

   Nigel: Ok, good idea
   … Then we have an IMSC Test issue/pull request.
   … In AOB we have TPAC 2021.
   … Any other points to discuss or make sure we cover?

   group: [none]

  How live delivery is handled in TTML/IMSC

   Nigel: This was asked by Gary - perhaps we should work out if
   there is a quick answer or if we need a longer session.

   Gary: This comes from the unbounded cue discussion last week.
   … Partly because of your mention Nigel, one of the issues is
   how unbounded cues work with live captioning.
   … There is segmented captioning happening with WebVTT and HLS,
   also done for live. I have an understanding of how that works,
   … but not how TTML and IMSC do live. I suspect it might be
   similar.
   … Figured it would be similar, but wanted a sense of it so that
   if there are any specific
   … problems then we don't repeat the same issues.

   Nigel: The first thing to note is that live delivery is an
   application layer on top of TTML
   … You probably need constructs for streaming delivery around
   TTML, provided by other things
   … I know of 4 wrappers around TTML that do this: MPEG-2, MP4
   DASH/HLS streaming, RTP, and the EBU-TT Live extensions
   … They all provide some sort of time windowing of the TTML
   document, and they send a sequence of TTML documents
   … The only one that doesn't use a wrapper is EBU-TT Live
   … The wrapper defines a time window with a beginning and a
   duration that signals: from this point in time onwards, this
   single TTML document is active
   … As a consequence of that, any previously active TTML document
   is no longer active
   … The last piece of the puzzle is to align the timelines
   between the TTML payload content and the external timing
   … There are established ways to do that, defined in the wrapper
   … There could be an external timeline, e.g., an epoch such as 1
   Jan 1970 and times are relative to that. Or each TTML document
   starts at time zero relative to when document playback begins
   … The question then is how to do it for live? Live captioning
   is captioning of an audio stream that also has a separate
   mechanism for encoding, packaging, and distribution
   … The requirement, from an audience perspective, is to get the
   captions in a way that's aligned with the audio. They could be
   a little bit late, or co-timed with server side delays
   … We generally don't have completely unbounded stuff. It is not
   possible to issue documents for encoding, packaging,
   distribution until you get to the end of the time it applies to
   … You could have a single subtitle that begins at some time,
   with no end time. It would have no end or dur attribute in the
   document
   … The application semantics would say that the document stops
   being active and a new document becomes active
   … If the TTML document appears unbounded in that case, the
   application applies a bound
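
   Note: the windowing behaviour described above can be sketched
   in Python as follows. This is an illustration only and is not
   taken from any of the wrapper specifications. The names Cue,
   Window and resolve are hypothetical, and the sketch assumes
   that each document's time zero is aligned to the start of its
   window, which is only one of the two alignment options
   mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cue:
    text: str
    begin: float                 # seconds on the document timeline
    end: Optional[float] = None  # None models an omitted end/dur attribute


@dataclass
class Window:
    begin: float                 # activation time signalled by the wrapper
    duration: Optional[float]    # None: active until the next document
    cues: List[Cue]


def resolve(windows: List[Window]) -> List[Tuple[str, float, Optional[float]]]:
    """Return (text, begin, end) on the external timeline.

    Only one document is active at a time: activating a new window
    ends the previous one, and the window bounds any open-ended cue.
    """
    ordered = sorted(windows, key=lambda w: w.begin)
    resolved = []
    for i, w in enumerate(ordered):
        # End of the window: its own duration, cut short by the next window.
        w_end = None if w.duration is None else w.begin + w.duration
        if i + 1 < len(ordered):
            next_begin = ordered[i + 1].begin
            w_end = next_begin if w_end is None else min(w_end, next_begin)
        for c in w.cues:
            # Assumption: document time zero equals the window begin.
            begin = w.begin + c.begin
            end = None if c.end is None else w.begin + c.end
            if w_end is not None:
                # The application applies the bound the document lacks.
                end = w_end if end is None else min(end, w_end)
            if end is None or begin < end:
                resolved.append((c.text, begin, end))
    return resolved


docs = [
    Window(begin=0.0, duration=None, cues=[Cue("Hello", 1.0)]),
    Window(begin=4.0, duration=2.0, cues=[Cue("world", 0.5, 10.0)]),
]
for item in resolve(docs):
    print(item)
```

   In this sketch a cue with no end takes its end from the wrapper
   where the wrapper supplies one, and stays open otherwise.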

   Pierre: The way modern packaging and streaming formats work,
   the playlist is at a higher level than the TTML document. The
   playlist sets the bounds

   Gary: That's helpful. That aligns with what I figured, which
   also applies to segmented WebVTT
   … A question would be: do unbounded cues make sense for TTML
   too?

   Nigel: If you want a semantic that says within the document
   there's no end time for an element, it can already be done.
   Simply omitting the end time does that

   Gary: The end time is defined by how it's delivered to the
   application. A cue with no end time or duration would be
   forcefully bounded by the media segment it's embedded in, e.g.,
   MP4

   Nigel: Yes
   … The last document stays active until you activate another
   one. In a segmented MP4 context (DASH), it's generally
   predefined when each segment ends / what the segment durations
   will be
   Pierre: I've not seen a requirement for unbounded cues yet

   Gary: We'll continue with WebVTT on the use cases and deriving
   requirements. If it makes sense for live captioning to have
   unbounded cues in WebVTT, we could maybe also talk about
   application in TTML
   … It's still early. Not clear that having unbounded cues is a
   requirement we want to proceed with

   Nigel: Are we asking the wrong question? The conversation about
   bounded/unbounded cues starts from an assumption that a cue is
   a semantic object in its own right that a user can interact
   with
   … In the schemes I've talked about, there's not a requirement
   to semantically identify a single piece of content as it
   changes over time
   … Instead, the focus is on delivering the right presentation by
   delivering the documents
   … If a client wants to do some analysis to identify duplicates
   (for example), it's up to the application
   … Having one thing that gets updated is a different semantic
   model

   Pierre: People talk about subtitle cues, there's a good mapping
   with pop-on captions, e.g., on a DVD. It breaks down when you
   talk about progressive subtitling, where words appear
   additively, paint-on
   … Or where lines appear in the same region. In those scenarios
   the concept of a subtitle doesn't work at all
   … There's no such thing as "a subtitle". The TTML model is text
   flowed in a region

   Gary: I think that for the document that Chris started we're
   trying to separate high level use cases (e.g., live captioning)
   from requirements - e.g., create unbounded cues so we can
   deliver earlier
   … So we want the right use cases. The individual cues are less
   important than knowing that we're capturing spoken word

   Nigel: Have we answered the question?

   Gary: It does for me

   Chris: The question I have is around other kinds of metadata.
   In WebVTT I think it's possible
   … that you can annotate chapter points, or denote segmentation
   of the content.
   … In that case if you're starting a new chapter, which says
   "this news segment just started"
   … and it starts now and we don't know the end time,
   … is there any equivalent model in TTML for that kind of use
   case?

   Pierre: Yes, absolutely. It's possible in TTML to specify that
   an element has an undefined end time.

   Chris: And then it becomes application specific how to
   interpret that.

   Pierre: What to do with it. The interpretation in TTML is
   pretty unambiguous.
   … Then do you leave it undefined or clip the presentation to
   some value.

   Nigel: There's nothing to stop you adding your own metadata to
   an element, e.g., to indicate which chapter you're in
   … That segmentation applies to content rather than there being
   some other "thing" that has start or end time signalled

   Pierre: As I understand it, WebVTT came from SRT, which came
   from DVD subtitles. That's a really specific use case,
   subtitles for translation. It's all pop-on

   Gary: It has other mechanisms

   Pierre: People say "a cue" or "a subtitle", a model that only
   works if it's pop-on subtitles or captions

   Nigel: Any other questions?

   Chris: No

  Shear calculations and origin of coordinate system. w3c/ttml2#1199

   Glenn: Status update on what I've been doing.
   … We recently finalised our implementation of line shear and
   block shear (tts:shear)
   … in the TTPE package. It's checked into a branch right now,
   possibly merged into the main branch.
   … We were able to verify the correct origin and orientation of
   the axes for both line shear and shear
   … in all of the writing modes in combination with different
   default paragraph level bidi levels.
   … That looks good.
   … One of the things we wanted to do was to resolve an issue
   Nigel had brought up
   … regarding processing of tts:shear semantics because in order
   to compute the adjustment
   … to the inline progression dimension (ipd) for doing line
   breaking, it is necessary to know
   … the value of the block progression dimension (bpd) that will
   be used for that adjustment.
   … The value of bpd may depend on having performed line
   breaking, so there is a potential
   … recursive process to resolve what the value of the bpd might
   be.
   … However after analysing the TTML specification semantics we
   realised that
   … bpd on a block area such as a paragraph is always defined in
   the sense that it has an initial value
   … which is auto, and at the present time, auto is defined such
   that it maps to 100%,
   … which means that the container area bpd in which the
   paragraph will be fitted constrains the
   … maximum value of the bpd, and in fact fixes it, because in
   all cases we can map
   … that back up to some region which is definite in its height
   and width and therefore bounded.
   … The long and short of it is that bpd = auto = 100% = bpd of
   the container area constrained by region size.
   … It can be no larger than the bpd of the region in which that
   p is placed.
   … The default semantics for doing shear calculation of the ipd
   can be determined ahead
   … of time when bpd = auto.
   … If bpd is set to some other value, e.g. an explicit length,
   or minContent, maxContent or fitContent,
   … which are defined in TTML2 but not used in IMSC, then other
   processes can be used to
   … determine the value of bpd, which is then plugged into the
   shear calculation to get the adjustment to ipd.
   … We were able to verify that and check that into our codebase,
   … and have entered it into the implementation report as having
   been implemented.
   … We added an expectation file in ttml2-tests for the TTPE
   output, so we
   … view that as having been resolved.
   … The next step is to update the spec as necessary.
   … Cyril has mentioned a couple:
   … Change sin theta to tan theta.
   … Add information about the origin and orientation of the axes
   for the purpose of performing the skew transformation.
   … I've started work on creating that update.
   … I plan to generate some SVG visuals that can go into the spec
   that
   … show the origin and axis for the different writing mode
   combinations wrt the paragraph directionality.
   … I expect that in the next few weeks.
   … We're trying to get all of the TTML2 tests implemented and
   checked into TTPE so that we have
   … resolved any issues in the tests and that will allow us to at
   least have one implementation
   … of every test that is listed in the implementation report.
   … Right now there are 3 tests left for us to complete, which
   should take 2-4 weeks approximately.

   Nigel: Thank you Glenn. Any questions?

   SUMMARY: @skynavga Glenn to continue working on specification
   pull request.
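
   Note: the dependency Glenn describes, where the ipd adjustment
   needs a known bpd, can be sketched in Python as follows. This
   is an illustration under stated assumptions, not the TTPE
   implementation or the specification text. It assumes the
   adjustment is a reduction of the available ipd by
   bpd × |tan θ|, using the tan θ form mentioned above. The
   function names are hypothetical, and the shear angle is passed
   in directly because the mapping from a tts:shear value to an
   angle is not stated in these minutes.

```python
import math
from typing import Optional


def resolve_bpd(specified_bpd: Optional[float], container_bpd: float) -> float:
    """Resolve the paragraph bpd before any line breaking.

    None stands for bpd = auto, which per the discussion maps to 100%
    of the container bpd and is therefore known ahead of time.
    An explicit length is returned as given; the content-based values
    (minContent, maxContent, fitContent) are not handled here.
    """
    return container_bpd if specified_bpd is None else specified_bpd


def shear_ipd_adjustment(bpd: float, shear_angle_deg: float) -> float:
    """Assumed adjustment: extent along ipd taken up by shearing a block of
    the given bpd by the given angle, computed as bpd * |tan(theta)|."""
    return bpd * abs(math.tan(math.radians(shear_angle_deg)))


def available_ipd(container_ipd: float, bpd: float, shear_angle_deg: float) -> float:
    """ipd left for line breaking once the assumed adjustment is applied."""
    return max(0.0, container_ipd - shear_ipd_adjustment(bpd, shear_angle_deg))


# Region of 400 x 100 (ipd x bpd), paragraph with bpd = auto.
bpd = resolve_bpd(None, container_bpd=100.0)
print(available_ipd(400.0, bpd, shear_angle_deg=15.0))
```

   Because bpd = auto resolves from the container alone, the
   adjustment in this sketch is computed without running line
   breaking first, which is the point made in the discussion.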

  Clarify if the first ISD must/may be constructed when empty
  w3c/ttml2#1232

   github: [10]https://github.com/w3c/ttml2/issues/1232


     [10] https://github.com/w3c/ttml2/issues/1232


   Glenn: I added a comment to the PR

   [11]Comment

     [11] https://github.com/w3c/ttml2/pull/1233#discussion_r650411506


   Glenn: pointing out that there is already text in the TTML
   element that makes the equivalence between
   … active document interval and root temporal extent. We already
   have established that,
   … it is just that this particular instance in this procedure
   should have the consistent language.
   … It is not introducing anything new or different in my
   opinion.
   … I'd like to see that move forward.

   Nigel: Am I correct that you're not happy with that Pierre?

   Pierre: If we are going to make that change we should
   rationalise the terms across the document
   … and really get to the bottom of what the term root temporal
   extent means.
   … I don't think we should make this change piecemeal.

   Glenn: I think this started because the wording "active
   document duration" appears and it is the only place where it
   appears
   … exactly like that. The intent here is simply to resolve that
   one issue.
   … It is clear that's what is meant here.

   Pierre: I don't think it is clear.
   … The term that has been used has been duration, now we're
   replacing it with extent.
   … I would like to know what root temporal extent means.

   Glenn: That boat has sailed.

   Pierre: I don't know, it's been ambiguous and we should say
   what it does.
   … It is not defined in the document, we're trying to clarify
   it.

   Glenn: Root temporal extent is defined as a term.

   Pierre: It is a circular definition. If we're clarifying it, we
   should say what it means or does.

   Glenn: The intent of this change is not to modify the
   definition of root temporal extent.

   Pierre: It actually changes the interpretation though.
   … My suggestion is to go back and rationalise what root
   temporal extent means.
   … We should not make piecemeal changes.

   Glenn: I find that quite interesting and wouldn't discourage
   anyone from undertaking such a project.
   … This particular issue is not predicated on reviewing the
   definition of root temporal extent.
   … If you think it is true I would like to see the argument.

   Nigel: This has been discussed before. It would be good to
   explain why this procedure depends on the term
   … root temporal extent and defines it, which is circular.

   Pierre: The [scribe missed this - apologies]

   Glenn: The root temporal extent is defined by the document
   processing context.

   Pierre: It's never clear to me how there can be an implicit
   duration but no implicit begin and end.

   Glenn: This goes back to the semantics of SMIL which make use
   of the term implicit duration in a highly technical manner.
   … We have used that definition in the context of TTML.
   … SMIL does not (I don't recall) define an implicit begin or
   end and we did not do that.
   … That sounds like a new work item/requirement that is not on
   our docket right now.
   … I think it is inappropriate to slip it into this PR - it may
   be an interesting question and possibly elaborate that
   … more in the definition of root temporal extent. But it is
   clear in the current language that we have
   … an equivalence statement in the specification of the tt
   element, so what this change proposes is simply
   … to make that usage consistent within the document because we
   had a case in the timing
   … semantics that did not define that properly.

   Pierre: By the way SMIL does define implicit end and implicit
   begin.

   Glenn: Thank you

   Pierre: Do they apply here?

   Glenn: That's outside the scope of this PR in my opinion.

   Pierre: That's my point, if we are tweaking or capturing the
   original intent of root temporal extent then we have
   … to get to the bottom of this.
   … My interest here is that there has been confusion here about
   what the active duration
   … of a TTML document is, if you try to render a document
   outside its active duration.

   Glenn: Durations have a fixed usage in TTML and SMIL that is
   independent of the begin and end points.
   … If you can resolve the begin and end then the difference is
   the active duration.
   … I still fail to see how you can interpret the current PR as
   an attempt to redefine the root temporal extent,
   … especially as we already have the statement that makes that
   equivalence.
   … If the phrases are different from the intended meaning in
   resolve timing, then I don't know what else it could be.
   … "Active time duration" sounds like a shorthand for that tt
   element definition.
   … So this change seems to make this more consistent rather than
   less so.

   Nigel: By the way that is my position as well.

   Glenn: If you think this is redefining root temporal extent I
   would like to see the argument for that.
   … It is not the intent, and if it were true then we would have
   to revisit the language in the tt element as well, which is
   … not in the scope of this issue.
   … I have no objection to revisiting and trying to fine tune the
   use of the term root temporal extent.

   Nigel: Thank you, we're running out of time. Anyone else have
   anything to add on this?
   … [no] - we need to work out a way to resolve.
   … I brought this to the group to try to work out how to get to
   consensus on the PR.

   Pierre: Maybe we're closer than you think - remove the note,
   and take the "i.e." out, but ultimately the
   … root temporal extent is application specified.

   Nigel: Thank you, please could you comment on the pull request
   so we can end the call?

   Pierre: Happy to stay on and discuss further if you have time.

   SUMMARY: Nigel, Pierre and Glenn to continue discussions.

  Meeting close

   Nigel: Thanks everyone, let's adjourn for today. [adjourns
   meeting]


    Minutes manually created (not a transcript), formatted by
    [12]scribe.perl version 136 (Thu May 27 13:50:24 2021 UTC).

     [12] https://w3c.github.io/scribe2/scribedoc.html

Received on Thursday, 24 June 2021 17:29:07 UTC