- From: Nigel Megitt <nigel.megitt@bbc.co.uk>
- Date: Thu, 24 Jun 2021 17:28:28 +0000
- To: TTWG <public-tt@w3.org>
- Message-ID: <1733F40B-6CC3-4E60-A717-2C2B2DE69DF4@bbc.co.uk>
Thanks all for attending today's TTWG meeting. Minutes can be found in HTML format at https://www.w3.org/2021/06/24-tt-minutes.html
In text format:
[1]W3C
[1] https://www.w3.org/
Timed Text Working Group Teleconference
24 June 2021
[2]Previous meeting. [3]Agenda. [4]IRC log.
[2] https://www.w3.org/2021/06/10-tt-minutes.html
[3] https://github.com/w3c/ttwg/issues/188
[4] https://www.w3.org/2021/06/24-tt-irc
Attendees
Present
Andreas, Atsushi, Chris_Needham, Gary, Glenn, Nigel,
Pierre
Regrets
Cyril
Chair
Nigel
Scribe
cpn, nigel
Contents
1. [5]This meeting
2. [6]How live delivery is handled in TTML/IMSC
3. [7]Shear calculations and origin of coordinate system.
w3c/ttml2#1199
4. [8]Clarify if the first ISD must/may be constructed when
empty w3c/ttml2#1232
5. [9]Meeting close
Meeting minutes
This meeting
Nigel: Today, we have a topic Gary requested, about handling
live delivery of TTML.
… We also have 2 issues on TTML2, which perhaps we can make
progress on.
… I have kept the IMSC HRM issue about spans on the agenda in
case there is anything to discuss.
Pierre: On the HRM thing, I haven't made much progress but I
think we should take 10 minutes to talk about strategy.
… How do we propagate HRM changes through IMSC 1.0.1, 1.1 and
1.2?
… Rather than going through the issues themselves.
Nigel: Ok, good idea
… Then we have an IMSC Test issue/pull request.
… In AOB we have TPAC 2021.
… Any other points to discuss or make sure we cover?
group: [none]
How live delivery is handled in TTML/IMSC
Nigel: This was asked by Gary - perhaps we should work out if
there is a quick answer or if we need a longer session.
Gary: This comes from the unbounded cue discussion last week.
… Partly because of your mention Nigel, one of the issues is
how unbounded cues work with live captioning.
… There is segmented captioning happening with WebVTT and HLS,
also done for live. I have an understanding of how that works,
… but not how TTML and IMSC do live. I suspect it might be
similar.
… Figured it would be similar, but wanted a sense of it so that
if there are any specific
… problems then we don't repeat the same issues.
Nigel: The first thing to note with live delivery is that it's an
application layer on top of TTML
… You probably need constructs for streaming delivery around
TTML, provided by other things
… I know of 4 wrappers around TTML that do this: MPEG-2, MP4
DASH/HLS streaming, RTP, and the EBU-TT Live extensions
… They all provide some sort of time windowing of the TTML
document, and they send a sequence of TTML documents
… The only one that doesn't use a wrapper is EBU-TT Live
… The wrapper defines a time window with a beginning and a
duration that signals: from this point in time onwards, this
single TTML document is active
… As a consequence of that, any previously active TTML document
is no longer active
… The last piece of the puzzle is to align the timelines
between the TTML payload content and the external timing
… There are established ways to do that, defined in the wrapper
… There could be an external timeline, e.g., an epoch such as 1
Jan 1970 and times are relative to that. Or each TTML document
starts at time zero relative to when document playback begins
… The question then is how to do it for live? Live captioning
is captioning of an audio stream that also has a separate
mechanism for encoding, packaging, and distribution
… The requirement, from an audience perspective, is to get the
captions in a way that's aligned with the audio. They could be
a little bit late, or co-timed with server side delays
… We generally don't have completely unbounded stuff. It is not
possible to issue documents for encoding, packaging,
distribution until you get to the end of the time it applies to
… You could have a single subtitle that begins at some time,
with no end time. It would have no end or dur attribute in the
document
… The application semantics would say that the document stops
being active and a new document becomes active
… If the TTML document appears unbounded in that case, the
application applies a bound
Pierre: The way modern packaging and streaming formats work,
the playlist is at a higher level than the TTML document. The
playlist sets the bounds
Gary: That's helpful. That aligns with what I figured, which
also applies to segmented WebVTT
… A question would be: do unbounded cues make sense for TTML
too?
Nigel: If you want a semantic that says within the document
there's no end time for an element, it can already be done.
Simply omitting the end time does that
Gary: The end time is defined by how it's delivered to the
application. A cue with no end time or duration would be
forcefully bounded by the media segment it's embedded in, e.g.,
MP4
Nigel: Yes
… The last document stays active until you activate another
one. In a segmented MP4 context (DASH), it's generally
predefined when segments end / what segment durations will be
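The windowing behaviour described above can be illustrated with a minimal Python sketch. It is not taken from any of the wrappers mentioned; the names `Cue`, `Window`, `clip_to_window` and `active_window` are hypothetical, times are assumed to be seconds on one shared timeline, and segments are assumed to be contiguous with known durations.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Cue:
    begin: float                 # seconds on the shared timeline (assumption)
    end: Optional[float] = None  # None models an element with no end or dur


@dataclass
class Window:
    begin: float     # wrapper-signalled time at which the document becomes active
    duration: float  # wrapper-signalled length of the active period
    cues: List[Cue] = field(default_factory=list)

    @property
    def end(self) -> float:
        return self.begin + self.duration


def clip_to_window(cue: Cue, window: Window) -> Optional[Tuple[float, float]]:
    """Bound a cue by its wrapper window; an unbounded cue ends with the window."""
    begin = max(cue.begin, window.begin)
    end = window.end if cue.end is None else min(cue.end, window.end)
    return (begin, end) if begin < end else None


def active_window(windows: List[Window], t: float) -> Optional[Window]:
    """Return the window whose interval [begin, end) contains time t, if any."""
    for w in windows:
        if w.begin <= t < w.end:
            return w
    return None
```

For example, with a window covering 10 s to 14 s, `clip_to_window(Cue(11.0), window)` yields `(11.0, 14.0)`: the document itself states no end time, and the bound comes entirely from the wrapper.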
Pierre: I've not seen a requirement for unbounded cues yet
Gary: We'll continue with WebVTT on the use cases and deriving
requirements. If it makes sense for live captioning to have
unbounded cues in WebVTT, we could maybe also talk about
application in TTML
… It's still early. Not clear that having unbounded cues is a
requirement we want to proceed with
Nigel: Are we asking the wrong question? The conversation about
bounded/unbounded cues starts from an assumption that a cue is
a semantic object in its own right that a user can interact
with
… In the schemes I've talked about, there's not a requirement
to semantically identify a single piece of content as it
changes over time
… Instead, the focus is on delivering the right presentation by
delivering the documents
… If a client wants to do some analysis to identify duplicates
(for example), it's up to the application
… Having one thing that gets updated is a different semantic
model
Pierre: People talk about subtitle cues, there's a good mapping
with pop-on captions, e.g., on a DVD. It breaks down when you
talk about progressive subtitling, where words appear
additively, paint-on
… Or where lines appear in the same region. In those scenarios
the concept of a subtitle doesn't work at all
… There's no such thing as "a subtitle". The TTML model is text
flowed in a region
Gary: I think that for the document that Chris started we're
trying to separate high level use cases (e.g., live captioning)
from requirements - e.g., create unbounded cue so we can
deliver earlier, for example
… So we want the right use cases. The individual cues are less
important than knowing that we're capturing spoken word
Nigel: Have we answered the question?
Gary: It does for me
Chris: The question I have is around other kinds of metadata.
In WebVTT I think it's possible
… that you can annotate chapter points, or denote segmentation
of the content.
… In that case if you're starting a new chapter, which says
"this news segment just started"
… and it starts now and we don't know the end time,
… is there any equivalent model in TTML for that kind of use
case?
Pierre: Yes, absolutely. It's possible in TTML to specify that
an element has an undefined end time.
Chris: And then it becomes application specific how to
interpret that.
Pierre: What to do with it. The interpretation in TTML is
pretty unambiguous.
… Then do you leave it undefined or clip the presentation to
some value?
Nigel: There's nothing to stop you adding your own metadata to
an element, e.g., to indicate which chapter you're in
… That segmentation applies to content rather than there being
some other "thing" that has start or end time signalled
Pierre: As I understand it, WebVTT came from SRT, which came
from DVD subtitles. That's a really specific use case,
subtitles for translation. It's all pop-on
Gary: It has other mechanisms
Pierre: People say "a cue" or "a subtitle", a model that only
works if it's pop-on subtitles or captions
Nigel: Any other questions?
Chris: No
Shear calculations and origin of coordinate system. w3c/ttml2#1199
Glenn: Status update on what I've been doing.
… We recently finalised our implementation of line shear and
block shear (tts:shear)
… in the TTPE package. It's checked into a branch right now,
possibly merged into the main branch.
… We were able to verify the correct origin and orientation of
the axes for both line shear and shear
… in all of the writing modes in combination with different
default paragraph level bidi levels.
… That looks good.
… One of the things we wanted to do was to resolve an issue
Nigel had brought up
… regarding processing of tts:shear semantics because in order
to compute the adjustment
… to the inline progression dimension (ipd) for doing line
breaking, it is necessary to know
… the value of the block progression dimension (bpd) that will
be used for that adjustment.
… The value of bpd may depend on having performed line
breaking, so there is a potential
… recursive process to resolve what the value of the bpd might
be.
… However after analysing the TTML specification semantics we
realised that
… bpd on a block area such as a paragraph is always defined in
the sense that it has an initial value
… which is auto, and at the present time, auto is defined such
that it maps to 100%,
… which means that the container area bpd in which the
paragraph will be fitted constrains the
… maximum value of the bpd, and in fact fixes it, because in
all cases we can map
… that back up to some region which is definite in its height
and width and therefore bounded.
… The long and short of it is that bpd = auto = 100% = bpd of
the container area constrained by region size.
… It can be no larger than the bpd of the region in which that
p is placed.
… The default semantics for doing shear calculation of the ipd
can be determined ahead
… of time when bpd = auto.
… If bpd is set to some other value, e.g. an explicit length,
or minContent, maxContent or fitContent,
… which are defined in TTML2 but not used in IMSC, then other
processes can be used to
… determine the value of bpd, which is then plugged into the
shear calculation to get the adjustment to ipd.
… We were able to verify that and check that into our codebase,
… and have entered it into the implementation report as having
been implemented.
… We added an expectation file in ttml2-tests for the TTPE
output, so we
… view that as having been resolved.
… The next step is to update the spec as necessary.
… Cyril has mentioned a couple:
… Change sin theta to tan theta.
… Add information about the origin and orientation of the axes
for the purpose of performing the skew transformation.
… I've started work on creating that update.
… I plan to generate some SVG visuals that can go into the spec
that
… show the origin and axis for the different writing mode
combinations wrt the paragraph directionality.
… I expect that in the next few weeks.
… We're trying to get all of the TTML2 tests implemented and
checked into TTPE so that we have
… resolved any issues in the tests and that will allow us to at
least have one implementation
… of every test that is listed in the implementation report.
… Right now there are 3 tests left for us to complete, which
should take 2-4 weeks approximately.
Nigel: Thank you Glenn. Any questions?
SUMMARY: @skynavga Glenn to continue working on specification
pull request.
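As a rough illustration of the bpd resolution and shear adjustment Glenn described, here is a minimal Python sketch. It is not the TTPE implementation or the specification's algorithm. It assumes the ipd adjustment is the horizontal offset produced by skewing a block of height bpd through angle theta, that is bpd * tan(theta), and it takes the angle in degrees rather than a tts:shear percentage. The function names are hypothetical.

```python
import math


def resolve_bpd(specified, region_bpd: float) -> float:
    """Resolve a paragraph's bpd before line breaking.

    'auto' is treated as 100% of the container, bounded by the region,
    as described in the minutes. Explicit numeric lengths are returned
    as given. Content-based keywords need other processes, which this
    sketch does not attempt.
    """
    if specified == "auto":
        return region_bpd
    if isinstance(specified, (int, float)):
        return float(specified)
    raise ValueError(f"bpd value {specified!r} is not handled in this sketch")


def ipd_adjustment(bpd: float, shear_angle_deg: float) -> float:
    """Offset in the inline direction caused by shearing a block of size bpd."""
    return bpd * abs(math.tan(math.radians(shear_angle_deg)))


def available_ipd(region_ipd: float, bpd: float, shear_angle_deg: float) -> float:
    """Inline space left for line breaking once the shear offset is reserved."""
    return max(0.0, region_ipd - ipd_adjustment(bpd, shear_angle_deg))
```

Because bpd = auto resolves to the region's bpd, `available_ipd` can be computed before any line breaking takes place, which is what avoids the recursion raised in the issue.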
Clarify if the first ISD must/may be constructed when empty
w3c/ttml2#1232
github: [10]https://github.com/w3c/ttml2/issues/1232
[10] https://github.com/w3c/ttml2/issues/1232
Glenn: I added a comment to the PR
[11]Comment
[11] https://github.com/w3c/ttml2/pull/1233#discussion_r650411506
Glenn: pointing out that there is already text in the TTML
element that makes the equivalence between
… active document interval and root temporal extent. We already
have established that,
… it is just that this particular instance in this procedure
should have the consistent language.
… It is not introducing anything new or different in my
opinion.
… I'd like to see that move forward.
Nigel: Am I correct that you're not happy with that Pierre?
Pierre: If we are going to make that change we should
rationalise the terms across the document
… and really get to the bottom of what the term root temporal
extent means.
… I don't think we should make this change piecemeal.
Glenn: I think this started because the wording "active
document duration" appears and it is the only place where it
appears
… exactly like that. The intent here is simply to resolve that
one issue.
… It is clear that's what is meant here.
Pierre: I don't think it is clear.
… The term that has been used has been duration, now we're
replacing it with extent.
… I would like to know what root temporal extent means.
Glenn: That boat has sailed.
Pierre: I don't know, it's been ambiguous and we should say
what it does.
… It is not defined in the document, we're trying to clarify
it.
Glenn: Root temporal extent is defined as a term.
Pierre: It is a circular definition. If we're clarifying it, we
should say what it means or does.
Glenn: The intent of this change is not to modify the definition
of root temporal extent.
Pierre: It actually changes the interpretation though.
… My suggestion is to go back and rationalise what root temporal
extent means.
… We should not make piecemeal changes.
Glenn: I find that quite interesting and wouldn't discourage
anyone from undertaking such a project.
… This particular issue is not predicated on reviewing the
definition of root temporal extent.
… If you think it is true I would like to see the argument.
Nigel: This has been discussed before. It would be good to
explain why this procedure depends on the term
… root temporal extent and defines it, which is circular.
Pierre: The [scribe missed this - apologies]
Glenn: The root temporal extent is defined by the document
processing context.
Pierre: It's never clear to me how there can be an implicit
duration but no implicit begin and end.
Glenn: This goes back to the semantics of SMIL which make use
of the term implicit duration in a highly technical manner.
… We have used that definition in the context of TTML.
… SMIL does not (I don't recall) define an implicit begin or
end and we did not do that.
… That sounds like a new work item/requirement that is not on
our docket right now.
… I think it is inappropriate to slip it into this PR - it may
be an interesting question and possibly elaborate that
… more in the definition of root temporal extent. But it is
clear in the current language that we have
… an equivalence statement in the specification of the tt
element, so what this change proposes is simply
… to make that usage consistent within the document because we
had a case in the timing
… semantics that did not define that properly.
Pierre: By the way SMIL does define implicit end and implicit
begin.
Glenn: Thank you
Pierre: Do they apply here?
Glenn: That's outside the scope of this PR in my opinion.
Pierre: That's my point, if we are tweaking or capturing the
original intent of root temporal extent then we have
… to get to the bottom of this.
… My interest here is that there has been confusion here about
what the active duration
… of a TTML document is, if you try to render a document
outside its active duration.
Glenn: Durations have a fixed usage in TTML and SMIL that is
independent of the begin and end points.
… If you can resolve the begin and end then the difference is
the active duration.
… I still fail to see how you can interpret the current PR as
an attempt to redefine the root temporal extent,
… especially as we already have the statement that makes that
equivalence.
… If the phrases are different from the intended meaning in
resolve timing, then I don't know what else it could be.
… "Active time duration" sounds like a shorthand for that tt
element definition.
… So this change seems to make this more consistent rather than
less so.
Nigel: By the way that is my position as well.
Glenn: If you think this is redefining root temporal extent I
would like to see the argument for that.
… It is not the intent, and if it were true then we would have
to revisit the language in the tt element as well, which is
… not in the scope of this issue.
… I have no objection to revisiting and trying to fine tune the
use of the term root temporal extent.
Nigel: Thank you, we're running out of time. Anyone else have
anything to add on this?
… [no] - we need to work out a way to resolve.
… I brought this to the group to try to work out how to get to
consensus on the PR.
Pierre: Maybe we're closer than you think - remove the note,
and take the "i.e." out, but ultimately the
… root temporal extent is application specified.
Nigel: Thank you, please could you comment on the pull request
so we can end the call?
Pierre: Happy to stay on and discuss further if you have time.
SUMMARY: Nigel, Pierre and Glenn to continue discussions.
Meeting close
Nigel: Thanks everyone, let's adjourn for today. [adjourns
meeting]
Minutes manually created (not a transcript), formatted by
[12]scribe.perl version 136 (Thu May 27 13:50:24 2021 UTC).
[12] https://w3c.github.io/scribe2/scribedoc.html
Received on Thursday, 24 June 2021 17:29:07 UTC