{minutes} Geneva F2F day 1 16/09/2014 from Nigel Megitt on 2014-09-17 (public-texttracks@w3.org from September 2014)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Wed, 17 Sep 2014 02:08:44 +0000
To: TTWG <public-tt@w3.org>, "public-texttracks@w3.org" <public-texttracks@w3.org>
Message-ID: <5941EAB8802D6745A7D363D7B37BD1F749B68051@BGB01XUD1012.national.core.bbc.co.uk>

Minutes from today's joint TTWG/TTCG meeting (day 1 of 2) available in HTML format at http://www.w3.org/2014/09/16-tt-minutes.html

In text format:

[1]W3C

[1] http://www.w3.org/

- DRAFT -

Timed Text Working Group Teleconference

16 Sep 2014

See also: [2]IRC log

[2] http://www.w3.org/2014/09/16-tt-irc

Attendees

Present
elindstrom, tmichel, Frans_EBU, pal, Cyril, courtney,
andreas, glenn, nigel, Loretta

Regrets
Chair
nigel

Scribe
nigel

Contents

* [3]Topics
1. [4]Introductions
2. [5]Agenda
3. [6]Work done so far
4. [7]Logical step through
5. [8]Agenda
6. [9]Document Structure
7. [10]Layout
8. [11]Summary of the day
* [12]Summary of Action Items
__________________________________________________________

<trackbot> Date: 16 September 2014

[13]https://www.w3.org/wiki/TimedText/geneva2014#Day_1_0900-1700

[13] https://www.w3.org/wiki/TimedText/geneva2014#Day_1_0900-1700

Introductions

<scribe> scribeNick: nigel

Introductions - Nigel, BBC

andreas: IRT

Cyril: Telecom ParisTech university; GPAC

elindstrom: Opera software

zcorpan: Opera software

tmichel: W3C, staff contact for the TTWG

pal: Movielabs

courtney: Apple

glenn: Representing various over the years, currently Cox,
previously Samsung and Microsoft

frans_EBU: Coordinator of EBU group on subtitling

Agenda

nigel: goes through agenda on wiki page, all happy with that.
... We need to think about how we capture our output, and who
will edit the note.

courtney: I'm happy to edit the note.
... I don't have a document yet, I've been working on the code
first, and have some issues to tackle, and a spreadsheet for
attributes.

glenn: For browser implementations mapping direct from TTML to
HTML would be more efficient
... If the purpose is for direct display then this mapping
would be better, but if we want to interchange to WebVTT then
that translation would still be useful.

courtney: I'm interested in captions both inside and outside
browser environments so I'm not focused on HTML solely.

andreas: From the mapping we have done we will quite quickly
see the overlap - maybe there's a cut and paste into HTML as
glenn mentioned.

pal: Re WebVTT outside browsers?

courtney: Yes, e.g. in an ISO MP4 file that is rendered in a
video player.

pal: So do we need CSS in practice? To present WebVTT in
subtitles and captions?

courtney: You certainly can, but it depends on how fancy you
want to be. You can do basic 608 without CSS.

andreas: you need CSS to do colours, and that's certainly
required in Europe.

courtney: We define for example a simple mapping from CSS to a
property list. I think the better approach is to stick with CSS
and
... have a way to embed it in an MP4 file track, and also in a
WebVTT file.

pal: Will the mapping we do today include that?

courtney: yes

Cyril: +1

courtney: I've been thinking that one TTML file will map to a
WebVTT file + a CSS file

glenn: That's what I've been thinking, and there's a reusable
overlap into HTML/CSS

nigel: I've created a wiki page at
[14]https://www.w3.org/wiki/TimedText/TTMLtoWebVTT

[14] https://www.w3.org/wiki/TimedText/TTMLtoWebVTT

<zcorpan> can you paste the number in irc?

Work done so far

andreas: presents work so far
... This work has been supported by the HBB4ALL project, whose
target is to roll out accessibility to IP connected devices,
including subtitles, signing and audio (video) description,
... with a focus on hybrid broadcast.
... This is based on EBU-TT-BasicDe as a very restricted TTML
feature set.
... In fact it's a subset of EBU-TT-D which is a subset of TTML
plus a couple of small extensions.
... It has a video frame with a safe area, 10% in from each
edge.
... Alignment is top or bottom only, vertically.
... Horizontally, centred, left or right.
... For Germany, it's left to right, top to bottom writing
direction.
... There are 8 different text foreground colours, as from
WSTeletext.
... All subtitles have the same background color, font-family,
font-size and line height.
... Line breaking is done manually with the <br/> element at
authoring.

glenn: How is the background padding extended on either side of
the text?

andreas: That's just in the example image, it's not actually
present.
... How is this mapping achieved? Positioning, Styling, Timing.
... Positioning:
... [shows video frame with image of Verona]
... In TTML and EBU-TT there's a root container. In EBU-TT it's
always the height and width of the video. WebVTT uses the
viewport concept,
... which I understand to be the height and width of the video
also.
... For the safe area, we define the tt:region, with top-left
being 10% 10% in x y as specified by the origin.
... The CSS properties are top and left
... The extent is 80% 80%, which in CSS is the width and height
of the block level element eg the div
... To place a subtitle the region is defined once in the head
and then referenced by the tt:p element. This is similar to a p
in html.
... The paragraph gets the width of the region, and the height
is calculated by the number of lines inside the p element.
... Vertical alignment is displayAlign: bottom or top.

nigel: Will there be CSS mappings for all of these in this
presentation?

andreas: This is setting out the features to map, we should
consider them in scope for our mapping later.
... I didn't use the advanced WebVTT concepts of cue alignment.
I wanted something that would certainly work in current
browsers.
... In WebVTT I've put the cues in for the text. For a width of
80% the cue box has size: 80%
... The height is defined by the number of lines, just like the
p element.
... This is per cue, so the settings seem to need to be
repeated every time. I don't know a way to define it once and
have it carried through.

courtney: If you use a region you can do that.

andreas: I didn't use a region.

courtney: Then you have to repeat it.

andreas: So that's positioning. We can define the position of
the box from the left of the video frame, with 10%, using
position:10% align:left
... The align setting is important. It works very differently
than in TTML e.g. if you set align:middle and position:10% then
the reference point for the middle isn't the cue
... start but is the middle of the cue.
... So to centre the text then you have position:50%
align:middle
... For vertical alignment it's a bit trickier. To come 10% up
from the bottom you can set line:90% or a line number value.
... But this doesn't align the end of the cue box, but aligns
the top of the cue box. So that doesn't work.
... What you actually need is position:100% - margin - height
of cue-box.
... That works if you have a lot of control over the font
height and can calculate the position this way.
... In most cases that's a bit risky. So then I changed to the
other possibility, to use line alignment
... The first line in the cue generates the line grid, then you
can position the cue box with positive line numbers from the
top
... or negative line numbers starting from -1 from the bottom.
... [example shows text one line up from bottom]
... You have to have the snap to lines flag set - this happens
automatically if you use line numbers.
... For one line you can have line:-2, or for a two line
subtitle, line:-3. Needs a bit of calculation.
... A dirty trick possibly is always to set it to -1 and let
the renderer push it up. Possibly this is not recommended but
it may work.
... Styling:
... In EBU-TT-BasicDE there's a default style defined once in
the head, and a div element that references the defaultStyle.
... In WebVTT you can define a general cue selector ::cue and
use almost the same property names and values.
... For font-size some calculation is needed. 60% font size in
TTML comes out at 5.33% of the height of the video, which is
100% in CSS.
... A separate CSS file is needed to contain the ::cue
selector.
... For inline styles in TTML we set the colour attributes on a
style referenced from a span.
... In CSS you can use the pseudo-selector ::cue(c.textWhite) {
color: #ffffff; background-color:rgba(0,0,0,0.7); }
... Then in the VTT c.textWhite cue class
... Timing:
... In TTML put a begin and end on, with media timeBase,
reference sync is zero. In EBU-TT-BasicDE the fractional
seconds are limited to 3 digits.
... This is the same for WebVTT cues.
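
The positioning settings described in this presentation can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the values (position:10%, align:left, size:80%, and a negative line number that moves one step further from the bottom for each extra line of text) are taken from the presentation as minuted, and the setting syntax is the one discussed at the meeting rather than that of any later WebVTT revision.

```python
def safe_area_cue_settings(line_count, margin=10):
    """Build a cue settings string for a left-aligned cue inside a safe area.

    margin is the safe-area inset in percent of the video width. The default
    of 10 matches the 10% inset described in the minutes.
    """
    if line_count < 1:
        raise ValueError("a cue needs at least one line of text")
    size = 100 - 2 * margin
    # As minuted: line:-2 for a one-line cue, line:-3 for a two-line cue.
    line = -(line_count + 1)
    return f"position:{margin}% align:left size:{size}% line:{line}"


def format_cue(begin, end, text):
    """Format one cue; the settings are repeated per cue, since no region is used."""
    settings = safe_area_cue_settings(len(text.splitlines()))
    return f"{begin} --> {end} {settings}\n{text}\n"


print(format_cue("00:00:01.000", "00:00:03.000", "First line\nSecond line"))
```

Because the settings are computed per cue, the sketch also shows the repetition noted above when regions are not used.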

pal: What are the rules for CSS styles when combined with
locally set rules? Which takes precedence between author and
user choices?

courtney: We would consider user choices to override author
styles.

pal: If you're displaying it on a web page, then web styles
taking over seems like not the right thing to do.

andreas: It's not clear to me how the CSS that applies to the
web page interacts with the VTT cues. From testing there's no
relationship.
... The video is a separate viewport with independent styling,
from my testing anyway.

Cyril: I think that's not expected. I remember that the cues
are sourced in the HTML page so the styles should be applied.

andreas: I tried it out in Opera.

zcorpan: The styling was implemented in presto - I'll put
together a quick demo and paste the link

andreas: One important point is that we put the background
color just behind the text not the box. From what I read
there's no possibility
... in WebVTT to put the background only on block level
elements, e.g. the whole region/p/div etc.
... It only puts the background behind each glyph. I think
there's a WebVTT background box concept but it doesn't seem to
apply to the block level.

glenn: So TTML allows the background to be specified on the
containing block and possibly differently on the span or the p
within the larger block.
... So this example (showing two spans each with its own
background color) wouldn't be possible?

andreas: That's right. In Europe both possibilities are in use.
... We need to be aware of this restriction in the mapping.

<zcorpan>
[15]http://w3c-test.org/webvtt/rendering/cues-with-video/processing-model/basic.html
has styling

[15] http://w3c-test.org/webvtt/rendering/cues-with-video/processing-model/basic.html

zcorpan: This shows how a stylesheet applies to WebVTT cues -
the stylesheet is in the HTML page and the cues use those
styles
... There's a white video behind it.

pause for 4 minutes, back at 10:33 (CET)

<zcorpan> wrt to the positioning discussion, there are open
bugs on the webvtt spec for both changing how positioning works
and for adding something that allows for exact positioning.
[16]https://www.w3.org/Bugs/Public/buglist.cgi?quicksearch=webvtt%20positioning&list_id=43983

[16] https://www.w3.org/Bugs/Public/buglist.cgi?quicksearch=webvtt%20positioning&list_id=43983

<zcorpan>
[17]https://www.w3.org/Bugs/Public/show_bug.cgi?id=25632

[17] https://www.w3.org/Bugs/Public/show_bug.cgi?id=25632

nigel: we're reassembling...

courtney: Here's what I've discovered from writing mapping
code.
... There's an issue that we don't have an official WebVTT spec
yet - we're working off drafts that aren't versioned.
... When Andreas was talking he was using browser supported
features. This is causing a bit of an issue. The mapping I've
been doing is off the most
... current WebVTT spec version.
[18]http://dev.w3.org/html5/webvtt/
... Here are 3 categories of issue:
... 1. TTML is more hierarchical than WebVTT
... 2. The two specs define different properties implicitly vs
explicitly.

[18] http://dev.w3.org/html5/webvtt/

3. The basic problem of converting units (value type
conversions)

scribe: Hierarchical vs Flat:
... WebVTT has a flat structure with no nested elements. TTML
provides a hierarchical structure.
... Metadata: in TTML you can nest metadata hierarchically
[shows ttm:agent holmes and Dr Watson]. In WebVTT you get a
list with no relationships between them.
... Proposal for WebVTT is hierarchical metadata keys

nigel: Is that just metadata or presentation issues too?

courtney: It may be less of an issue for presentation issues
but there are cases where we run into a similar problem.
... Another example: Calculating relative timings
hierarchically in TTML and linearly in WebVTT.
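
To make the relative-timing point concrete, here is a minimal Python sketch of flattening nested times into absolute ones. It assumes parallel time containment only, times already expressed in seconds, and nested dictionaries standing in for TTML elements; the function name and dictionary keys are hypothetical and not taken from either specification.

```python
def flatten_timing(node, parent_begin=0.0, parent_end=float("inf")):
    """Yield (begin, end, text) tuples with absolute times.

    Each node's begin and end are relative to the begin of its parent, and a
    child is clipped to the end of its parent.
    """
    begin = parent_begin + node.get("begin", 0.0)
    if "end" in node:
        end = min(parent_end, parent_begin + node["end"])
    else:
        end = parent_end
    if "text" in node:
        yield (begin, end, node["text"])
    for child in node.get("children", []):
        yield from flatten_timing(child, begin, end)


# A container starting at 10 s holding two paragraphs timed relative to it.
tree = {
    "begin": 10.0,
    "children": [
        {"begin": 1.0, "end": 3.0, "text": "Hello"},
        {"begin": 4.0, "end": 6.0, "text": "World"},
    ],
}
flat = list(flatten_timing(tree))
```

The flat list is the linear form a WebVTT file needs; sequential containers and other timing features are deliberately left out of this sketch.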

Cyril: I think some profiles restrict that.

andreas: Yes, EBU-TT-D doesn't allow nested timing.

Cyril: That raises the question which profile are we looking
at?

Courtney: Yes, we can simplify the problem by specifying a
profile.

glenn: It's useful, though it may take longer, to start from
the general case and identify where in the absence of a profile
there are issues.
... For example re timing and even styles we could define a
mapping based on the sequence of Intermediate Synchronic
Documents, to remove the timing issues.
... Just documenting these issues is useful.

nigel: We decided last week to use TTML1SE and WebVTT.

andreas: for styling there's some hierarchical structure in
WebVTT too, by application of class nodes that are nested.

courtney: Yes you can have nested styles within a cue but if
you want the same style for 10 cues you can't put them in a
fragment and declare it at the fragment level.
... Implicit vs Explicit:
... Some functionality is explicitly described by attributes or
parameters in one spec but implicitly derived in the other.
... For example, horizontal writing direction. In TTML there's
a way to specify horizontal direction but in WebVTT there isn't
(unless it's vertical) - it's inferred from the font.

glenn: tts:direction is designed to work in relation to the
Unicode bidi control characters
... absent of those you can still infer directionality based on
the content of the element, though it's harder with mixed
content.
... So the direction attribute in TTML doesn't really say
'write right to left' but does specify the default writing
direction in the absence of bidi.

courtney: WebVTT has bidi too, and rtl and ltr entities.

andreas: In Unicode the information is already there.

glenn: You have to look at the history of Unicode - people
didn't want to use nestable control codes so they wanted CSS
attributes to do the same thing.

<zcorpan>
[19]http://dev.w3.org/html5/webvtt/#h4_processing-model says
how to determine direction

[19] http://dev.w3.org/html5/webvtt/#h4_processing-model

zcorpan: The horizontal direction is taken from the text in the
cue, not from the font (in WebVTT)
... You can override it with unicode bidi characters if you
want.

nigel: Seems like there's no issue to log in our issues list.

<zcorpan> "Apply the Unicode Bidirectional Algorithm's
Paragraph Level steps to the concatenation of the values of
each WebVTT Text Object in nodes, in a pre-order, depth-first
traversal, excluding WebVTT Ruby Text Objects and their
descendants, to determine the paragraph embedding level of the
first Unicode paragraph of the cue. [BIDI]"

glenn: TTML has the CSS features as well as the plain text.

courtney: Example 2: line breaks - need to be explicit in TTML
but can be just new lines in WebVTT.

Cyril: That's due to the parser - XML requires this.

andreas: Later on we can look at xml:space attributes. From the
tests I've seen with xml:space="preserve" then line breaks
should be preserved.

<zcorpan> XML doesn't require it really

glenn: In XSL-FO there are 4 different properties. We define an
explicit mapping of xml:space to sets of those values, in TTML.
We didn't expose the full XSL-FO model.

courtney: Value Type Conversions

<glenn> tnx 4 reminder

courtney: Example 1 - times
... TTML has different time expressions, WebVTT always has
hh:mm:ss.sss with fractional seconds.
... Fortunately the ttp: namespace defines all the required
metadata to do the conversions.
... Though I'm not sure that's the case with lengths and
position values
... Again TTML allows a broader set of units - pixels, em,
cells, %ages
... I'm assuming lineHeight is sort of like em. For some TTML
documents I think you need the authored video dimensions to do
the mapping.
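
The time conversion can be sketched in Python as below. This is a simplified illustration, not a complete TTML time expression parser: it handles clock times with a fraction or a frame count, and offset times with the h, m, s, ms, f and t metrics, and it ignores sub-frames, frame rate multipliers and drop modes. The function names are hypothetical, and the frame_rate and tick_rate arguments stand in for the values a converter would read from the ttp: parameters; the defaults shown are illustrative.

```python
import re
from fractions import Fraction

_CLOCK = re.compile(r"^(\d{2,}):(\d{2}):(\d{2})(?:\.(\d+)|:(\d{2,}))?$")
_OFFSET = re.compile(r"^(\d+(?:\.\d+)?)(h|ms|m|s|f|t)$")


def ttml_time_to_seconds(expr, frame_rate=30, tick_rate=1):
    """Convert a (simplified) TTML time expression to exact seconds."""
    match = _CLOCK.match(expr)
    if match:
        hours, minutes, seconds, fraction, frames = match.groups()
        total = Fraction(int(hours) * 3600 + int(minutes) * 60 + int(seconds))
        if fraction:
            total += Fraction(int(fraction), 10 ** len(fraction))
        if frames:
            total += Fraction(int(frames), frame_rate)
        return total
    match = _OFFSET.match(expr)
    if match:
        scale = {
            "h": Fraction(3600),
            "m": Fraction(60),
            "s": Fraction(1),
            "ms": Fraction(1, 1000),
            "f": Fraction(1, frame_rate),
            "t": Fraction(1, tick_rate),
        }[match.group(2)]
        return Fraction(match.group(1)) * scale
    raise ValueError(f"unsupported time expression: {expr!r}")


def webvtt_timestamp(seconds):
    """Format seconds as hh:mm:ss.ttt, rounded to the nearest millisecond."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def ttml_to_webvtt_time(expr, frame_rate=30, tick_rate=1):
    return webvtt_timestamp(ttml_time_to_seconds(expr, frame_rate, tick_rate))
```

Exact fractions are used so that frame-based times do not pick up floating-point error before the final rounding to milliseconds.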

pal: I think if you use %age or c you don't need the video
dimensions. If you're going to use pixels then implementations
should use tts:extent on the root as well.

glenn: By specifying extent on the root you can derive a pixel
dimension - this doesn't tell you the pixel relationship to the
video though.
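
As a worked example of the pixel case, the following Python sketch converts a pixel pair to percentages of the root container. It assumes the root container extent is known and, for presentation purposes, coincides with the video frame, which as noted in the discussion is not guaranteed for general TTML documents. The function names are hypothetical.

```python
def parse_px_pair(value):
    """Parse a two-component pixel value such as '1920px 1080px'."""
    parts = value.split()
    if len(parts) != 2 or not all(part.endswith("px") for part in parts):
        raise ValueError(f"expected two pixel lengths, got {value!r}")
    return tuple(float(part[:-2]) for part in parts)


def px_pair_to_percent(value, root_extent):
    """Express a pixel pair as percentages of the root container extent."""
    x, y = parse_px_pair(value)
    width, height = parse_px_pair(root_extent)
    return (100 * x / width, 100 * y / height)


# An origin of 192px 108px in a 1920px by 1080px root container.
origin_percent = px_pair_to_percent("192px 108px", "1920px 1080px")
```

With these inputs the origin comes out at 10% in each direction, matching the safe area described earlier in the meeting.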

andreas: An issue is that in general the root container pixel
dimensions are not necessarily coincident with the video
dimensions.
... The document has no way to specify this in TTML, in
general.

pal: CFF-TT and EBU-TT-D relate the root container to the
video. IMSC introduces an aspect ratio. All the profiles
specify how the mapping goes.

andreas: For general TTML documents this is an issue.

courtney: Attribute mappings
... Some are straightforward.
... Though WebVTT IDs can be purely numeric, and xml:id doesn't
allow that. So some modification or convention may be needed,
e.g. "cue"+number.
... We could define the best practice.
... Both use BCP47 language values
... Preserve space needs further discussion.
... Styling attributes: colors, fonts etc are fairly
straightforward.
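
The "cue"+number convention suggested above could look like this in Python. It is a sketch of one possible convention, not an agreed best practice: the function name and the default prefix are assumptions, only purely numeric identifiers are handled, and collisions with existing identifiers are not checked.

```python
import re


def webvtt_id_to_xml_id(cue_id, prefix="cue"):
    """Return an identifier usable as xml:id for a purely numeric cue id.

    Other identifiers are returned unchanged; a real converter would also
    need to deal with ids containing characters not permitted in xml:id.
    """
    if re.fullmatch(r"[0-9]+", cue_id):
        return prefix + cue_id
    return cue_id
```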

pal: Is there a subset of CSS that's supported for WebVTT?

<zcorpan> [20]http://dev.w3.org/html5/webvtt/#css-extensions is
the subset

[20] http://dev.w3.org/html5/webvtt/#css-extensions

andreas: In WebVTT there's a subset of properties that are
permitted. E.g. padding is not allowed.

courtney: One requirement set is what's needed for CEA608. It
would be useful to have a standard set of CSS classes that can
be used for any CEA608 translations into WebVTT.
... There are some properties with no WebVTT equivalent:
display, overflow, padding, showBackground.
... For alignment, displayAlign maps to the latest version of
the WebVTT spec.

andreas: I tried it out, and it would work perfectly.

courtney: But they're not widely supported yet. The mapping is
nicer at least.

<zcorpan> "the properties corresponding to the 'background'
shorthand" is allowed, if that is what showBackground does

zcorpan: any properties other than those listed in the spec
will be ignored.
... I'm not sure how the TTML features map to those but there
is a defined subset in the spec.

courtney: To expand on that, for things like textDecoration in
TTML you can have underline set on a cue, but for the rest
you'd have to go to CSS?

zcorpan: For underline you can use CSS or the <u> element
inside a cue.

courtney: visibility and zIndex - I can't see how to do those
in WebVTT.
... extent can be done with a cue box size or a region size.
... A lot of the timing in the ttp: namespace metadata doesn't
map to the WebVTT because the timing that's allowed is a lot
simpler.

zcorpan: visibility and zIndex is not possible in WebVTT.

nigel: can't you do visibility with opacity?

zcorpan: yes you can do visibility.

courtney: there are also the attributes "use", "value" and
"type".

glenn: Those are in the profile definition mechanism - they're
not content or style based.

Cyril: does this mean they don't have to be mapped?

courtney: since there are no profiles in WebVTT I guess not.

glenn: This is all part of the TTML way to specify what a
processor needs to support, based on SMIL and SVG originally.
... I think it can probably be ignored but needs more thought.

andreas: If we do not find a direct mapping between WebVTT and
TTML that doesn't mean that we can rule it out for the mapping
... because there's some intent in the source document and we
have to check if there's something that needs to be done.

courtney: Ruby: there's no simple mapping from WebVTT to TTML
for ruby.

glenn: In TTML1 you have to do the work at authoring time and
use regions to place the ruby in the right place.
... I've recently specified in TTML2 the ruby markup.

Cyril: There may be several ways to define the same thing, so
we should try to use a canonical representation as the mapping
source.
... For example there are several ways of expressing timing -
maybe a requirement before mapping is a single syntax. I'm not
sure if this is possible.

courtney: it may be an interesting way to break the problem up.

Cyril: A problem I've seen before is that when attributes need
to be resolved at runtime based on context, e.g. frame rate,
video size etc there's not much that can be done.
... We maybe need to classify those attributes that can be
mapped offline vs those that need full context to resolve.

courtney: that's my presentation.

Cyril: There's also the question of which TTML profile to use.
But also there are different classes of WebVTT: valid or not?
parsable or not?
... Invalid documents may be presented okay by browsers. We
should say which class we're looking at.
... Then WebVTT can represent metadata, chapters, subtitles,
captions etc. so we should indicate which ones we're mapping,
if not all.

Logical step through

nigel: Processing model

Cyril: how does TTML handle overlapping times?

glenn: there's arbitrary overlap permitted.
... The first step I'd advocate is to create the intermediate
synchronic documents and map to WebVTT.
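
The idea of slicing overlapping content into non-overlapping intervals can be illustrated with a short Python sketch. It is a simplification for discussion and not the ISD construction procedure from the TTML specification: it only computes, for each interval between successive begin or end times, which pieces of content are active. The function name and the tuple layout are hypothetical.

```python
def slice_timeline(items):
    """Split overlapping (begin, end, content) items into flat time slices.

    Returns a list of (start, stop, active_contents) tuples in time order,
    where the set of active content is constant within each slice.
    """
    items = list(items)
    boundaries = sorted({t for begin, end, _ in items for t in (begin, end)})
    slices = []
    for start, stop in zip(boundaries, boundaries[1:]):
        active = [content for begin, end, content in items
                  if begin < stop and end > start]
        if active:
            slices.append((start, stop, active))
    return slices


# Two overlapping pieces of content become three non-overlapping slices.
slices = slice_timeline([(0, 4, "A"), (2, 6, "B")])
```

Each resulting slice could then be mapped to one or more cues with no remaining overlap in time.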

Cyril: In WebVTT there's the concept of cues becoming active
and then bumping up existing visible cues.

some discussion of how this is handled in TTML

andreas: Formally the concept of creating the ISDs makes a lot
of sense - we need to make sure everyone understands what that
means.

glenn: I agree. For example one thing that may not be obvious
is that style inheritance is only defined on ISDs so one has to
perform the ISD creation prior to style inheritance.
... I've also added a function on the TTV tool to generate the
set of ISDs.

nigel: We have a choice here to map ISDs or specific bits of
cue text.
... This impacts efficiency and metadata.

pal: This depends on the use case - if we just have the goal of
getting equivalent presentation then efficiency and metadata
are secondary concerns.

elindstrom: from a browser perspective we're interested in
accurate presentation.

courtney: I've been thinking about it the opposite way - from a
TTML to WebVTT conversion preserving semantics.

andreas: Would it be possible to take Courtney's attribute list
and make it a structured document, take it as a header, explain
the problem scenario,
... and indicate what the options and recommendations are from
the WG?
... If you try to map abstractly the logical model then it's
very hard. Something more concrete may be a better start.

pal: This is a question of how complicated we want to make it -
I haven't heard of anyone wanting to use WebVTT as a
master/archive/mezzanine format.

glenn: There's a use case for distribution though.

pal: I can see the use case of converting the TTML experience
into a WebVTT experience.

glenn: Part of this may be timing oriented in the sense that
user agents may potentially add TTML renderers directly, which
would reduce the future needs.
... But there may still be WebVTT-only presentation devices.

pal: The issue for me is about the non-presentation-based usage
of WebVTT.

elindstrom: I don't expect that to be a huge use case.

nigel: Seems like we've been considering TTML -> WebVTT here.
Does the same consideration apply the other way?

courtney: WebVTT does roll-up - I'm not sure how we do that
with TTML.

glenn: we may need to consider using the set element in TTML1.

pal: When you say roll-up you mean where there's an animation
displayed?

glenn: yes, gradually moving up.

pal: To do that explicitly in TTML you need animation, but what
is possible is to have a region that contains line A at t=0 and
at t=1 line B is added, moving line A up.
... This doesn't require any animation.
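
The discrete, non-animated roll-up described here can be sketched in Python as a sequence of region states. The function name, the rows parameter and the tuple layout are hypothetical; the sketch only models which lines are visible at each step, not how a renderer draws them.

```python
def discrete_roll_up(lines, rows=2, end=None):
    """Return the visible lines of a roll-up region after each new line.

    lines is a list of (begin_seconds, text) sorted by begin time. Each state
    lasts until the next line arrives; the last state lasts until end.
    """
    states = []
    visible = []
    for index, (begin, text) in enumerate(lines):
        # Append the new line and keep only the most recent rows lines.
        visible = (visible + [text])[-rows:]
        next_begin = lines[index + 1][0] if index + 1 < len(lines) else end
        states.append((begin, next_begin, list(visible)))
    return states


states = discrete_roll_up([(0, "A"), (1, "B"), (2, "C")], rows=2, end=3)
```

In the second state line A has moved up to make room for line B, and in the third state it has scrolled out of the region.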

glenn: Yes correct but it doesn't do the whole 608 animation.

pal: Then the question is do we need to explicitly define the
roll-up animation.

glenn: Yes, we put in a note that implementation might do that.

courtney: What about paint-on?

glenn: That's no problem.
... Does WebVTT support smooth roll-up as opposed to discrete
line based roll-up?

courtney: I think it does yes, I'll have to confirm.

nigel: As a general point here we can leave it open to the
converter where it's left unstated in the source spec.

courtney: There's a scroll setting on the region in WebVTT that
specifies this.

nigel: Is there anything else regarding processing model that
may affect how we do the conversions?
... So far we have: ISDs, smooth vs discrete scrolling.
... I guess discontinuous markerMode in TTML may be
non-mappable too.

glenn: I've been thinking about this too - I think it would be
modelled by playing back the related media that triggers the
discontinuous smpte events and recording the
... elapsed time to make a conversion from discontinuous to
continuous.
... There's also the clock based timing which is also
interesting! In appendix N we mapped all the timing models to a
potentially continuous timeline.

nigel: I think we should exclude discontinuous marker mode and
maybe clock mode too, as being non-mappable from TTML1 to
WebVTT.

glenn: I think there may be some TTML2 work that can support
this.

nigel: I propose to make our mapping explicitly related to
TTML1 and if there's anything that helps in TTML2 we can update
it later.

glenn: Or we can simply reference the ISD creation process.

<zcorpan> "If region's text track region scroll setting is 'up'
and region already has one child, set region's
'transition-property' to 'top' and 'transition-duration' to
'0.433s'." - smooth rollup in webvtt with scroll:up.
[21]http://dev.w3.org/html5/webvtt/#h4_processing-model

[21] http://dev.w3.org/html5/webvtt/#h4_processing-model

nigel: Maybe we can do both, and reference the ISD generation
process and make a note that in TTML 1 the process isn't
defined in a way that facilitates
... conversion to WebVTT for discontinuous and clock mode
times.

courtney: If we refer to ISD conversion rather than TTML1
what's the reference document?

glenn: I'm working on this for TTML2.

courtney: Is there a draft document to refer to?

andreas: If you make the ISD concept central to the mapping it
must be fully elaborated so that everyone can understand it.

glenn: I agree but I think there's no way to avoid it other
than to create an alternative flavour of the same thing.
... This is the only way to solve the timing hierarchy problem.
... It also gets around the style inheritance process.

andreas: Formally I agree but it's hard to communicate the ISD
- it wouldn't be a valid TTML document. So the converter
wouldn't be from TTML.

glenn: We do have examples of ISDs in the TTML1 spec, which is
something I'm adding in TTML2.

andreas: ISD creation is specified in TTML1 so I think we can
use what's there. Is anything else needed?

glenn: Yes, the only thing absent is the specification of a
serialised form. We only used ISDs as a didactic construct for
explaining the formatting model.
... In TTML2 I plan to make interchange of ISDs possible in a
standard way.
... It would also be useful for this exercise. Now I have an
implementation already those things combine to make this
progressable.

pal: For mapping can we simply assume that an ISD is a valid
TTML document that happens to be static?

glenn: almost - it's not quite the same because there's some
transformation, e.g. the body element is copied and reparented
to the region elements that are temporally active.

courtney: My feeling is that this is just trading off one set
of problems for another.

pal: I was hoping that ISD could just be used to mean 'the
state of a TTML document between successive events'.

Cyril: do we have a presentation on ISDs?

glenn: No, though I could do it verbally.

andreas: Maybe if it's in the TTV software we could have a look
at some simple examples?
... So we don't get stuck here, can we start on attribute
mappings that have to be done either way?

courtney: I'd prefer to stick with TTML rather than ISDs and
defer some of these problems.

nigel: +1. Most of the problems are just about timing.

glenn: Unfortunately that's not true - there's also the problem
of associating content with regions and then performing region
style inheritance.
... In the ISD document the content has been associated with
individual regions and then region style inheritance, and if
you don't go through the ISD process then the latter breaks.

nigel: I think you can do the style computation without making
the ISD.

glenn: There's a risk of duplication of effort.

courtney: I think you can map directly.

nigel: I want to defer timing issues to ISDs and do everything
else directly.

glenn: To be clear I didn't mean previously that we need to
serialise the ISDs

Cyril: We talked earlier about categories - we need to think
about metadata etc.

pal: I've not heard those use cases.

Cyril: Can we assume that metadata-only WebVTT files are out of
scope of this?

glenn: I guess the issue is searchability - if there are use
cases that need searchability e.g. characters, roles, other
agents, then we might need to consider that.
... If we're strictly talking about presentation then maybe we
don't need to consider that.
... In WebVTT can you use metadata to define larger classes for
presentation?

courtney: The only thing I've encountered along those lines is
voice, which may be one example. The approach I've taken is
just to map what is possible to map.
... In the document we can describe what's well defined and
note what can't be supported.

andreas: I agree - we should publish something sooner and limit
certain parts to a canonical representation if there are
multiple ways to express the same thing.
... We can decide on a feature by feature basis what to limit,
for example.

Cyril: we didn't talk about which mapping direction we're
talking about.

nigel: it's both.

andreas: Additionally there are, e.g. in Germany, cases where
browsers aren't used to present content, and renderers only
understand TTML.
... So we need to go both ways.

nigel: Adjourns for lunch - return at 1330 CET.

<zcorpan> i will call in 14:00. then 15:00-15:30 i will be
absent again

<zcorpan> correction. i will call in now but be absent between
14:00-14:30 and 15:00-15:30

trackbot, this is ttml

<trackbot> Sorry, nigel, I don't understand 'trackbot, this is
ttml'. Please refer to
<[22]http://www.w3.org/2005/06/tracker/irc> for help.

[22] http://www.w3.org/2005/06/tracker/irc

trackbot, start meeting

<trackbot> Meeting: Timed Text Working Group Teleconference

<trackbot> Date: 16 September 2014

<scribe> chair: nigel

<scribe> scribeNick: nigel

Agenda

nigel: We may switch things around tomorrow due to changes to
flights etc.

We will capture output at
[23]https://www.w3.org/wiki/TimedText/TTMLtoWebVTT where I
enter 'wiki' in the minutes

[23] https://www.w3.org/wiki/TimedText/TTMLtoWebVTT

Document Structure

courtney: Can we go through TTML elements?

Cyril: can we map the tt element to the top of a WebVTT
document?

glenn: explains TTML structure down to style attributes.

Cyril: Suggests defining a style class in WebVTT corresponding
to each style in TTML

glenn: yes, we can do this.

courtney: Yes. Right now the CSS document is separate, but in
the future it could be embedded.
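
The suggestion of one WebVTT style class per TTML style could be sketched in Python as below. This assumes that a TTML style name maps to a CSS property by turning camelCase into hyphenated form, which holds for names such as color and backgroundColor but not for every TTML style attribute, and it copies values unchanged, which would be wrong for properties such as font size that need recalculation. The function names are hypothetical, and the selector form follows the ::cue(c.textWhite) example given earlier in these minutes.

```python
import re


def to_css_property(ttml_name):
    """Turn a camelCase name such as backgroundColor into background-color."""
    return re.sub(r"([A-Z])", lambda m: "-" + m.group(1).lower(), ttml_name)


def cue_class_rule(style_id, properties):
    """Build one CSS rule for the cue class named after a TTML style id."""
    declarations = " ".join(
        f"{to_css_property(name)}: {value};" for name, value in properties.items()
    )
    return f"::cue(c.{style_id}) {{ {declarations} }}"


def wrap_in_class(style_id, text):
    """Wrap cue text in a class span that the rule above selects."""
    return f"<c.{style_id}>{text}</c>"


rule = cue_class_rule(
    "textWhite", {"color": "#ffffff", "backgroundColor": "rgba(0,0,0,0.7)"}
)
```

The rules would go into the separate CSS document, while the class spans go into the cue text of the WebVTT document.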

pal: Will there be feedback into WebVTT from this?

courtney: There are competing desires here - yes, in principle.

Cyril: can we go through these?

glenn: Let's keep going with structure.
... Takes group through region properties - including style
attributes for origin and extent, and referential approach.
... Each region has an id. If there are no regions defined
there's a default, covering 100%.

Cyril: How different is this from WebVTT regions?

courtney: WebVTT regions can not have styles, but the layout
information translates pretty directly.

glenn: For example tts:opacity is a region-specific property.
backgroundColor can apply to regions independently of the
content in the region.
... There are a number of style properties that only apply to
regions.

andreas: Can a region be compared to a div element in HTML?

glenn: yes.

andreas: So this is the only element that can be positioned
absolutely within the root container.

glenn: moves to body

Cyril: Will we have an output document structure with headers
and bodies, with two subsections - for styling and layout?

Courtney: yes.

Glenn: That's not a bad way to do it.

courtney: Part of this will describe the separate CSS and
WebVTT document.

glenn: takes us down through body, div, p and span.
... div can contain div; p can not contain p; p can not contain
div; div can not contain text.

Cyril: so p is equivalent to a cue?

courtney: seems that way.

glenn: Timing can be specified on body, div, p, span and br.

Cyril: cues can have nested timing in spans.

pal: is there a reason why each p can't map to a cue?

glenn: my mental model of a cue is that it is not overlapping
in time with other cues. I think this makes things easier.

pal: But if we can map a p to a cue then the mapping is
simpler.

courtney: What else would it map to?

glenn: Are you still assuming time has been flattened down and
sliced?

pal: Yes.

glenn: So there are no overlaps. At that point content that is
selected into regions is present and everything else has been
filtered out.
... every piece of content is associated with a single region
in TTML.

Cyril: same in WebVTT.

glenn: So the concept is to start from body, work down, and
associate each piece of content with a region.
... So if there's a region we're not interested in we can
filter out that content.
... So there may be multiple <p>s all mapping into a single
cue.

courtney: With WebVTT you'd define regions, and for each cue
reference the region id.

glenn: That's exactly how it works in TTML but with the ability
to inherit region from an ancestor.
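
A minimal sketch of the region inheritance glenn describes,
assuming the region attribute on a p overrides one inherited from
its nearest ancestor that has one. The function name is
hypothetical, and the full TTML region association rules (for
example regions referenced from spans) are not covered.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

TT = "{http://www.w3.org/ns/ttml}"  # TTML namespace


def regions_for_paragraphs(ttml: str) -> List[Tuple[str, Optional[str]]]:
    """Return (text, region id) for every p, inheriting region from ancestors."""
    root = ET.fromstring(ttml)
    result: List[Tuple[str, Optional[str]]] = []

    def walk(element: ET.Element, inherited: Optional[str]) -> None:
        # An explicit region attribute wins over the inherited value.
        region = element.get("region", inherited)
        if element.tag == TT + "p":
            result.append(("".join(element.itertext()).strip(), region))
            return
        for child in element:
            walk(child, region)

    body = root.find(TT + "body")
    if body is not None:
        walk(body, None)
    return result
```

Each resulting pair could then become a WebVTT cue that references
the region id, as courtney outlines.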

Cyril: So you can in principle flatten the TTML structure and
remove the <div>s.

glenn: You can't remove the <div>s because they specify breaks
and style.

Cyril: But you could propagate down.

nigel: You can paint the background of a div so if you remove
it then some information is lost.

andreas: is there a layout impact of div?

glenn: It implies a breaking boundary in the line progression
direction and it may contain styling.

group: discusses slicing apart divs into multiple <p>s each of
which generates a cue.

Cyril: so if I start by resolving all the style references on a
p, flattening out all the styles, then...

glenn: so you can now enumerate all the <p>s and <div>s and
assign each to a cue.

courtney: I think we should do that in the document.

glenn: Okay but you may end up with a lot of cues all with the
same timing. If there's no intrinsic limitation on that then we
can go ahead.

Cyril: Layout: so div affects layout?

glenn: Yes, divs can't (spatially) overlap each other within
the same region.

andreas: but the only fixed dimension defined is for the
region, so the height of each p and div depends on the content
flowed into them.
... So there's no difference between the block level boxes that
are generated by divs and ps.

Cyril: We could create artificial regions for divs that have a
background color

nigel: we may have some non-mappable functionality here, if a
region, a div, and a p all have different background colors.

glenn: Also if the div contains a div and both divs contain a
p, and all the background colors are different, then you end up
with different background paint areas

andreas: Can a div create a space that isn't occupied by a p?
If a p covers only 50% of the height of the region then its
parent div will just have the height of its contained <p>s
... and not expand to the height of the region.

glenn: So it will have the same background color as the p

courtney: you can't specify an extent on a div or a p?

glenn: no that's right.

andreas: the width is defined by the region and the height by
the flowed in content.

Cyril: so you can't have a div with a different background
color from its child <p>s?

glenn: That's right because we don't have a margin before or
after.

nigel: I think we've just resolved that <p>s map to cues
(repeating Glenn's earlier joke)!

glenn: In TTML2 we have padding on content elements not just on
region, which might impact this, but thinking about it, it
should be okay because it's not margin.

courtney: What are content elements?

glenn: body, div, p, span, br.

Cyril: What if spans have timing that's shorter than their
parent p?

glenn: If there's an explicit end on the span that makes its
active end prior to the active end of its parent then it would
depend on the fill mode - it's either freeze or remove.
... I'd have to check what we said about this, from SMIL.

andreas: in WebVTT you can have non-ended cues, that last
until... when?

glenn: In TTML if there's an explicit end on the parent
container and the child ends prior to that then there would be
two ISDs, one
... covering the first period and the other covering the second
period, and the span wouldn't be present in the second period.
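
A minimal sketch of the slicing glenn describes, assuming every
element already has a resolved begin and end time in seconds. The
function name and the tuple representation are hypothetical; this
only cuts the timeline at each begin or end time and lists what is
active in each slice, it is not the normative ISD construction.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (begin, end) in seconds, end exclusive


def slice_timeline(elements: Dict[str, Interval]) -> List[Tuple[Interval, List[str]]]:
    """Cut the timeline at every begin/end time and list the active elements."""
    boundaries = sorted({t for begin, end in elements.values() for t in (begin, end)})
    slices = []
    for start, stop in zip(boundaries, boundaries[1:]):
        # An element is active in a slice if its interval covers the whole slice.
        active = [name for name, (begin, end) in elements.items()
                  if begin <= start and stop <= end]
        if active:
            slices.append(((start, stop), active))
    return slices


if __name__ == "__main__":
    # A p active for 10 s whose child span ends after 4 s.
    for interval, names in slice_timeline({"p": (0.0, 10.0), "span": (0.0, 4.0)}):
        print(interval, names)
```

With a p lasting 10 s and a span ending at 4 s this gives two
slices, and the span is absent from the second one.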

nigel: +1

Cyril: so you can have a span that contains text that activates
and deactivates part way through the cue.

glenn: Yes, that would be possible in TTML.

Cyril: Can we do that in WebVTT?

courtney: I don't think so - there's only styling changes part
way through a cue.
... So spans with time on them - would we have to separate them
into separate cues?

Cyril: I don't think that would work because they'd appear on
different lines.
... You'd have to go down to the ISD level.

nigel: Can you have spans with timing?

Cyril: only to switch the text on, not off.
... So not every p is a cue, it's a bit more complicated!

glenn: If you split everything into ISDs that do not overlap
then these problems can be resolved.
... We need to look more at the details and work out if there's
a problem here.
... The only thing we didn't cover is animation. There's a set
element in TTML1 that can also delineate ISD boundaries.
... In TTML2 we're adding continuous animation using the
animate element

In TTML2 ISDs there may be some internal animation within the
ISD.

andreas: it's also worth noting that every element can have
metadata attached.

glenn: On metadata: except for the ttm:agent attribute, which
can appear only on content elements and on region, and which
references agent definitions in the header,
... other metadata elements are all local not referential.

andreas: TTML also allows child elements that are not in a TTML
namespace so it can be extended. A TTML processor is required
to prune these out and not reject
... the document. But it doesn't have to display.

courtney: Does anyone know if we can have metadata in CSS
within a style class?

andreas: you can have comments.

glenn: they're ignored in the CSS object model.

zcorpan: you can have custom properties that can be used for
any purpose including metadata.

<zcorpan> [24]http://dev.w3.org/csswg/css-variables/

[24] http://dev.w3.org/csswg/css-variables/

nigel: Can we go through the WebVTT structure and see how that
maps?

courtney: WebVTT files have a header section that starts with
WEBVTT

[25]http://dev.w3.org/html5/webvtt/

[25] http://dev.w3.org/html5/webvtt/

courtney: Then there can be metadata, such as language,
copyright etc.

Cyril: so when you parse the file, big objects are separated by
double line separators.
... Every piece of text separated by two lines is either a cue
or is a comment not for display.
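
A minimal sketch of the block structure Cyril describes, assuming
blocks are separated by one or more empty lines, comment blocks
start with NOTE, and cue blocks contain a "-->" timing line. The
function name and the "kind" labels are hypothetical; this is not
the WebVTT parser algorithm from the spec.

```python
import re
from typing import List, Tuple


def split_blocks(text: str) -> List[Tuple[str, str]]:
    """Split WebVTT text into (kind, block) pairs on blank lines."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = [b for b in re.split(r"\n{2,}", text.strip("\n")) if b.strip()]
    result = []
    for index, block in enumerate(blocks):
        lines = block.split("\n")
        first = lines[0]
        if index == 0 and first.startswith("WEBVTT"):
            kind = "header"
        elif first == "NOTE" or first.startswith(("NOTE ", "NOTE\t")):
            kind = "comment"
        elif any("-->" in line for line in lines[:2]):
            # The timing line is the first line, or the second if there is an id.
            kind = "cue"
        else:
            kind = "other"
        result.append((kind, block))
    return result
```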

andreas: but comments are not defined?

<zcorpan> [26]http://dev.w3.org/html5/webvtt/#webvtt-comments
comments are defined here

[26] http://dev.w3.org/html5/webvtt/#webvtt-comments

Cyril: no. For example in MP4 carriage you could remove it, or
put it in a previous or next segment - it won't be displayed.

courtney: In the header section you can also include region
definitions.

nigel: so you can't have untimed cues?

Cyril: yes. Can you in TTML?

nigel: yes you can - they have the duration of the whole
document (assuming there's no inherited time from a parent time
container etc)

Cyril: this is in flux in the WebVTT standard, using keywords
like 'Next' for 'until the next cue'.

glenn: during the conceptual ISD mapping process every piece of
content gets timed. Ultimately the active period of the related
media object will determine that time,
... in the absence of any other information.

andreas: We also have to think about multiple <br/>s in TTML
documents, which are allowed, but shouldn't generate multiple
line breaks because they wouldn't
... be displayed in WebVTT.

Cyril: so you could define line numbering or put non-breaking
spaces on otherwise empty lines. I'm not sure how the
backgrounds would be painted for spaces.
... records issue on wiki

andreas: You can use empty spans on each line.

courtney: Identifiers are used - each cue can have an
identifier, which would show up before the begin and end time
lines.
... Also regions have ids that can be referenced in cues.

Cyril: Those cue ids come from SRT - in SRT each cue has to be
a monotonically increasing number with no gaps.
... it's very common to have WebVTT files with numeric
identifiers.

andreas: and the ids can have spaces in between, which isn't
permitted in xml:id

courtney: so we should have a convention for mapping to TTML
Ids.
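
A minimal sketch of one possible convention, assuming cue ids are
made into valid xml:id values by replacing disallowed characters
and prefixing ids that start with a digit. The convention, the
"c" prefix and the function name are all hypothetical, and only
the ASCII subset of the XML name characters is handled.

```python
import re
from typing import Dict, List


def to_xml_ids(cue_ids: List[str]) -> Dict[str, str]:
    """Map WebVTT cue identifiers to unique, ASCII-only xml:id values."""
    mapping: Dict[str, str] = {}
    used = set()
    for cue_id in cue_ids:
        # Spaces and other characters not allowed in an xml:id become "_".
        candidate = re.sub(r"[^A-Za-z0-9_.-]", "_", cue_id)
        # An xml:id cannot start with a digit, "." or "-".
        if not re.match(r"[A-Za-z_]", candidate):
            candidate = "c" + candidate
        unique, suffix = candidate, 1
        while unique in used:
            suffix += 1
            unique = f"{candidate}_{suffix}"
        used.add(unique)
        mapping[cue_id] = unique
    return mapping
```

For example the SRT-style id "1" would become "c1" and "intro cue"
would become "intro_cue" under this assumed convention.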

nigel: Can VTT cue ids be duplicated?

courtney: no.

nigel: the reason for mentioning it is that if we do TTML ISD
-> Cue then the same TTML id may resolve to multiple cues.

courtney: there's something to think about here with slicing
VTT cues into time slices.

Cyril: As long as all the spans in a p align with the end
times of the p then you can keep it as a single cue.

nigel: that's a special case - think of live word by word
subtitles.

Cyril: cues have to be laid out in start time order.
... Within a cue you can have internal timing values, that I
think also have to be in increasing time order (I'm not sure
about that).
... can you have TTML spans that display in reverse time order
compared to the document order?

glenn: Yes, there are no constraints.

Cyril: what about in profiles?

pal: I haven't seen any profile that constrains that out.

glenn: if the TTML time container is a par (parallel) time
container then a child can start after one of its preceding
siblings.
... the order in the content will define the presentation
order (spatially).

pal: IMSC 1 allows a document to be labelled progressively
decodable which forbids timing on descendants of <p>s.

courtney: So that needs to be in the document, i.e. temporal
ordering within the document.

andreas: EBU-TT-D doesn't constrain this but recommends time
ordering. Most legacy formats are sequentially ordered in time
as well.

Cyril: even if the <span>s were out of order in time that
wouldn't be a problem, but out of order <p>s would be a problem.

pal: But going to ISD level would avoid that.

Cyril: adds this issue to the wiki

nigel: Do we have to worry about rtl direction when sorting
spans into order in WebVTT?

glenn: I would expect that when a span is active all text
content of active spans is merged and then directionality is
applied on the result.

courtney: let's leave the identifier mapping convention until
later.

nigel: Voice spans are straightforward aren't they?

courtney: I think voice maps to agent pretty well.

nigel: +1
... What about styling based on voice cue selectors?

courtney: You could define a TTML style for each agent.
... Along those lines you can put styling directly on a span -
in WebVTT I think you'd have to define CSS classes for those.

Cyril: you may not have to scan the whole document but could
create a random hash for every time one is encountered.
... I'm also interested in streaming, transcoding live streams.

glenn: If it's not been converted into an ISD sequence then you
can't avoid parsing the whole document (unless it's
progressively decodable).
... You never know if the last markup element will be timed
prior to the rest.

Cyril: WebVTT documents are always progressively decodable.
... go to example just before section 2 - this has multiple
lines in the header. In this case Regions, but it could be
copyright, anything else.
... So some parts of the header map to regions and others to
metadata.
... continuing on document structure.
... Each cue has a timestamp for start and end, followed by
optional settings.

Courtney: There are additional settings available.

Cyril: they are a combination of styling and layout.

nigel: What about at the end of the document?

Courtney: there's nothing to mark the ends of documents.

Cyril: that's a feature - you can concatenate two WebVTT files,
and if the timestamps obey the time rules then it's valid.
... The second header would be ignored.

pal: what about styles?

andreas: We also need to think about error handling -
processing of invalid documents.

nigel: Can we simply constrain our mapping to input documents
that are valid?

Cyril: maybe not - we could consider the WebVTT to TTML mapping
to do what a presentation processor would do when given an
invalid document
... The behaviour is well defined.

nigel: Let's take a break until 1545...

<zcorpan> re "nigel: Can VTT cue ids be duplicated?" - yes,
there is no requirement about uniqueness for cue identifiers.
however region identifiers need to be unique and don't allow
spaces

<zcorpan> hmm. sorry, looks like cue id requires uniqueness
also. i think that changed from a few years ago

<zcorpan> looks like the spec allows a cue id to be duplicated
as region id

Restarting...

Layout

andreas: We should start with the positioning of a <p> element
relative to a region.

courtney: The positioning is the piece that will map into
WebVTT. There are several region attributes in TTML that can
not go in WebVTT.

group: discussion of xml:lang on <region> and how it may get
inherited by content elements in TTML.
... discussion of style attributes on region - which must be
included?

courtney: Maybe we should go through each attribute.

<tmichel> I just joined Zakim using SIP. It works for me using
code ttml#

<zcorpan> i still get "this passcode is not valid"

glenn: I have a list of style attributes that apply to region.
... there are 12 in TTML1, and of those, 9 apply only to
region.
... Styles that apply both to region and other content elements
are backgroundColor, display and visibility.
... the ones that apply only to region in TTML1 are
displayAlign, extent, opacity, origin, overflow, padding,
showBackground, writingMode and zIndex.
... Note that at least one of these will be opened up to
content elements in TTML2, which is padding.
... We may also open up opacity to content elements, which
would allow the definition of opacity for an element and its
content as a collection.

andreas: Should we rule out the attributes that will change in
TTML2?

glenn: In fact opacity and padding are extended to all content
elements in TTML2.
... In both cases they aren't being removed from region, so
they are still applicable to region in ttml2.

courtney: So let's start with those. I believe that only 3 map
to a region in WebVTT: displayAlign, extent and origin.
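
A minimal sketch that records the lists just given, assuming
courtney's working belief that only displayAlign, extent and
origin map to a WebVTT region. The function name is hypothetical
and the split may change as the group works through each
attribute.

```python
from typing import Dict, Tuple

# TTML1 style attributes that apply only to region, as listed by glenn.
REGION_ONLY = ("displayAlign", "extent", "opacity", "origin", "overflow",
               "padding", "showBackground", "writingMode", "zIndex")
# TTML1 style attributes that apply to region and to other content elements.
SHARED_WITH_CONTENT = ("backgroundColor", "display", "visibility")
# Assumed mappable subset (courtney's belief at this point in the meeting).
MAPPABLE = ("displayAlign", "extent", "origin")


def partition_region_styles(
    styles: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a region's style attributes into (mappable, non-mappable)."""
    mapped = {k: v for k, v in styles.items() if k in MAPPABLE}
    unmapped = {k: v for k, v in styles.items()
                if k in REGION_ONLY + SHARED_WITH_CONTENT and k not in MAPPABLE}
    return mapped, unmapped
```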

andreas: And they can be mapped to properties of the region?

nigel: can't you do visibility by setting a style with opacity
zero?

courtney: you can do that but only on a cue, not on a region.

nigel: So another way to say the same thing is that there's no
region selector for styling?

courtney: Yes.

nigel: does the lack of zIndex imply that in WebVTT overlapping
regions are prohibited?

courtney: I don't think they're prohibited.

glenn: In TTML2 on this subject we have a request for
expressing z ordering for content to be able to handle 3D.

pal: That sounds similar but it's a different concept.

Loretta: I'm trying to see if the magic layout algorithm
applies to region as well.
... In general there's no notion of zIndex in WebVTT.

nigel: Is there an alternative way to achieve backgroundColor
on regions in WebVTT?

courtney: I don't think so, you can only do it for cues.

Cyril: adds non-mappable showBackground on region and zIndex to
the wiki.

courtney: overflow is always hidden for regions too.

glenn: Can wrapping be prevented so that overflow may be
relevant?
... Or what happens if you put too much content into a region
i.e. too many lines?
... It sounds like extent, origin and displayAlign are
currently expressible. The other 9 attributes seem to be
absent.
... display seems to be only worthwhile in conjunction with
animate.

nigel: It seems that the pseudo classes past and future have
some relationship to animate.

andreas: Wants to note that when we finish on the TTML
attributes we should go the other way round.

courtney: Let's do the non-style attributes on a region
first...
... You can put timing on a region in TTML - there's no
equivalent in WebVTT. attributes begin, end, dur, timeContainer

glenn: timeContainer is on regions for the processing of
animate elements that are children of region.

<Loretta> Does the cue-region pseudo-element let us apply CSS
styles to regions?

nigel: What's the action on that - to add it to the
non-mappable list?

Cyril: why have timing on regions?

glenn: The main reason is to provide timing for background
painting when no content is active, and also to specify the
timing for animate elements that are children of that region.

Cyril: I'm not sure it's not mappable - you can have empty cues
applied to a region, with the equivalent times of the TTML
region.
... Then that would activate the region in the same way - what
happens then is a later question, e.g. background painting.

glenn: Actually the timing of a region in TTML can be used to
temporally clip the flow of content into that region, so it's a
bit more than that.
... The question really is: do implementations use animate?

pal: I'm going to check the examples I have.
... another thing is how do you achieve dynamic positioning for
text? One way is to create one region per subtitle.
... In that case you may be tempted to put the timing on the
region.

Loretta: What are you trying to do here?

pal: In TTML1 there's no per-cue positioning, e.g. of each <p>.
One way to achieve that effect is to define one region per
subtitle and position each region
... individually.

andreas: From the layout perspective, there's a chance that
timings are put on region elements.

courtney: Shall we talk about the things that do map?
... On a WebVTT region the available settings are: width,
lines, region-anchor, viewport-anchor and scroll.
... I believe that extent in TTML maps to width and lines.
... We have the dimension issues for value units, e.g. if it's
in %age then it's okay but in pixels you need the size to do
the unit conversion.
... I think that displayAlign and origin in TTML, in
combination, map to a combination of regionAnchor and
viewportAnchor in WebVTT. The two specs have
... different ways to achieve the same thing. In WebVTT you
define a point within the video frame that maps to a point
within the region and they don't necessarily
... have to be the same thing. Origin + displayAlign allows you
to achieve the same effect.
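
A minimal sketch of the unit conversion courtney mentions,
assuming the size of the root container (or video frame) in
pixels is known. The function names are hypothetical, and em and
cell units are not handled because they need font or cell
resolution information.

```python
from typing import Tuple


def length_to_percent(length: str, reference_px: float) -> float:
    """Convert one TTML length ('640px' or '50%') to a percentage."""
    length = length.strip()
    if length.endswith("%"):
        return float(length[:-1])  # already relative, nothing to convert
    if length.endswith("px"):
        if reference_px <= 0:
            raise ValueError("reference size must be positive")
        return 100.0 * float(length[:-2]) / reference_px
    raise ValueError(f"unsupported unit in {length!r}")


def extent_to_percent(
    extent: str, root_width_px: float, root_height_px: float
) -> Tuple[float, float]:
    """Convert a tts:extent value such as '640px 72px' to (width %, height %)."""
    width, height = extent.split()
    return (length_to_percent(width, root_width_px),
            length_to_percent(height, root_height_px))
```

The width percentage could feed the WebVTT region width setting;
turning the height into a lines value would also need the line
height, which is discussed next.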

nigel: I thought there was some freedom in WebVTT about the
precise positioning, whereas in TTML there's no freedom of
movement - is that right?

Loretta: I'm still wading through the WebVTT algorithm.
Certainly for cues things get moved around to be as close as
possible to the stated location.

nigel: Yes, I'm not sure if that applies to regions as well as
cues.

Loretta: Yes, I think it may do - I'm still checking.

courtney: I think we should take that offline and research it.

andreas: I see a problem with the lines value - this defines
the height of the region. A line is defined by the height of
the first line of the cue, so a region does not
... always have the same height, as it depends on the first
line's size. This is a hard topic to research in general, how
this will resolve.

nigel: What's a concrete example of that problem?

andreas: In general the mapping from TTML to WebVTT may not be
possible because for each cue selected into the same region the
line height could be different,
... which will result in the region changing height.

Loretta: presumably WebVTT would expand the region to
accommodate the 5 lines and TTML would clip?

glenn: That would depend on lineHeight, fontSize and overflow
attributes in TTML.
... Right now we don't have an object-fitting algorithm such as
in CSS.

Loretta: Is there a way of setting font-relative dimensions?

glenn: yes, they can be defined in ems or cells. Ems would be
font-relative.

andreas: Why is region height important for WebVTT when no
background can be drawn?

Loretta: the height is important because that determines when
scrolling will start.

nigel: This seems very similar to the overflow attribute in
TTML - if some lines fall out of a region, which ones should an
implementation hide?

glenn: That's an implementation issue.

andreas: Can you explain the difference between the region
anchor point and the viewport anchor point?

courtney: the region anchor setting defines a point that is
fixed in location relative to the region, in case the region
has to grow.
... the viewport anchor setting defines where in the video the
region must overlap.
... It needs to be understood in relation with the
display-align setting.

Loretta: right, we need two points. It's like sticking a pin
through the region and in the viewport, and any changes to
region size keep that point invariant.

courtney: the region viewport anchor setting has two points
defined, the point within the video and the point within the
region.
... Then there's an additional point that is held constant when
the region is resized.

nigel: I think we need to understand the region mapping
algorithm from WebVTT - to origin and extent, and if that's a
single value or if there are multiple values,
... which in TTML we can do using set elements on the region.
... I think we need a strawman algorithm for this mapping so
that we can look at it.

andreas: I propose a gist on github for example.

courtney: I'll take it as an action item to come up with a
strawman proposal.

glenn: A moment ago I thought I heard something about origin
being in the centre in TTML - was that the question?

Courtney: yes, would you do that with displayAlign?

glenn: origin is always top left. You can use displayAlign to
define where lines are drawn from - in which direction. Right
now there's no anchor mechanism in TTML.
... Sean did come up with a change proposal, which I will have
to try to dig out.

courtney: It's always top left?

glenn: yes.

nigel: In scope terms, do we need to consider the placement of
text within regions, and also the placement of text not in
regions?

<glenn>
[27]https://www.w3.org/wiki/TTML/changeProposal015#region_anchor_points

[27] https://www.w3.org/wiki/TTML/changeProposal015#region_anchor_points

glenn: on the prior point, change proposal 15 has a section on
this.
... This is proposed for TTML2, but not implemented yet.

courtney: In WebVTT cues can have positioning - in TTML1 they
don't. So in the mapping to TTML we need to translate to a
region.

glenn: In TTML2 we are defining inline region definitions, so
div and p in TTML2 can take a child region element, including
extent and origin.

andreas: This is sometimes misused in operation!
... In mapping from WebVTT, where there is no region and snap
to lines is active, from the WebVTT spec it looks like margins
need to be added top and bottom. Is that correct?
... If the first line is not to be at the top and the last
line must not be at the bottom, that is.
... We need clarifications of this for accurate mapping.
... will add to the Issues list on the wiki

Summary of the day

nigel: We've looked at existing work from Andreas and Courtney,
thought about the processing models and document structures,
... identified that style attributes should mostly transfer
straightforwardly, thought about metadata a bit, and spent a
while on layout.
... Tomorrow we have some time set aside for testing, and I
suggest we combine the test case generation with the mapping
algorithms.
... Thank you everyone, see you tomorrow.

adjourns meeting.

Summary of Action Items

[End of minutes]
__________________________________________________________

Minutes formatted by David Booth's [28]scribe.perl version
1.138 ([29]CVS log)
$Date: 2014-09-16 15:07:16 $
__________________________________________________________

[28] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
[29] http://dev.w3.org/cvsweb/2002/scribe/

Received on Wednesday, 17 September 2014 02:09:21 UTC