Minutes of TT WG Meeting on February 4-6, 2004, Cambridge, MA

Attendees

  Glenn Adams (XFSI, Chair, Scribe) [GA]
  Geoff Freed (WGBH/NCAM) [GF]
  Sean Hayes (MSFT) [SH] - phone only (Days 1-3)
  Erik Hodge (REAL) [EH] - phone only (Days 1-2)
  Dave Kirby (BBC) [DK]
  Thierry Michel (W3C) [TM]
  Dave Singer (Apple) [DS]

Regrets

  Mike Dolan (Invited Expert)

************************************************************************
Agenda
************************************************************************

Day 1 (Wednesday, February 4, 2004)

  09:00 - 10:30 Agenda Planning, Work Group Schedule, Requirements Review

  10:30 - 11:00 Break

  11:00 - 12:30 TT-AF-1-0-REQ Comment Review, Finalization

  12:30 - 13:30 Lunch

  13:30 - 15:00 Profile Discussion

  15:30 - 16:00 Break

  16:00 - 17:30 Profile Discussion

Day 2 (Thursday, February 5, 2004)

  09:00 - 10:30 Example Walk Throughs

  10:30 - 11:00 Break

  11:00 - 12:30 Timing Vocabulary

  12:30 - 13:30 Lunch

  13:30 - 15:00 Content Vocabulary

  15:30 - 16:00 Break

  16:00 - 17:30 Animation Vocabulary

Day 3 (Friday, February 6, 2004)

  09:00 - 10:30 Style Vocabulary

  10:30 - 11:00 Break

  11:00 - 12:30 Style Vocabulary

  12:30 - 13:30 Lunch

  13:30 - 15:00 Descriptive Semantic Vocabulary

  15:30 - 16:00 Break

  16:00 - 17:30 Metadata Vocabulary

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Day 1 (Wednesday, February 4, 2004)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  09:00 - 10:30 Agenda Planning, Work Group Schedule, Requirements Review

[GA] Proposes agenda. Minor changes made. Agenda accepted. [DS] will
send to reflector for [SH], [EH], [MD] to see.

[SH] Write spec in modularized fashion, then work out profiles after.

[DK] Start to build up matrix of what is in profile(s) to select areas
of requirements.

REQUIREMENTS REVIEW

  Review Bert Bos Comments

    * 1.2 System model

    How about a model of the timed text itself? processing, timing,
    structure.

[GA] Appears to be asking for figure(s) that show structure. Unclear
about meaning of "processing/timing".

Action: [GA] Add figure showing logical structure anticipated
by requirements.

[GA] Try to get BB on telecon and/or IRC.

[GA] Describes at length on the whiteboard the distinction between
logical flowed content (body), presentation flowed content (flows),
and non-flowed content (areas).

[GA] Describes analogy of these separate levels as follows:

(1) consider news articles for day to be a logical flowed
content body;

(2) some process maps this set of articles (body) to a set of
flows by selecting articles, associating those articles with
a flow, ordering those articles within a flow, associating
style parameters with those articles within flow, and associating
presentation timing with those articles;

(3) some process maps the articles within a flow to a region,
performing line layout, and associating detailed timing
with the resulting output, which takes the form of "present
glyphs G0, G1, ..., at positions P0, P1, ..., at times T0,
T1, ...";

The result of step (2) is presentation flowed content "flows"
structure.

The result of step (3) is a non-flowed content "areas" structure.

    * S0000

    What does captioning need, precisely? Color, fonts, font size,
    indents, bullets, images, positioning, timing, font styles,
    underlining, blinking, text shadow, background, transparency,
    sections/groups, repeating blocks, tabulation, right alignment,
    centering, vertical text, real-time authoring...

[GA] This will become clear from evaluating the expected creation of
a caption/subtitle profile in the TT-AF-1-0. See response from John
Birch.

    * S001

    Ditto

[GA] Expect that this will be covered by caption/subtitle profile.

Side Bar Resolution: Subtitling and Captioning sometimes express
different intentions; however, the mechanisms for both are either
identical or close enough that a distinction is not warranted. This
is relevant to the point of whether to define separate profiles.

Question: What to call this profile? Candidates:

* Caption and Subtitle Profile [3 YES, 3 NSO]
* Subtitle Profile [0 YES, 6 NSO]
* Caption Profile [0 YES, 6 NSO]
* Not A Profile [1 YES, 5 NSO]

    * S002

    Probably needs speech generation support, such as CSS audio
    properties or another transformation to SSML.

[GA] Yes; no action needed; already covered in spec.

Open Issue: Determine whether to retain CSS audio properties in
TT-AF-1-0-REQ. If not, then what is the alternative: (1) some use of
SSML? (2) leave to future work?

Resolution: Change "shall" to "may be" in R305 and R504 of
TT-AF-1-0-REQ.

    * S003

    Is this also intended to be usable to do "marquee" in HTML
    (embedded in an OBJECT or IMG element)?

[GA] Referring to a TT document from an OBJECT element in HTML in
order to do a marquee presentation of text is certainly possible,
although this wasn't considered explicitly when we constructed this
use case scenario.  What we were thinking of was having a real
Marquee Sign on the front of your building (etc.) that uses TT to
author content for it. In this context TT is considered generic since
it does not accompany any other media format and contains its own
internal time base.

    * R100

    What is meant by "authored using XSL"? Does that mean the TT
    AF can be the result of a transformation from some other XML
    format? In that case, why insist on XSL, why not Perl, e.g.?

[GA] This means we are using XMLSpec to author our spec. It doesn't
mean anything about what goes in our spec (content wise).

    * R101 - R103

    One hopes that the TT AF is simple enough to not need modules or
    optional parts...

[GA] In order to accommodate different profiles, we expect modules to
provide abstraction for grouping functionality. We do expect there to
be some profile that is semantically simpler than some other profile.

    * R106

    This seems to say that the TT AF should not contain functions that
    serve no purpose, but it says it in a rather verbose way. Unless
    I misunderstand, this seems rather obvious...

[GA] Yes, but try doing a Dedekind Cut in as few words.

    * R110

    What is an "idealized" streamable format?

[GA] It is a non-concrete, i.e., abstract streamable format. We are
trying to not commit to transformability into any particular
streaming format, and suggest that there is such a thing as an
abstract streamable format that provides a representative model for
all streamable formats.

    * R112

    The task of the TT WG is to define a TT AF (and probably a TT
    format), not to define the editor to write that format with.
    (Unless you make a case why you need to do this, and probably
    update the group's charter as well.)

Resolution: Change to read as follows:

"The TT AF specification(s) shall be defined in such a manner as to
permit the construction of a TT AS that satisfies all applicable
aspects of [ATAG 1.0]."

    * R204

    This requirement only restricts the element and attribute
    names of the TT AF to ASCII, since R100 (use of XML) already
    ensured that all text content can be written in ASCII. So why
    not say explicitly that this item is about element and attribute
    names?

Response: At this point in requirements, we have not yet specified
that we require use of XML; that comes later in R290, which we have
characterized as a solution space requirement. R204 on the other hand
is a true functional requirement independent of R290. At worst,
leaving in R204 is slightly unnecessary.

    * R209

    This makes sense, but some motivation would be good. How about
    headings and lists?

[GA] Headings are merely a semantic use of a paragraph or span. Lists
are also semantic uses of paragraph. We don't need to distinguish at
logical flowed content level in terms of the specific element type;
rather, we posit the use of a role/class attribute to provide this
additional semantic layer.

Note: We need to resolve the question of whether to support both role
and class, or only one of them and, if so, which one.

    * R217, R218

    "Embedded" means "in the same file"? Such as a data URL? Or is
    it an external image intended to be displayed simultaneously, while
    "non-embedded" means "intended as hyperlink"?

[GA] By embedded we mean something like data: URI usage.

[GA] By non-embedded, we mean referencing an external resource/file.

    If the former, is it also permitted to have the TT AF and the
    image together in a file of a third type, such as a "jar" file?

Yes, however, that would be considered a non-embedded case, since the
data is not embedded in the TT format directly. Otherwise, this is
out of scope.

    If so, is it OK if that third format is a generic archive format,
    or should it have a MIME type that indicates that this is an
    archive used as TT AF (though structurally equal to a generic
    format)?

Note that Java JDK1.3.X provides a "jar:" URI scheme that can be used
to refer to JAR (or ZIP) item entries.


    * R219, R220

    Not by inventing a new font format, I hope...

Correct.

    Any idea yet whether there will be a one or more required
    font formats (TrueType, SVG) or is it OK when a UA supports at
    least one font format, even if it is the only UA to know that
    format?

No, because we don't know anything about UA.

    * R221

    The sentence is hard to read or maybe even ambiguous. What does
    "appropriate domain of discourse" mean? Is it a modifier of "text
    content" or of "descriptive information"? Is the idea that
    you can embed a TEI file in the TT AF?

Domain of discourse means the domain in which descriptive information
is scoped.

See note. NO, it isn't the idea that we would embed TEI files;
however, we may choose to adopt some TEI vocabulary if deemed
important and useful.

    * R222

    This sounds rather ambitious. I thought TT was a mono-media
    component, to be used, e.g., inside SMIL, not a SMIL-replacement.

This usage, as the note indicates, is intended to be simply for
aural cueing.  It is not intended for general audio usage or
synchronizing audio with timed text. Furthermore, it would be
difficult to use SMIL to refer to a timed text object with internal
presentation time line semantics that require performing audio
cues/leitmotifs using SMIL synchronized audio.

Consider using timed text to present a visual musical score of a
Wohltemperierteklavier with accompanying musical cues. Now you get
the idea. PDQ Bach anyone?

Action: [GA] Add note to R217 and R219 that shows use of data: URI
scheme.
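
As one possible illustration for that note, a minimal Python sketch of
the embedded case using the data: URI scheme follows. The helper names
and the GIF payload are hypothetical; nothing here is TT AF syntax.

```python
import base64
from typing import Tuple


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Embed a binary payload directly in a document as a data: URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> Tuple[str, bytes]:
    """Recover the MIME type and payload from a base64 data: URI."""
    header, encoded = uri.split(",", 1)
    if not (header.startswith("data:") and header.endswith(";base64")):
        raise ValueError("not a base64 data: URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(encoded)


# Hypothetical payload standing in for an embedded image.
uri = to_data_uri(b"GIF89a", "image/gif")
```

A non-embedded reference would instead carry an ordinary URI to an
external file in the same attribute position.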

    * R223

    What does "non-embedded" mean? Does it mean that there is no link
    to the audio in the TT AF itself, but the link is somehow somewhere
    else (such as in a style sheet)? Or, which is maybe the same thing,
    that the TT AF only expresses that there is to be audio of a certain
    kind (e.g., via high-level keywords, such as "alert," "warning" and
    "error"), without pointing to actual sound files?

Non-embedded means not contained in file. Use of a "src" or
"xlink:href" will reference external file. The link would be in the
TTAF content.

    * R292, R293

    No objection to using XLink, XML Schemas or Relax NG, but why is
    it a *requirement* to use them? Why not just an intention? What
    breaks if you use something else?

These are solution space requirements that we agreed to adopt in
principle because we have some biases about where we want this to go.

    * R300

    R301 seems to be a more precise statement of R300. It seems that
    R300 can be removed.

R300 addresses basic functional need, while R301 addresses forms that
this function may take. The former is independent of the latter.

    * R301

    Why do you need attributes on elements for the TT AF? Attributes
    seem redundant, when you also have external styles and even
    physically embedded styles. There is nothing you can do with
    attributes that you cannot also do with style sheets, but style
    sheets can do more.

When using the presentation flowed vocabulary, all style information
has been mapped from stylesheets to content; it is no longer
efficient at that point to maintain separation of style and content.
This is a similar model to that found with XSL, which also implies an
XSLT mapping process to associate style with content.

Open Issue: Whether and what constraints to define regarding
different forms of style usage at different levels of abstraction:
LF, PF, NF.

<head>
  <stylesheet units="cell">
    <style id="s1">
      <prop name="color">yellow</prop>
    </style>
    <style id="s2">
      <prop name="color">green</prop>
    </style>
    <style id="s3">
      <prop name="color">red</prop>
    </style>
  </stylesheet>
</head>
...
<block styleref="s1">joe says</block>
<!-- referencing stylesheet -->
<block styleref="s2">jane <inline styleref="s3">says</inline></block>
<!-- inline style -->
<block styleref="s2">jane <inline style="color:red">says</inline></block>
<!-- style attribute -->
<block styleref="s2">jane <inline color="red">says</inline></block>
...

CSS style usage

<style id="s3" select=".red">
  <prop name="color">red</prop>
</style>
...
<!-- referencing stylesheet -->
<block styleref="s2">jane <inline class="red">says</inline></block>

Open Issue: Need to better define what forms of style references and
style applications occur at each of 3 levels: body, flows, areas.

Open Issue: Need to determine whether to continue to use a style
application process (style refs content), albeit simplified, at PF
(flows) and NF (areas) levels; or, to use a style selection process
(content refs style elts).

Proposed Rewrite of R301

The TT AF shall be capable of specifying inline styling by means of
one or more of (1) distinct attributes, (2) a generic attribute,
e.g., style, and (3) one or more inline stylesheets. Depending upon
the context of use, some constraint(s) may apply to which of
these means are available for use in that context.

    The two reasons I can think of for allowing attributes are
    (1) ease of hand authoring for quick & dirty projects (a
    rather weak argument) and;
    (2) ease of processing, since no memory is required to store
    style sheets (but that doesn't hold here, because style sheets
    have to be supported anyway).

    Maybe this was intended as a requirement for the TT DF instead?

See above.

    * R305

    It might be good to refer to SSML and the upcoming CSS speech
    module, since the aural properties of CSS2 will be deprecated
    (in CSS 2.1) and there will be a new set of properties in CSS3,
    compatible with SSML. They should be very similar to the old ones,
    but not exactly the same.

Agreed.

    * R307

    Not sure if I interpret this correctly. Is this like scrolling
    text, like a "marquee"?

Not exactly. This is intended for use in case where authoring system
has insufficient information to make line break decisions and
scrolling presentation of lines is contingent upon available space
and available time. These properties provide a higher level
abstraction of indicating intended presentation behavior in such
cases.

    * R390

    See R301. It seems to me that hard-coded styles should be avoided
    where possible and only allowed in final-form formats, like a TT DF.
    (The principle of separation of structure and style is a relative
    principle, but it seems to me that it should hold for the TT AF.)

The TT AF supports final form content through the use of non-flowed
content vocabulary. It is considered reasonable to use attributed
styles at this low level of abstraction.

Resolution: In R390, change "shall be" to "may be".

    * R391

    It's a good principle to use existing names and definitions where
    possible, but don't deprive yourself of the possibility to use names
    that fit better with the particular model or syntax that you
    develop.

Agreed.

PROFILE DISCUSSION

Wish List

(1) Authorial Areas

    * Caption and Subtitle
    * Ticker Tape
    * Digital Talking Books

(2) Distribution Isomorphism

    * EIA 608/708
    * 3GPP / MPEG-4 Part 17
    * Enhanced Teletext

[DS] Profile namespace must be extensible, allowing non-W3C profiles.

Possible Questions:

1. [GA] should profiles be hierarchical in any way;
2. [SH] what dimensions along which to consider profiles: (a) visual
   vs spoken; (b) styling complexity;
3. [DS] profiles need not nest
4. [GA] how many profiles to define? [SH] at least one

What Would Not Be Required for EIA 608/708, 3GPP?

(0) extrinsic content not required (for reuse as basic DF)
(1) LF vocab not required for DFX
(2) NF vocab not required
(3) audio, image, font embedding or non-embedding not required
(5) conditional content not required
(6) aural styles not required
(7) exclusive time container not required
(8) fade transition not required

What Is Required for EIA 608/708, 3GPP?

(0) extrinsic content (for author usage only)
(1) LF vocab may be required for C/S authoring, but not DFX
(2) PF vocab is required
(3) hyperlinking required (for 3GPP)
(4) a subset of visual style params are required
(5) seq and par timing containers required
(6) scroll, highlight, subset of visual style animation required
(7) temporal fill mode (?)

Resolution: Give priority to characterizing a Profile, nominally
known as the "C/S" profile, that supports the unified functions needed
for (1) C/S authoring and (2) 708/3GPP DFX.

Guideline: Define modularity well. Use omniscience as much as possible.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Day 2 (Thursday, February 5, 2004)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

EXAMPLE WALK THROUGHS

3GPP Timing Model

1. Every sample stands alone, contains intrinsic text and style
overrides, and well defined time duration against a time line.

2. A sequence of text events with explicit duration.

3. In SMIL terms, a single 3GPP file is a single time sequence
container, where each item has an explicit duration, with implied
begin time. Requires padding for empty slots; i.e., a sample with
duration but no content.

4. In order to handle overlapping time intervals for different
text regions, it is necessary to use different text tracks (in same
file), each of which is independent but shares a synchronized time
line with the others. In this sense, multiple text tracks are like a
SMIL par container.

5. 1 text track : 1 text region : 1 text box : 1 text string (= one
sample);
text region's position and size are static; text boxes can be
moved.

6. Track has ticks per second parameter, then times are in ticks;
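
A minimal Python sketch of items 3 and 6 above, assuming one text
track whose samples are given as durations in ticks. The function name
and the sample values are hypothetical and are not taken from the 3GPP
specification.

```python
from fractions import Fraction


def sample_intervals(durations_in_ticks, ticks_per_second):
    """Return (begin, end) in seconds for each sample of one text track.

    Samples play back to back, so each begin time is implied by the sum
    of the preceding durations, as in a SMIL seq container.
    """
    intervals = []
    begin = 0
    for dur in durations_in_ticks:
        end = begin + dur
        intervals.append((Fraction(begin, ticks_per_second),
                          Fraction(end, ticks_per_second)))
        begin = end
    return intervals


# Hypothetical track at 1000 ticks per second. The second sample is an
# empty padding sample that fills a 500-tick gap between two captions.
samples = [("Sample 1", 2000), ("", 500), ("Sample 2", 1500)]
intervals = sample_intervals([dur for _, dur in samples], 1000)
```

Because begin times are never stored, a gap in presentation can only
be expressed by the padding sample.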

3GPP Style Model

1. Default styles are located in track header, with
multiple default styles possible, each having a numeric index.
In track header, can specify background color for either region
or box. Whichever one is not specified is considered transparent.

2. Style overrides on range of characters. Style override ranges
cannot overlap. For a given character, there can be at most one
applicable style override range that includes that character.
Each sample description includes reference to some default style
using a numeric index, then can add style overrides for multiple
ranges.

3. Default styles and overrides permit specifying the following
styles: font reference (defined in track header), style flags
(bold, italic, underline), font size, and text color.

4. Font definition includes, for each def, a list of font face
names separated by comma, and a numeric id. Face names include
logical names serif, sans-serif, monospace; also font names
can be terminal font face names.

5. Basic format defines line breaking as manual (using hard line
break - LINE SEP, PARA SEP, and LINE FEED; see Unicode TR #13).
Release 6 adds soft wrap flag to style overrides (per sample)
that allows terminal to soft wrap; otherwise, must not soft wrap.
Clipped to text box in case of overflow.

6. On a per sample basis, can specify scroll in delay, scroll out
delay, scroll direction (absolute). In case of multiple lines, then
all lines scroll simultaneously.

7. On a per override basis, can specify hilite and blink. Hilite
color can be optionally specified; used for background color. Hilite
and blinking are overrides on top of normal style overrides; but
can't do hilite and blink at same time. Terminals are not required to
support blinking, and all details are terminal dependent.  Only one
hilite color can be specified per sample. However, multiple text
foreground colors can apply (via overrides) within a sample.

8. Dynamic hiliting (karaoke style), for a given hilite, specifies
times and ranges of characters. Cannot use dynamic and static
hiliting on same text. However, within sample, can have both dynamic
and static hiliting at once. Uses same hilite color value as static
color, or, if none specified, then inverts colors. Can only have one
dynamic hiliting per sample. Note that if continuous karaoke is used,
then this may or may not prevent using static hiliting.

9. Hyperlink - url and alt string; used to invoke external content,
and not to seek within text track;
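
The non-overlap rule in item 2 above can be checked mechanically. A
minimal Python sketch, assuming override ranges are half-open
(start, end) character offsets; that representation and the function
name are assumptions, not the 3GPP encoding.

```python
def overrides_are_valid(ranges, text_length):
    """Check that (start, end) character ranges are in bounds and disjoint.

    Ranges are treated as half-open: start is included, end is excluded.
    """
    previous_end = 0
    for start, end in sorted(ranges):
        # Each range must be non-empty and lie within the sample text.
        if not (0 <= start < end <= text_length):
            return False
        # A range starting before the previous one ends would overlap it.
        if start < previous_end:
            return False
        previous_end = end
    return True
```

With disjoint ranges, at most one override applies to any character,
which is the property stated in item 2.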

Side Bar On Style Conventions

[GF] Uses italics within sample for emphasis and sound effects.

[DK] Typically use different colors for indicating different speakers;
otherwise uses dash.

[GF] In realtime captioning, uses ">>" for new speaker intro.

[DK] Also has guidelines to use different background color on text
ranges for electronic voices, electronic sounds.

[GF] For sound effects, use italics in parens ([DK] or square brackets).

Possible 3GPP Representation in TT-AF

<head>
  <defs>
    <!-- font definitions -->
    <font id="f1">
      <prop name="faceName" value="sans-serif"/>
    </font>
    <!-- style definitions -->
    <style id="s1">
      <!-- following can't be overridden in text samples -->
      <prop name="textAlign" value="left"/> <!-- left, center, right -->
      <prop name="displayAlign" value="top"/> <!-- top, center, bottom -->
      <prop name="scroll" value="none"/> <!-- none, in, out, both -->
      <prop name="scrollAxis" value=""/> <!-- lr, rl, tb, bt -->
      <prop name="karaoke" value="continuous"/> <!-- discrete, continuous -->
      <prop name="writingMode" value="auto"/> <!-- auto, lr, rl, tb -->
      <!-- following can be overridden in text samples -->
      <prop name="font" value="f1"/>
      <prop name="fontSize" value="12"/> <!-- nominal pixels -->
      <prop name="fontStyle" value="italic"/> <!-- italic or normal -->
      <prop name="fontWeight" value="bold"/> <!-- bold or normal -->
      <prop name="color" value="#C0000080"/> <!-- RGBA -->
      <prop name="textDecoration" value="underline"/> <!-- underline or normal -->
      <prop name="wrapMode" value="false"/>
    </style>
    <style id="s2">
      <prop name="fontWeight" value="normal"/> <!-- bold or normal -->
      <prop name="color" value="#FFFF00FF"/> <!-- opaque yellow -->
    </style>
    <!-- region definitions -->
    <!-- outer region #1; always static position and size -->
    <region id="r1">
      <prop name="origin" value="10,20"/> <!-- top, right -->
      <prop name="extent" value="100,200"/> <!-- width, height -->
      <prop name="bgColor" value="#00000000"/> <!-- transparent -->
      <prop name="zOrder" value="1"/> <!-- z-index -->
    </region>
    <!-- inner text region; position and size can change over time -->
    <!-- this corresponds to 3GPP text box -->
    <region id="r2">
      <prop name="parent" value="r1"/> <!-- parent region -->
      <prop name="origin" value="10,20"/> <!-- top, right -->
      <prop name="extent" value="100,200"/> <!-- width, height -->
      <prop name="bgColor" value="#00000080"/> <!-- half transparent black -->
      <prop name="defaultStyle" value="s1"/>
    </region>
    <!-- another text box (region) -->
    <region id="r3">
      <prop name="parent" value="r1"/> <!-- parent region -->
      <prop name="origin" value="10,200"/> <!-- top, right -->
      <prop name="extent" value="100,200"/> <!-- width, height -->
      <prop name="bgColor" value="#00000080"/> <!-- half transparent black -->
      <prop name="defaultStyle" value="s1"/>
    </region>
    <!-- outer region #2; always static position and size -->
    <region id="r4">
      <prop name="origin" value="100,200"/> <!-- top, right -->
      <prop name="extent" value="100,200"/> <!-- width, height -->
      <prop name="bgColor" value="#00000000"/> <!-- transparent -->
      <prop name="zOrder" value="2"/> <!-- z-index -->
    </region>
    <!-- another text box (region) used in second top level region -->
    <region id="r5">
      <prop name="parent" value="r4"/> <!-- parent region -->
      <prop name="origin" value="200,10"/> <!-- top, right -->
      <prop name="extent" value="100,200"/> <!-- width, height -->
      <prop name="bgColor" value="#00000080"/> <!-- half transparent black -->
      <prop name="defaultStyle" value="s1"/>
    </region>
  </defs>
  <timing>
    <par>
      <seq>
        <cue select="f1b1" begin="10" dur="10"/>
        <cue select="f1b2" begin="30" dur="10"/>
        <cue select="f1b3" begin="40" dur="30"/>
      </seq>
      <seq>
        <cue select="f2b1" begin="10" dur="10"/>
        <cue select="f2b2" begin="30" dur="10"/>
      </seq>
    </par>
  </timing>
</head>
<flows>
  <!-- text track 1 -->
  <flow defaultRegion="r2"> <!-- timeContainer="seq" by default -->
    <block id="f1b1">Sample 1</block>
    <block id="f1b2">Sample 2<br/> with two lines</block>
    <block id="f1b3" region="r3">
      Sample 3 with <inline style="s2">overrides</inline>
    </block>
  </flow>
  <!-- text track 2 -->
  <flow defaultRegion="r5"> <!-- timeContainer="seq" by default -->
    <block id="f2b1">Sample 1</block>
    <block id="f2b2">
      Sample 2 with <a xlink:href="http://acme.com/" alt="Pinnacle">link</a>.
    </block>
  </flow>
</flows>

[SH] Since text has no intrinsic duration, we need to require that
end or dur is always known.

[GA] But should still be possible to not specify end/dur on leaf
content element, provided that some parent scope has explicit end/dur.

[EH] This would apply to par.

[DS] Or to last element in seq.

CURRENT CONSENSUS

1. Time model at LF is app specific; need to consider details of how
to represent, but using app specific semantics.

2. App specific transform of time from LF to PF.

3. Time model at PF is SMIL based (a subset), with par, seq elts
and begin, end, dur attrs, and timeContainer attr (at minimum).

SEPARATION OF TIME MODEL (TM)

Glossary

LF logical flow (body)
PF presentation flow (flows)
NF non-flowed (areas)

CM content model
TM time model
LM layout model

[GA] Don't require separation of TM and LM at PF layer.

[GA] However, permit it, if desired by author.

LF - TM(L) selects CM, LM selects CM

PF - TM(P) selects CM, LM merged with CM (but uses ext refs for
efficiency)

NF - TM(P) merged with CM, LM merged with CM (but uses ext refs for
efficiency)

[SH] At PF layer, could continue to use model of LM selecting CM.

PF Layer Timing Model - Support structural containment timing as
found in SMIL and, if author wishes, can represent absolute times for
everything with no structural timing containment.

WG: Agrees.

NF Layer Timing Model - Support only absolute times for everything
with no structural timing containment. In other words, timing has been
flattened at this layer.

WG: Agrees.
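
A rough Python sketch of flattening PF structural timing into NF
absolute times, assuming a simplified subset in which a child of seq
begins at an offset from the end of its previous sibling and a child
of par begins at an offset from the start of its container. The
cue/select element shape follows the example above; the function name
and the time values are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(node, start, out):
    """Append absolute (select, begin, end) triples to out; return node's end."""
    begin = start + float(node.get("begin", "0"))
    if node.tag == "cue":
        end = begin + float(node.get("dur"))
        out.append((node.get("select"), begin, end))
        return end
    end = begin
    cursor = begin
    for child in node:
        child_end = flatten(child, cursor, out)
        end = max(end, child_end)
        if node.tag == "seq":
            # In a seq, the next child is timed from the end of this one.
            cursor = child_end
    return end


TIMING = """
<par>
  <seq>
    <cue select="f1b1" begin="10" dur="10"/>
    <cue select="f1b2" begin="10" dur="10"/>
  </seq>
  <seq>
    <cue select="f2b1" begin="5" dur="10"/>
  </seq>
</par>
"""

flat = []
total = flatten(ET.fromstring(TIMING.strip()), 0.0, flat)
```

The resulting list carries no containment at all, which matches the
NF layer description: every entry has its own absolute begin and end.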

Where To Specify Timing

LF

PF - either (1) outside content, with timing specification selecting
content or (2) directly on content elements, but only in the case of
flattened time model.

NF - directly on content elements (i.e., areas and glyphs) only.

ATTEMPT TO WORK OUT EXAMPLE AT LF USING BBC SCRIPT

Example #1 - Times are Actual Word Times

<body role="script">
  <div role="scene">
    <p role="title">...</p>
    <p role="speaker" who="joe">
      <aml:confidence level="30"/>
      <span id="w0001" aml:begin="1.150" aml:end="1.340">woah</span>
      <span id="w0002" aml:begin="1.370" aml:end="1.540">Dad!</span>
      <span id="w0003">...</span>
    </p>
  </div>
</body>

Example #2 - Times are Paragraph Display Times

<body role="script">
  <div role="scene">
    <p role="title">...</p>
    <p role="speaker" who="joe" begin="1.150" end="...">
      whoa Dad! ...
    </p>
  </div>
</body>

Potential Problem

<tt xmlns:aml="http://www.bbc.co.uk/AML/">
<p aml:who="joe">...</p>
</tt>

{http://www.bbc.co.uk/AML/}who

p[who=joe]

p[{http://www.bbc.co.uk/AML/}who=joe]

p[aml:who=joe]

@namespace aml url(http://www.bbc.co.uk/AML/);
p[aml|who=joe]

[SH] CSS3 appears to add syntax to handle namespace qualified elements
and attributes.
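
For comparison, a short Python sketch showing how one existing XML API
handles this: the standard library's ElementTree addresses the
qualified attribute in the {namespace}local form listed above, and the
unqualified name does not match. The second p element is a
hypothetical addition for contrast.

```python
import xml.etree.ElementTree as ET

AML = "http://www.bbc.co.uk/AML/"
DOC = (
    f'<tt xmlns:aml="{AML}">'
    '<p aml:who="joe">...</p>'
    '<p who="jane">...</p>'
    '</tt>'
)

root = ET.fromstring(DOC)

# The qualified attribute is looked up as {namespace-uri}local-name.
qualified = [p for p in root.iter("p") if p.get(f"{{{AML}}}who") == "joe"]

# The plain name "who" does not see the aml:who attribute.
unqualified = [p for p in root.iter("p") if p.get("who") == "joe"]
```

A selector syntax for TT would need an equivalent way to bind the
prefix to the namespace URI before matching.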

[SH] Thinks there should be a concrete time model that can bridge
between LF and PF layers.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Day 3 (Friday, February 6, 2004)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SOME POSSIBLE DISPOSITIONS OF STYLE APPROACH AT PF LAYER

1. use applicative mode only
2. use referential mode only
3. support either, but not simultaneously

Applicative Mode: CSS Style, which entails (1) style rules with
relatively complex selectors; (2) application of rules via cascading
logic to elements in DOM structure; (3) after application,
inheritance via DOM structure.

Referential Mode: XSL FO Style, which entails (1) already having
performed applicative style in order to produce computed styles on
each element; (2) aggregation of groups (sets) of style properties
into separate set of style declarations; (3) backward referencing to
those groups from elements. Note that this referencing is merely a
representation optimization, since it is possible to specify all
styles on each element as well.

Distinction #1: Applicative Mode potentially requires all content
from the beginning of content to be in memory (as DOM structure) in
order to apply selection logic. This means that there is no random
access when using Applicative Mode; i.e., you have to store all
content from the beginning before applying styles and, if rules
allow selection based on right-hand subtree content, you may have to
hold the entire document in memory.

Distinction #2: Referential Mode may permit only one set of styles
if elements can refer to only one style.

Straw Poll: No clear consensus on options. Don't want to rule
out options at this point.

Context for Option Usage:

Option #1 - CSA (Caption/Subtitle Authoring)
Option #2 - DFX (Distribution Format Interchange/Transform)

Working Assumption: Adopt Option #3 (either applicative or
referential, but not simultaneously; with the context in which
each mode is used made explicit). Move forward with developing
examples and spec. If we can convince ourselves in future
to drop one mode, then we will fall back to Option #1 or
Option #2.

WG: Agreed.

Note the above applies only at PF layer.

DISPOSITION OF STYLE APPROACH AT LF LAYER

Working Assumption: Don't try to specify TT defined style
vocab/usage at LF layer unless and until we can convince
ourselves there is something worth standardizing; otherwise,
styling information and mode of application is in app
domain only.

WG: Agreed.

[SH] Describes worked out AML example rewritten in quasi-TT.

... lengthy discussion ...

[GA] We discussed having within a single TT instance document
all of: LF, PF, and NF sub-trees; should we permit more than
one PF and more than one NF sub-tree?

WG: Yes; allow more than one PF and more than one NF.

[SH] Will need mechanism that links NF to PF that generated NF.

Action: [TM] Find out if xpointer() scheme WD [1] is still being
progressed forward.

[1] http://www.w3.org/TR/xptr-xpointer/#document-order-notation

[GA] Notes that [SH] proposal depends upon having mechanism
to select words or ranges of characters.

[SH] But can work without this if every potential boundary
is spanned by <inline>...</inline>.

Proposed Resolution: For style property elements, specify
value of property using "value" attribute. Consider providing
future extensibility mechanism that permits use of element
content to provide value for style properties whose value
is employed as (human consumable) content.

Option #1
<prop name="x" value="y"/>

Option #2
<prop name="alt">Alternative Text</prop>

Not Well Defined
<prop name="x" value="y">Another Value</prop>

Option #3
<prop>x:y</prop>

Option #4
<prop css:wrap-mode="yes"/>

Option #1A
<prop name="css:wrap-mode" value="yes"/>

Option #5 (combination of #1 and #1A)
<prop css:foo="bar"/>
<prop name="foo" value="bar"/>

Variant of #1A
<prop style:foo="bar"
xmlns:style="http://www.w3.org/2004/tt-af-1-0#style"/>

Consensus: Go with Variant of #1A approach.

POSSIBLE SYNTAX FOR APPLICATIVE MODE

<styleSheet select="media(WorldTeleText)">
  <!-- an applicative only style rule; no id, can't reference -->
  <style select="div[role='fred']">
    <prop style:color="red"/>
  </style>
  <!-- a referential only style rule; no select, can't apply -->
  <style id="s1">
    <prop style:color="green"/>
  </style>
  <!-- an applicative or referential style; has select and id -->
  <style id="s2" select="div[role='barney']">
    <prop style:color="blue"/>
  </style>
  <!-- a hierarchical reference chain -->
  <style id="s3" use="s2">
    <prop style:fontSize="72px"/>
  </style>
</styleSheet>

Requirements of Syntax

(1) select stylesheet based on media predicate and possibly
other predicates;

(2) select rule to apply to specific content (alternate view:
select content to which rule is applied);

(3) specify set of style properties that are associated with
rule;

(4) specify set of style properties in a group to be referenced
via referential mode;

(5) permit hierarchical chaining of referencing styles, i.e.,
allow something like "use" attribute in some style to refer
to another style in order to include other style's properties;

(6) consider use of XInclude as substitute for CSS Inclusion
mechanism;
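
Requirement (5) can be illustrated with a minimal sketch in Python,
assuming that a style's own properties override those pulled in via
"use" and that "use" may list several ids separated by whitespace
(neither point was decided in the meeting). The function name
resolve_style is hypothetical.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "http://www.w3.org/2004/tt-af-1-0#style"

# Fragment of the style sheet from the minutes (select attributes kept as written).
SHEET = """
<styleSheet xmlns:style="http://www.w3.org/2004/tt-af-1-0#style">
  <style id="s2" select="div[role='barney']">
    <prop style:color="blue"/>
  </style>
  <style id="s3" use="s2">
    <prop style:fontSize="72px"/>
  </style>
</styleSheet>
"""

def resolve_style(sheet, style_id, _path=()):
    """Return the effective properties of a style, following 'use' chains."""
    if style_id in _path:
        raise ValueError(f"circular 'use' chain at {style_id!r}")
    style = next((s for s in sheet.iter("style") if s.get("id") == style_id), None)
    if style is None:
        raise KeyError(style_id)
    props = {}
    # Referenced styles first, so local properties override them (assumption).
    for ref in style.get("use", "").split():
        props.update(resolve_style(sheet, ref, _path + (style_id,)))
    prefix = "{" + STYLE_NS + "}"
    for prop in style.findall("prop"):
        for name, value in prop.attrib.items():
            if name.startswith(prefix):
                props[name[len(prefix):]] = value
    return props

sheet = ET.fromstring(SHEET)
effective = resolve_style(sheet, "s3")
print(effective)
```

Here style s3 ends up with both its own fontSize and the color it
takes from s2.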

POSSIBLE VALUE SPACE FOR SELECT ATTRIBUTE(S)

Option #1A - CSS2   selectors
Option #1B - CSS2.1 selectors
Option #1C - CSS3   selectors

[SH] CSS2/2.1 won't work since not NS aware.

Option #2 - XPath

Option #3 - XPointer (which includes and extends XPath with range)

Working Assumption: Choose Option #2 in full glory for TT Full
Profile; adopt or specify some subset for TT CS Profile. Note that
we probably need to add extensions as well, e.g., extensions like
those found in the xpointer() scheme.

Action: [SH] to propose subset with extensions for use in CS
Profile.

[TM] Would same option be used for timing selectors?

WG: Yes.
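
A rough sketch of applicative selection using XPath-style select
values, written in Python with the limited XPath subset supported by
xml.etree.ElementTree. The rule list, the sample content, and the
function name apply_rules are hypothetical. The expressions use
XPath attribute syntax (@role), which differs from the select values
written in the style sheet example earlier in these minutes.

```python
import xml.etree.ElementTree as ET

# Hypothetical content; the whole tree is parsed before any rule is applied.
CONTENT = """
<body>
  <div role="fred"><p>first</p></div>
  <div role="barney"><p>second</p></div>
</body>
"""

# Hypothetical rules: (select expression, style properties).
RULES = [
    (".//div[@role='fred']", {"color": "red"}),
    (".//div[@role='barney']", {"color": "blue"}),
]

def apply_rules(root, rules):
    """Map each selected element to the merged properties of matching rules."""
    styled = {}
    for select, props in rules:
        for elem in root.findall(select):
            styled.setdefault(elem, {}).update(props)
    return styled

body = ET.fromstring(CONTENT)
styled = apply_rules(body, RULES)
colors = {elem.get("role"): props["color"] for elem, props in styled.items()}
print(colors)
```

Because the selectors are evaluated against the parsed tree, this
sketch also shows why applicative mode needs the content in memory
before styles can be applied.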

ADDITIONAL POSSIBLE REQUIREMENT ON SYNTAX FOR REFERENTIAL MODE

(7) Need to define regions, possibly hierarchical. May want
to reuse style properties or separate out. Needs consideration.

Action: [GA]/[GF] investigate syntax for regions vis-a-vis style.

SIDE-BAR

Whether to allow intermixture of both applicative mode (select) and
referential mode (id/style/use) in single PF layer instance (i.e.,
flows element)? Answer: within single PF layer instance, processing
should be based on one or the other, but not both; however, it is
conceivable that a single external stylesheet may be referenced by
both an applicative mode PF instance and a referential mode PF
instance.

[DS] How would we mark flows as to whether applicative mode
or referential mode applies?

[GA] Yes. Mechanism is TBD.

DESCRIPTIVE SEMANTIC VOCABULARY (AT LF LAYER ONLY)

Shopping List of Potential Standard Vocab Items

(1) role attribute - usage: provides semantic specialization
of general purpose container or content element

(2) role attribute values

Content Organizational Roles

* act
* chapter
* epilogue (see TEI P4 CH10)
* part
* prologue (see TEI P4 CH10)
* scene
* script (possible alternatives: discourse, performance, show)

<div role="epilogue">
  <p role="utterance" who="narrator">
    Ophelia, while obedient, descends to Hades for taking life.
  </p>
</div>

Content Description Roles

* action
* caption (burned into video/film/work)
* kinesic (see TEI P4 CH11)
* music
* pause (see TEI P4 CH11)
* phrase
* shift (see TEI P4 CH11)
* song
* sound
* thought
* title
* translation
* utterance (see TEI P4 CH11)
* vocalization (non-lexical)
* writing (see TEI P4 CH11)

<span role="music" aml:dur="00:02:48:00">
  [Bach playing in background.]
</span>

[GA] Thinking here is that we could define some role
tokens that we think are useful, then allow authors to
use arbitrary other tokens; note that this may have to
deal with name collision issues (or not).

[GA] 2 uses of role attribute: (1) to provide details about
an organization of content (components), typically used with <div>
and <body> elements; and (2) to provide details about
content, typically used with <p> and <span> elements.

Action: [DK] To write up short paragraph on each of
above role tokens. Suggest removing or adding as he
progresses.

(3) agent attribute - usage: provides details of source of
content, e.g., speaker, singer, burper, etc.

(4) agent attribute values - probably should be merely one or more
IDREFs that refer to metadata that describes agents and aliases of
agents; need to define metadata; note that probably need agents that
are collections rather than individuals; need to be able to refer to
agents not in script;

<metadata>
...
<agent id="agent1">
  <name>Professor Aubrey Singer</name>
  <alias id="agent1a1">Aubrey Singer</alias>
  <alias id="agent1a2">Aubrey</alias>
</agent>
...
</metadata>

...
<p agent="agent1 agent2">Hello Class!</p>
<p agent="agent1a2">Darling!</p>
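
A minimal sketch, in Python, of how agent attribute values might be
resolved as IDREFs against the metadata above. The function names are
hypothetical, and returning None for an id that is not defined is an
assumption (agent2 falls in the elided part of the example, so it is
unresolved here).

```python
import xml.etree.ElementTree as ET

METADATA = """
<metadata>
  <agent id="agent1">
    <name>Professor Aubrey Singer</name>
    <alias id="agent1a1">Aubrey Singer</alias>
    <alias id="agent1a2">Aubrey</alias>
  </agent>
</metadata>
"""

def build_agent_index(metadata):
    """Map every agent id and alias id to (canonical name, displayed name)."""
    index = {}
    for agent in metadata.iter("agent"):
        name = agent.findtext("name")
        index[agent.get("id")] = (name, name)
        for alias in agent.findall("alias"):
            index[alias.get("id")] = (name, alias.text)
    return index

def resolve_agents(index, agent_attr):
    """Resolve a whitespace-separated list of ids; unknown ids map to None."""
    return [index.get(ref) for ref in agent_attr.split()]

index = build_agent_index(ET.fromstring(METADATA))
by_alias = resolve_agents(index, "agent1a2")
by_list = resolve_agents(index, "agent1 agent2")
print(by_alias)
print(by_list)
```

An alias id resolves to the same agent as the agent id, which is what
lets the second paragraph above attribute "Darling!" to Professor
Aubrey Singer while recording that the alias "Aubrey" was used.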

(5) recipient attribute - synonyms: receiver, addressee;
usage: intended receiver of content

(6) recipient attribute values - probably should be merely one or
more IDREFs that refer to metadata that describes recipients and
aliases of recipients; need to define metadata; note that probably
need agents that are collections rather than individuals; need to be
able to refer to agents not in script;

(7) tone attribute - synonyms: mode; usage: annotate
extensional semantics of pragmatics that apply to content;
e.g., polite son-to-father, angry father-to-son, concerned
mother-to-daughter, ...

[DS] seems a real rat hole: translation: this is a slippery slope...

METADATA VOCABULARY

Issues:

(1) What layers to permit MD?

[GA] Proposes: all.

WG: Agreed.

(2) What layers to define standardized usage of MD items (as opposed
to standard MD containers)?

[GA] Proposal #1: LF only; optionally permit them to seep through
to PF/NF?

[DS] Why harmful in other layers?

[GA] Withdraw above proposal.

[GA] Proposes following uber-structure.

DOCUMENT STRUCTURE

<tt>
  <!-- header that covers all layers -->
  <head>....</head>
  <!-- logical (LF) layer -->
  <layer role="logical">
    <head/>
    <body/>
  </layer>
  <!-- presentation (PF) layer; instance 1 -->
  <layer role="presentation">
    <head/>
    <flows/>
  </layer>
  <!-- presentation (PF) layer; instance 2 -->
  <layer role="presentation">
    <head/>
    <flows/>
  </layer>
  <!-- final form (NF) layer; instance 1 -->
  <layer role="final">
    <head/>
    <areas/>
  </layer>
  <!-- final form (NF) layer; instance 2 -->
  <layer role="final">
    <head/>
    <areas/>
  </layer>
</tt>

WG: Agrees.
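
As an illustration only, the structure above could be checked with a
short Python sketch. The function name summarize_layers is
hypothetical, and treating head plus the role-specific container as
required in every layer is an assumption, not something the WG
decided.

```python
import xml.etree.ElementTree as ET
from collections import Counter

DOC = """<tt>
  <head/>
  <layer role="logical"><head/><body/></layer>
  <layer role="presentation"><head/><flows/></layer>
  <layer role="presentation"><head/><flows/></layer>
  <layer role="final"><head/><areas/></layer>
  <layer role="final"><head/><areas/></layer>
</tt>"""

# Content container expected for each layer role, as in the structure above.
CONTAINER = {"logical": "body", "presentation": "flows", "final": "areas"}

def summarize_layers(tt):
    """Count layers per role, checking each has <head> and its container."""
    counts = Counter()
    for layer in tt.findall("layer"):
        role = layer.get("role")
        expected = CONTAINER.get(role)
        if expected is None:
            raise ValueError(f"unknown layer role: {role!r}")
        if layer.find("head") is None or layer.find(expected) is None:
            raise ValueError(f"{role} layer needs <head> and <{expected}>")
        counts[role] += 1
    return dict(counts)

summary = summarize_layers(ET.fromstring(DOC))
print(summary)
```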

[GA] Proposal #2: allow standard MD at least in any of the
head elements shown in above document structure.

WG: Agrees.

(3) Do we need a standard MD container element(s)?

WG: Agrees.

(4) Do we need a standard MD attribute(s)?

[GA] Originally no; changes mind when [DS] suggests xml:lang
is possible standard MD attribute.

Issue: Need to enumerate and define.

Action: [TM] to propose and define standard MD attributes.

(5) Should we allow non-std MD anywhere, nowhere, only in standard MD
container, etc.?

WG: Anywhere. Must be in foreign namespace.

(6) Need to define standard MD element types and content as opposed
to attributes.

Standard MD Element Type Candidates

* agent
* agents
* event
* events
* location
* locations
* object
* objects
* recipient
* recipients
* resource
* resources

Option #1

<item role="agent" id="agent1">
  <prop md:name="name">Professor Aubrey Singer</prop>
  <prop md:name="alias">Aubrey Singer</prop>
  <prop md:name="alias">Aubrey</prop>
  <prop md:gender="male"/>
</item>

Option #2

<item role="agent" id="agent1">
  <prop name="name">Professor Aubrey Singer</prop>
  <prop md:gender="male"/>
</item>
<item role="agent" id="agent1aka1" use="agent1">
  <prop name="alias">Aubrey Singer</prop>
</item>
<item role="agent" id="agent1aka2" use="agent1">
  <prop name="alias">Aubrey</prop>
</item>

WG: Option #2 is better.
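
A small Python sketch of how a consumer might regroup Option #2
items, assuming that "use" on an alias item points at the id of the
primary item. The metadata wrapper element and the function name
collect_agents are hypothetical; the md:gender property is left out
because the minutes do not give a namespace URI for the md prefix.

```python
import xml.etree.ElementTree as ET

ITEMS = """
<metadata>
  <item role="agent" id="agent1">
    <prop name="name">Professor Aubrey Singer</prop>
  </item>
  <item role="agent" id="agent1aka1" use="agent1">
    <prop name="alias">Aubrey Singer</prop>
  </item>
  <item role="agent" id="agent1aka2" use="agent1">
    <prop name="alias">Aubrey</prop>
  </item>
</metadata>
"""

def collect_agents(metadata):
    """Group alias items under the primary agent item they refer to."""
    agents = {}
    items = metadata.findall("item")
    # First pass: primary agent items (no 'use' attribute).
    for item in items:
        if item.get("role") == "agent" and item.get("use") is None:
            name = item.findtext("prop[@name='name']")
            agents[item.get("id")] = {"name": name, "aliases": {}}
    # Second pass: alias items that refer to a known primary item.
    for item in items:
        base = item.get("use")
        if base in agents:
            alias = item.findtext("prop[@name='alias']")
            agents[base]["aliases"][item.get("id")] = alias
    return agents

agents = collect_agents(ET.fromstring(ITEMS))
print(agents)
```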

SIDE BAR on use of xml:lang in MD Items

WG: Should be able to use xml:lang to distinguish not only different
instances of MD content that is language sensitive, but also non-MD
content (i.e., data content) similarly. However, should discourage
placing multiple languages of all content in single document (file).

<item role="agents">
  <item role="agent"/>
  <item role="agent"/>
</item>

Action: [*] Think about standard MD items (and write something
down and send to list so we can think about it too).

UPDATE on SYMM ACTIVITY

[TM] No SYMM WG meeting in Cannes. SYMM WG is not yet active;
however, W3C is interested in reactivating if there is sufficient
interest.

WG: Suggestion to have either a BOF or a Task Force in Cannes to
discuss details of SYMM WG restart. Suggest deferring completion of
charter until after such a meeting.

Next Meetings

FEB 19, 2004  - telecon; regrets: [GA]
FEB 26, 2004  - telecon; regrets: [DS]
MAR 3-5, 2004 - f2f, Cannes

Action: [GA] Need to start planning for June meeting in Japan!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
START SUMMARY
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

*** RESOLUTIONS ***

Side Bar Resolution: Subtitling and Captioning sometimes express
different intentions; however, mechanisms for both are either
identical or close enough that a distinction is not warranted. This
is relevant to the point of whether to define separate profiles.

Resolution: Change "shall" to "may be" in R305 and R504 of
TT-AF-1-0-REQ.

Resolution: Change R112 to read as follows:

"The TT AF specification(s) shall be defined in such a manner as to
permit the construction of a TT AS that satisfies all applicable
aspects of [ATAG 1.0]."

Resolution: In R390, change "shall be" to "may be".

Resolution: Give priority to characterizing a Profile, nominally
known as the "C/S" profile, that supports the unified functions needed
for (1) C/S authoring and (2) 708/3GPP DFX.

Resolution: For style property elements, specify values of properties
as shown below. Consider providing future extensibility mechanism
that permits use of element content to provide value for style
properties whose value is employed as (human consumable) content.

Option #1A
<prop name="css:wrap-mode" value="yes"/>

Resolution: PF Layer Timing Model - Support structural containment
timing as found in SMIL and, if author wishes, can represent absolute
times for everything with no structural timing containment.

Resolution: NF Layer Timing Model - Support only absolute times for
everything with no structural timing containment. In other words,
timing has been flattened at this layer.

Resolution: Yes; allow more than one PF and more than one NF in
single instance document.

Resolution: All layers (LF, PF, NF) permit metadata.

Resolution: Document structure to be organized as follows:

DOCUMENT STRUCTURE

<tt>
  <!-- header that covers all layers -->
  <head>....</head>
  <!-- logical (LF) layer -->
  <layer role="logical">
    <head/>
    <body/>
  </layer>
  <!-- presentation (PF) layer; instance 1 -->
  <layer role="presentation">
    <head/>
    <flows/>
  </layer>
  <!-- presentation (PF) layer; instance 2 -->
  <layer role="presentation">
    <head/>
    <flows/>
  </layer>
  <!-- final form (NF) layer; instance 1 -->
  <layer role="final">
    <head/>
    <areas/>
  </layer>
  <!-- final form (NF) layer; instance 2 -->
  <layer role="final">
    <head/>
    <areas/>
  </layer>
</tt>

Resolution: Allow standard MD at least in any of the
head elements shown in above resolution.

Resolution: Define and use standard MD container element.

Resolution: Allow non-standard MD anywhere, but must be in
foreign namespace.

Resolution: Choose Option #2 below for organizing standard
MD:

Option #1

<item role="agent" id="agent1">
  <prop md:name="name">Professor Aubrey Singer</prop>
  <prop md:name="alias">Aubrey Singer</prop>
  <prop md:name="alias">Aubrey</prop>
  <prop md:gender="male"/>
</item>

Option #2

<item role="agent" id="agent1">
  <prop name="name">Professor Aubrey Singer</prop>
  <prop md:gender="male"/>
</item>
<item role="agent" id="agent1aka1" use="agent1">
  <prop name="alias">Aubrey Singer</prop>
</item>
<item role="agent" id="agent1aka2" use="agent1">
  <prop name="alias">Aubrey</prop>
</item>

Resolution: Should be able to use xml:lang to distinguish not only
different instances of MD content that is language sensitive, but
also non-MD content (i.e., data content) similarly. However, should
discourage placing multiple languages of all content in single
document (file).

Resolution: Propose either a BOF or a Task Force in Cannes to discuss
details of SYMM WG restart. Suggest deferring completion of charter
until after such a meeting.

*** WORKING ASSUMPTIONS ***

Working Assumption: Adopt Option #3 (either applicative or
referential, but not simultaneously; with the context in which a
given mode is used made explicit). Move forward with developing
examples and spec.
If we can convince ourselves in future to drop one mode, then we will
fall back to Option #1 or Option #2.

Context for Option Usage:

Option #1 - CSA (Caption/Subtitle Authoring)
Option #2 - DFX (Distribution Format Interchange/Transform)

Working Assumption: Don't try to specify TT defined style vocab/usage
at LF layer unless and until we can convince ourselves there is
something worth standardizing; otherwise, styling information and
mode of application is in app domain only.

Working Assumption: Choose Option #2 in full glory for TT Full
Profile; adopt or specify some subset for TT CS Profile. Note that we
probably need to add extensions as well, e.g., extensions like those
found in the xpointer() scheme.

Option #1A - CSS2   selectors
Option #1B - CSS2.1 selectors
Option #1C - CSS3   selectors
Option #2 - XPath
Option #3 - XPointer (which includes and extends XPath with range)

Same option to be used for timing selectors.

*** OPEN ACTION ITEMS ***

Action: [SH] Will investigate use of media queries in this context
and report back.

Action: [DS with help of Paul Nelson and Peter Lofting] Write RFC to
register appropriate opentype/truetype font types as MIME media
types, suggest model of "application/font-<font-type-name>", e.g.,
"application/font-truetype".

Action: [GA] Make proposal regarding use of Xlink vocabulary or
"src" attribute.

Action: [GF] Investigate whether to use IRIs instead of URIs?
Note: XPointer and Namespaces in 1.1 use IRIs?

Action: [SH] Investigate use of "role" vs "class" attribute.

Action: [GF] Investigate mechanism for cascading
semantics and whether to support cascading on either or both
logical and presentation flowed vocabularies.

Action: [GA] Draft new requirement on "Integrability"
in general terms that should not impact testing or implementation
requirements.

Action: [GA] incorporate agreed changes into TT-AF-1-0-REQ in
preparation for publishing final W3C Note.

Action: [SH] will review and propose subset of aural parameters
(see R305).

Action: [GA] Add figure showing logical structure anticipated
by requirements.

Action: [GA] Add note to R217 and R219 that shows use of data: URI
scheme.

Action: [TM] Find out if xpointer() scheme WD [1] is still being
progressed forward.

[1] http://www.w3.org/TR/xptr-xpointer/#document-order-notation

Action: [SH] to propose subset with extensions for use in CS
Profile.

Action: [GA]/[GF] investigate syntax for regions vis-a-vis style.

Action: [DK] To write up short paragraph on uses of
role tokens. Suggest removing or adding as he
progresses.

Action: [TM] to propose and define standard MD attributes.

Action: [*] Think about standard MD items (and write something
down and send to list so we can think about it too).

Action: [GA] Need to start planning for June meeting in Japan!

*** OPEN ISSUES ***

Issue: Whether to use XLink vocabulary, e.g., as used consistently by
SVG, or use "src" attribute as apparently will be done in XHTML2?

Issue: Whether to use IRIs instead of URIs? Note: XPointer and
Namespaces in 1.1 use IRIs?

Issue: Should we use "class" instead of "role"?

Issue: Probably want to permit in logical content mode the selection
of content based on generic XML features of non-TT namespace
descriptive markup, e.g., for applying style and timing semantics, in
which case an appropriate TT container element shall be implied based
on nearest ancestor TT namespace element.

Issue: Need to think about cascading semantics; how to express, how to
apply, etc. Possibly use CSS semantics here as well.

*** URLs ***

[1] http://www.w3.org/TR/xptr-xpointer/#document-order-notation

*** NEXT MEETING DATES ***

FEB 19, 2004  - telecon; regrets: [GA]
FEB 26, 2004  - telecon; regrets: [DS]
MAR 3-5, 2004 - f2f, Cannes

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
END SUMMARY
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Thierry MICHEL
W3C/ERCIM

Received on Monday, 26 April 2004 10:12:41 UTC