Re: Issues with dynamicFlow from Glenn A. Adams on 2009-04-23 (public-tt@w3.org from April 2009)

From: Glenn A. Adams <gadams@xfsi.com>
Date: Thu, 23 Apr 2009 12:40:29 +0800
To: Sean Hayes <Sean.Hayes@microsoft.com>, Public TTWG List <public-tt@w3.org>
Message-ID: <C616123D.A57C%gadams@xfsi.com>
see inline [GA]

On 4/20/09 11:19 PM, "Sean Hayes" <Sean.Hayes@microsoft.com> wrote:

> I believe dynamicFlow and section B are not well specified, and do not fit
> with the model in that it seems to require state to be maintained between two
> synchronic slices.

[GA] In double checking the current algorithm's specification language, I
agree that it is not currently well defined. There is a small, but
significant error in the flow timing calculation algorithm when specifying a
definite rate for flow interval functions. Also, there is need for a few
minor (but normative) clarifications.

I agree that it requires keeping state between synchronic document
construction time boundaries, and that this requirement was assumed in the
design of this feature. In particular, a region's flow buffer and the
presented (flowed) content in the region must be maintained across these
boundaries. This is not unreasonable, since regions are statically
declared and are distinct, even in circumstances where their temporal extent
brings them in and out of active use. Therefore, the properties of a region,
including its flow buffer and current presented (flowed) content can be
maintained over the whole timeline of the DFXP document instance.

> Some issues I have:
> 
> How is content comparison for the purposes of B2 step 2a-e determined?

[GA] my intention here is that the content of the flow buffer at T(k) be
considered "different" from the content at T(k+1) if the set of glyph areas
that would be produced by the content at T(k) is different in any
significant way from the set of glyph areas that would be produced by the
content at T(k+1), which includes (exclusively):

* the addition, removal, or change of a content character so as to produce a
different set of resulting glyphs;
* the addition, removal, or change of a whitespace content character so as
to produce a difference in the position of a glyph area; e.g., inserting or
removing whitespace that causes differences in line layout or glyph area
placement;
* the change of some presentation style so as to produce a difference in the
position of a glyph area; e.g., a change to the font family, font size, font
style, or font weight may produce a layout difference;

changes to other styles, such as background color, foreground color, text
decoration, etc., do not result in a change of glyph or position of a glyph,
and, therefore, have no semantic impact on the dynamic flow process;

to make this clear, I will add normative text that spells out the above
intentions;
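
As an illustration only, the comparison rule above could be sketched as
follows. The GlyphArea type, its fields, and the choice to compare glyph
areas in logical order are assumptions made for this sketch; none of them
come from the DFXP specification.

```python
from typing import NamedTuple


class GlyphArea(NamedTuple):
    """Hypothetical glyph area: which glyph is drawn and where it is placed."""
    glyph: str
    x: float
    y: float
    color: str = "white"  # a style that affects neither glyph nor position


def layout_key(area: GlyphArea) -> tuple:
    # Only glyph identity and position are significant for dynamic flow.
    return (area.glyph, area.x, area.y)


def content_differs(areas_k, areas_k1) -> bool:
    """True if the two slices would produce different glyph areas."""
    return [layout_key(a) for a in areas_k] != [layout_key(a) for a in areas_k1]
```

Under these assumptions a color change leaves the result False, while a
change in glyph or position makes it True.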

> If it 
> is logical tree comparison, then how is difference/equality defined for two
> arbitrary trees which may or may not share a common subtree - what happens for
> example to style attributes in the tree; or can no style animation happen for
> scrolling text?
> 
> How do you determine which node in tree B should correspond to the node(s)
> which generated the "most logically prior content presently visible" - maybe
> bidi processing is required to reorder the due to an element being elided,
> does the scroll go back to that element?
> 
> What indeed is 'before' in this context, and what is 'logically prior'.
> 

[GA] by "logical content position" I had in mind a numbered, logical
sequence of content characters that correspond to the glyphs associated with
glyph areas; the logical position is based on the input character string
order, and not on the resulting (visual) glyph (display) order, so this
sequence is determined prior to bidi processing; I will add normative text
that makes this definition clear; N.B., the notion of logical order versus
visual (or presentation) order is discussed in some detail in [1];

[1] http://www.w3.org/TR/2005/REC-charmod-20050215/#sec-LogicalOrder

> How do pixel units fit into a logical tree comparison?

[GA] see my description of "different" content above;

> Is case 2a even possible? why would there be two synchronic slices for a
> region that have no differences? If it is possible, what happens for runs of 3
> or more such slices?

[GA] sure:

<tt:p>
<tt:span begin="0s" end="1s">-</tt:span>
<tt:span begin="1s" end="2s">X</tt:span>
<tt:span begin="2s" end="3s">X</tt:span>
<tt:span begin="3s" end="4s">X</tt:span>
<tt:span begin="4s" end="5s">X</tt:span>
<tt:span begin="5s" end="6s">-</tt:span>
...
</tt:p>

[0,1): -     ; T(k+0)
[1,2): X     ; T(k+1)
[2,3): X     ; T(k+2)
[3,4): X     ; T(k+3)
[4,5): X     ; T(k+4)
[5,6): -     ; T(k+5)

T(k+0) <> T(k+1) : TRUE
T(k+1) <> T(k+2) : FALSE
T(k+2) <> T(k+3) : FALSE
T(k+3) <> T(k+4) : FALSE
T(k+4) <> T(k+5) : TRUE
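
A minimal sketch of the same pairwise comparison, assuming each slice can be
reduced to its character content (which is enough for this example, where no
style or layout changes occur); the function name is made up for
illustration:

```python
slices = ["-", "X", "X", "X", "X", "-"]  # content active at T(k+0) .. T(k+5)


def adjacent_differences(contents):
    """Pair each slice with its successor and report whether they differ."""
    return [a != b for a, b in zip(contents, contents[1:])]


for i, differs in enumerate(adjacent_differences(slices)):
    print(f"T(k+{i}) <> T(k+{i + 1}) : {str(differs).upper()}")
```

The three FALSE results in the middle are runs of case 2a.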

> I'm also not clear what times n where Tk < n < Tk+1 are actually defined
> (leaving aside the issue that no intermediate states are defined in
> discontinuous smpte mode, or how they would be rounded to smpte frames).

[GA] see first paragraph of 9.3.2:

"For the purposes of performing presentation processing, the active time
duration of a document instance is divided into a sequence of time
coordinates where at each time coordinate, some element becomes temporally
active or inactive..."

each such time coordinate constitutes a distinct 'k' value;
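
One way to picture this is to collect every begin and end time in the
document and sort them; the sketch below assumes the active intervals have
already been resolved to absolute (begin, end) pairs in seconds, which is a
simplification of what 9.3.2 describes:

```python
def time_coordinates(intervals):
    """Return the sorted, de-duplicated begin/end times of all intervals."""
    times = set()
    for begin, end in intervals:
        times.update((begin, end))
    return sorted(times)


# The six one-second spans from the earlier example.
spans = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
coords = time_coordinates(spans)  # index into this list plays the role of 'k'
```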
 
> Since the amount of content in the flow buffer seems to alter the fill rate.
> What happens when this changes the fill interval?

[GA] the new fill interval applies the next time the fill timer is started;

> Is it possible for the fill
> to effectively 'go back in time'?

[GA] no; the language of B.2 (4) and B.2 (5) is carefully worded so that
the effects of changes of content in the flow buffer due to differences
between synchronic intermediate documents are limited to those that occur
subsequently to the most logically subsequent content currently presented in
the presentation region undergoing dynamic flow;

in particular, B.2 (5) says "ignore first part" which refers to "that part
[of logical content] that wholly precedes the [logical] position that
corresponds with the most logically subsequent content presently visible in
the region";

this language is defined as is to prevent changes that arise from synchronic
document construction triggering changes in the currently presented (already
flowed) content;

> e.g Lets say we are computing the scroll state between two synchronic slices
> Tk=10s and TK+1 = 11s with a given flow rate of 2.
> 
> if at point Tk flow buffer contains 20 fill units; there are 10 logical steps.
> If at point TK+1 flow buffer contains 30 fill units; there are 15 logical
> steps.
> 
> Say we are at step 5, and we discover that one of the conditions in B4
> applies, the flow buffer changes, and we went from 5/10 (i.e. 10.5s - half way
> through the scroll) to 5/15 (10.333s - less than halfway); is the 'next' fill
> 'tick' 10.4s? if so then since this is less than 10.5 should not the flow
> buffer revert back to 20 units?. Or perhaps we skip to 10.533 - missing out a
> couple of steps half way through the second scroll? If on the other hand the
> flow buffer changed to be 10 units, would we be done?
> If we go backwards, should previously scrolled content reappear?

[GA] I see that the current spec does not indicate what timeline to use for
the purpose of interpreting time expressions in <flowIntervalFunction>,
namely, a <duration> argument to intra() or inter() flow interval functions;
it would seem appropriate that this should be defined to be the same
timeline as used for DFXP content, namely, as determined by ttp:timeBase; in
the case of SMPTE discontinuous (marker) mode, it probably should be real
time, since duration is not well defined on the discontinuous SMPTE
timeline; i will add normative text that calls out these timeline semantics;

regarding your example above, the flow timers control the times at which
content is flowed into and cleared out of the region; the durations of these
timers are based on the computed {fill,clear} intervals at the time they are
reset, as defined by B.5; since the value of a computed flow interval is
always non-negative, and since changes in synchronic content in the flow
buffer only have an effect if they are subsequent to the last presented
content, then all temporal effects are either at the current time or in the
future;

before elaborating the example, however, I note that there is a small but
semantically significant error in B.3 as pertains to the use of a "definite
rate", specifically B.3.1 (2) and B.3.2 (2);

at present, B.3.1 (2) and B.3.2 (2) state:

"if the value of the {fill,clear} interval parameter is a definite rate,
then the computed {fill,clear} interval is equal to the number of
{fill,clear} units currently available in the flow buffer divided by
specified rate (in {fill,clear} units per second)"

as you can see from your example, this language would result in a CFI
(computed fill interval) and CCI (computed clear interval) of 10 seconds at
T(10s), i.e., 20 {fill,clear} units / 2 {fill,clear} units per second = 10
seconds; clearly this is not what is desired;

instead, this language should be modified to read as follows:

"if the value of the {fill,clear} interval parameter is a definite rate,
then the computed {fill,clear} interval is equal to the inverse of the
specified rate (in {fill,clear} units per second)"

with this correction, and assuming flow interval functions of intra(2) and
inter(2) per your example, then the following holds:

FB @ T(10s) := { 1, ..., 20 }
CFI         := 1 / 2 fill units per second = 0.5s (per fill unit)
CCI         := 1 / 2 clear units per second = 0.5s (per clear unit)

FB @ T(11s) := { 3, ..., 32 }
CFI         := 1 / 2 fill units per second = 0.5s (per fill unit)
CCI         := 1 / 2 clear units per second = 0.5s (per clear unit)
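
The difference between the current and the corrected wording can be checked
with a short sketch; the function names are made up for illustration, and
rates are taken to be in units per second:

```python
def interval_as_written(units_in_buffer: int, rate: float) -> float:
    """B.3.1 (2) / B.3.2 (2) as currently worded: buffer size divided by rate."""
    return units_in_buffer / rate


def interval_corrected(rate: float) -> float:
    """Proposed wording: the inverse of the rate, independent of buffer size."""
    return 1.0 / rate


print(interval_as_written(20, 2))  # 10.0 seconds per unit, not what is desired
print(interval_corrected(2))       # 0.5 seconds per unit
```

With the corrected form, the interval is the same at T(10s) and T(11s)
because the buffer size no longer enters the calculation.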

now, in your example, you don't say where the changes occur between T(10s)
and T(11s); so i am assuming (for example's sake) that all changes take the
form of appended content; this means that at T(11s), 12 new flow
units were added, since 2 flow units have been filled and cleared from the
original 20 by the end of the interval [10s,11s]; i.e., at T=11s, the second
of two fill/clear timer interval periods will expire, leaving 18 flow units
in the flow buffer; at this same time, i.e., T=11s, 12 new flow units would
be appended to the end of the flow buffer in order to make 30 flow units
present;
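
The bookkeeping in this paragraph can be replayed with a small sketch. It
assumes one unit leaves the buffer per expired timer period and that new
units are simply appended; the consume helper is hypothetical and greatly
simplifies the B.2 procedure:

```python
from collections import deque


def consume(buffer: deque, elapsed: float, interval: float) -> int:
    """Remove one unit per expired timer period; return how many were removed."""
    ticks = int(elapsed / interval + 1e-9)  # small epsilon guards float error
    ticks = min(ticks, len(buffer))
    for _ in range(ticks):
        buffer.popleft()
    return ticks


fb = deque(range(1, 21))                          # FB @ T(10s) = {1, ..., 20}
removed = consume(fb, elapsed=1.0, interval=0.5)  # periods expiring in [10s, 11s]
fb.extend(range(21, 33))                          # 12 units appended at T(11s)
print(removed, len(fb), fb[0], fb[-1])
```

This ends with 30 units numbered 3 through 32, matching FB @ T(11s) above.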

note that specifying a definite rate is identical to specifying the inverse
value as a definite duration; which argues for a possible simplification to
the syntax; namely, removing definite rate while leaving definite duration
(or vice-versa); however, i need to go back and look at earlier notes to
determine if there was some semantic distinction being made here that I no
longer recall;

if your example had instead left inter() and intra() unspecified, or if you
had specified inter(auto) and intra(auto), then the following would apply,
that is, if one assumes specific active durations for the content (which you
only obliquely implied in your example); let's assume that the original
content was as follows:

<tt:p begin="10s">
  <tt:span begin="0s" end="2s">ABCDEFGHIJKLMNOPQRST</tt:span>
  <tt:span begin="1s" end="3s">abcdefghijklmnopqrst</tt:span>
</tt:p>

we also need to further assume a fill and clear unit, which for this
augmented example, i will assume is 'character'; therefore, the fully
specified dynamic flow property would read as:

tts:dynamicFlow="in(character) out(character) intra(auto) inter(auto)"

with this in mind, the following would hold:

FB @ T(10s) := { 'A', ..., 'T' }
CFI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per fill unit)
CCI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per clear unit)

FB @ T(11s) := { 'K', ..., 'T', 'a', ..., 't' }
CFI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per fill unit)
CCI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per clear unit)

notice here that CFI and CCI do not actually change at T(11s) due to the
specific active durations of the augmented example; of course, if we
lengthened the duration of the second span from 2 to 3 seconds, then we
would have:

<tt:p begin="10s">
  <tt:span begin="0s" end="2s">ABCDEFGHIJKLMNOPQRST</tt:span>
  <tt:span begin="1s" end="4s">abcdefghijklmnopqrst</tt:span>
</tt:p>

FB @ T(11s) := { 'K', ..., 'T', 'a', ..., 't' }
CFI         := ( 14s - 10s ) / 30 = 4s / 30 = 0.133s (per fill unit)
CCI         := ( 14s - 10s ) / 30 = 4s / 30 = 0.133s (per clear unit)
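
The arithmetic for the auto case can be reproduced with the sketch below. It
mirrors the figures shown in this message rather than the normative text of
B.3; the function name and its parameters are assumptions:

```python
def auto_interval(flow_begin: float, last_end: float, units: int) -> float:
    """Spread the available active duration evenly over the flow units."""
    return (last_end - flow_begin) / units


at_10s = auto_interval(10, 12, 20)           # first span only
at_11s = auto_interval(10, 13, 30)           # second span ends at 13s
at_11s_longer = auto_interval(10, 14, 30)    # second span lengthened to end at 14s
print(round(at_10s, 3), round(at_11s, 3), round(at_11s_longer, 3))
```

The first two values agree at 0.1s per unit, and the lengthened span gives
roughly 0.133s per unit.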

> There are even more interactions if the clear timing interval is also changing
> with respect to the fill buffer.

[GA] this is accounted for above, although i did not show an example where
content is cleared slower than it is filled; even in such a case, i believe
the current algorithm (with minor modifications described above) is well
defined; 

> The more I look at it, the more I feel that this mechanism is not a good fit
> with the SMIL/Timed text model.

[GA] since there is no alternative proposed model that is as formal as the
one currently defined, then i think we have no alternative if we want to
include this feature *and* want to specify it as formally as possible;
rather than throw the baby out with the bathwater, let's critique the
language and the algorithm itself in order to improve it if needed; we saw
above, that some minor corrections and clarifications are needed; let's
continue in this path, since no other path is open before us other than
simply throwing it out or defining something that is horribly
underspecified;
 
> IMO pixel level smooth scrolling aspects of dynamicFlow would be best modeled
> by animation of a canvas origin property on region (& perhaps introducing
> SMIL's <animate>). The character/word/line inflow/outflow is adequately
> modeled already by <span>.

[GA] i disagree; you cannot use explicit timing on span to obtain the same
results, for the simple reason that you (as an author) don't know the actual
geometry of the region and don't know the actual font metrics or line layout
algorithm used by a presentation processor; in many (most?) cases you will
end up specifying the geometry of a region in relative terms (percentages)
of an unknown external root container, i.e., similar to what is used in
determining safe area of television displays; you will also not be able to
ensure that font metrics will produce identical line layouts across
implementations;

one of the key design requirements of the dynamic flow feature was to have
the implementation compute the timing based on definite knowledge known only
at presentation time, and not at authoring time, in order to achieve
specific or constrained flow rates; you just can't do this with explicit
timing on TT elements directly;
 
> 
> Sean Hayes
> Media Accessibility Strategist
> Accessibility Business Unit
> Microsoft
> 
> -----Original Message-----
> From: Glenn A. Adams [mailto:gadams@xfsi.com]
> Sent: 19 April 2009 1:39 PM
> To: Sean Hayes; Public TTWG List
> Subject: Re: ISSUE-58 (showBackground animateable): shouBackground should not
> be animateable [DFXP 1.0]
> 
> 
> i will go ahead and make all style properties animatable; tts:dynamicFlow
> can be easily handled by defining that a change in its value causes a reset
> of the fill and clear flow timers; regarding dynamic flow having state
> across significant synchronic intermediate documents, i believe i have dealt
> with that previously in Section B.2;
> 
> g.
> 
> On 4/19/09 5:26 PM, "Sean Hayes" <Sean.Hayes@microsoft.com> wrote:
> 
>> Interesting you should say that, I had exactly the same thought last night.
>> One of the original design principles was that timed text display should be a
>> function of time, i.e. without state. The reasoning behind having attributes
>> non-animateable was that it might be too expensive in terms of re-flow etc,
>> but if at each moment in time the entire tree is effectively made anew. Then
>> this reasoning seems unsound.
>> 
>> So I support the motion.
>> 
>> The only one I have some doubts about is dynamicFlow, because it seems to
>> operate somewhat outside the same timeline, and thus have state across time
>> ticks. Which is also why I think dynamicFlow should be dropped, or
>> substantially reworked in order to fit with the above model.
>> 
>> Sean Hayes
>> Media Accessibility Strategist
>> Accessibility Business Unit
>> Microsoft
>> 
>> -----Original Message-----
>> From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf Of
>> Glenn A. Adams
>> Sent: 19 April 2009 7:11 AM
>> To: Public TTWG List
>> Subject: Re: ISSUE-58 (showBackground animateable): shouBackground should not
>> be animateable [DFXP 1.0]
>> 
>> i propose we take a different approach: make all styles animatable
>> 
>> i note that at present, the following are defined as being non-animatable:
>> 
>> tts:direction
>> tts:displayAlign
>> tts:dynamicFlow
>> tts:extent
>> tts:origin
>> tts:overflow
>> tts:unicodeBidi
>> tts:writingMode
>> 
>> in contrast, all of the (remaining) following properties are defined as
>> animatable:
>> 
>> tts:backgroundColor
>> tts:color
>> tts:display
>> tts:fontFamily
>> tts:fontSize
>> tts:fontStyle
>> tts:fontWeight
>> tts:lineHeight
>> tts:opacity
>> tts:padding
>> tts:showBackground
>> tts:textAlign
>> tts:textDecoration
>> tts:textOutline
>> tts:visibility
>> tts:wrapOption
>> tts:zIndex
>> 
>> there doesn't seem to be any principled reason for making any of the above
>> properties non-animatable; in fact, we have recently assumed that tts:origin
>> (and perhaps tts:extent) is animatable in order to move a region to a new
>> location; also, the following seem to be inconsistent on the surface:
>> 
>> * tts:textAlign is animatable, but tts:displayAlign is not
>> * tts:wrapOption is animatable, but tts:overflow is not
>> 
>> if one supports animation for one property, then it should be fairly trivial
>> to support animation on any other property;
>> 
>> therefore, i propose we make all the style properties animatable, which will
>> make usage and authoring less subject to special case exceptions;
>> 
>> glenn
>> 
>> On 4/18/09 3:03 AM, "Timed Text Working Group Issue Tracker"
>> <sysbot+tracker@w3.org> wrote:
>> 
>>> 
>>> ISSUE-58 (showBackground animateable): shouBackground should not be
>>> animateable [DFXP 1.0]
>>> 
>>> http://www.w3.org/AudioVideo/TT/tracker/issues/58
>>> 
>>> Raised by: Sean Hayes
>>> On product: DFXP 1.0
>>> 
>>> tts:showBackground is listed in the specification as animateable. I cant see
>>> why this is necessary. Unless we have a use case for this I propose it be
>>> set
>>> to animateable: none
>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
>
Received on Thursday, 23 April 2009 04:41:14 UTC