Re: Issues with dynamicFlow from Glenn A. Adams on 2009-04-24 (public-tt@w3.org from April 2009)

From: Glenn A. Adams <gadams@xfsi.com>
Date: Fri, 24 Apr 2009 18:20:42 +0800
To: Sean Hayes <Sean.Hayes@microsoft.com>, Public TTWG List <public-tt@w3.org>
Message-ID: <C617B37A.A5AE%gadams@xfsi.com>

On 4/23/09 11:10 PM, "Sean Hayes" <Sean.Hayes@microsoft.com> wrote:

> OK, this clarifies quite a bit, getting better; but I still have some issues.
> 
> From your first elaborated example, it's more clear that equality is not based
> on locating the 'same' element in a logical sub tree, but rather 'one that
> generates the same area' in the final rendering; however I'm still not clear
> that 'difference' is sensibly defined under this model.
> 
> I'm still not OK with the wording of B.2, for example consider the wording of
> B.2 2d :
> 
> "difference present, but only after the logical content position that
> corresponds with the most logically subsequent content presently visible in
> the region;"
> 
> What does 'presently visible' mean here?

[GA] 'presently' means at T(k) only, i.e., before introducing any change
from T(k+1); so 'presently visible' means "as flowed (presented) into region
at time T(k)";

the intent of B.2 2d is to capture changes that occur from T(k) to T(k+1)
that consist entirely of appendations of content to the flow buffer, and not
change or insertion of content that may be considered to precede (logically
speaking) the last visible content;

in contrast B.2 2e deals with this more complex case of changes or
insertions that affect content prior to the last visible content; thus, in
B.2 5, the case 2e is then further subdivided into two parts, one part that
affects content prior to the last visible content, and another part that
follows (logically) the last visible content; the former changes are then
ignored, while the latter are recognized; this reduces 2e's effects to be
the same as 2d, and it is then interpreted as such;
 
> I interpret it to mean at the two synchronic slices (i.e. as would be
> observable if no dynamic flow were occurring); could we then thus collapse all
> cases a-e here to mean that there is a difference if and only if the glyph
> area layout generated by the content selected into the region at time Tk is
> 'different' to the glyph area layout at time Tk+1 generated by the content
> selected into the region at time (this has to be regardless of 'clipping' of
> the region).

[GA] merging 2a through 2e would have no positive benefit, and would instead
make it more difficult to follow; the 5 cases shown are in fact all 5
possible interpretations of changes of content, so it is best to maintain
them all in step 2;

consider the following logical content ordering at T(k)

1 S 2 E 3

where

1: logical content prior to S;

S: most prior logical content position of some visible glyph area in region;

2: logical content subsequent to or same position as S but prior to or same
position as E;

E: most subsequent logical content position of some visible glyph area in
region;

3: logical content subsequent to E;

now, we want to limit changes in synchronic content in the flow buffer to
only that subset of changes that intersect with 3;

therefore, we want to:

1. [B.2 2a + 3] ignore no changes case (no op)

2. [B.2 2b + 3] ignore changes to 1

3. [B.2 2c + 3] ignore changes to 2

4. [B.2 2d + 4] permit changes to 3

5. [B.2 2e + 5] permit subset of changes that affect more than one of { 1,
2, 3 }, with permitted subset of changes being only those that change 3
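
As a rough illustration of the five-way split above, here is a minimal sketch. It assumes, purely for illustration, that logical content positions can be reduced to integers and that the set of changed positions is already known; the names Case and classify are hypothetical and do not come from the spec.

```python
from enum import Enum


class Case(Enum):
    NO_CHANGE = "2a"       # no difference: no-op
    BEFORE_VISIBLE = "2b"  # changes only in part 1: ignored
    WITHIN_VISIBLE = "2c"  # changes only in part 2: ignored
    AFTER_VISIBLE = "2d"   # changes only in part 3: permitted
    MIXED = "2e"           # changes in several parts: keep only part 3


def classify(changed, s, e):
    """Classify changed positions against the visible span [s, e].

    `changed` is the set of (integer) positions that differ between T(k)
    and T(k+1); `s` and `e` stand for S and E in the text. Returns
    (case, permitted), where `permitted` is the subset of changes that
    lies after E (part 3).
    """
    before = {p for p in changed if p < s}
    within = {p for p in changed if s <= p <= e}
    after = {p for p in changed if p > e}
    touched = sum(1 for part in (before, within, after) if part)
    if touched == 0:
        return Case.NO_CHANGE, set()
    if touched > 1:
        return Case.MIXED, after
    if before:
        return Case.BEFORE_VISIBLE, set()
    if within:
        return Case.WITHIN_VISIBLE, set()
    return Case.AFTER_VISIBLE, after
```

For example, classify({2, 12}, 5, 9) falls under 2e, and only position 12 survives as a permitted change.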
 
> (this would mean rewriting step 4, but I have problems with the current
> language there anyway)
> 
> 
> Issue 2:  difficulty in actually computing whether two layouts are in fact
> 'significantly different':
> 
> Issue 1, Consider:
> <tt:p>
> <tt:span tts:color='red' begin="0s" end="1s">-</tt:span>
> <tt:span tts:color='blue' begin="1s" end="2s">X</tt:span>
> <tt:span tts:color='red' begin="2s" end="3s">X</tt:span>
> <tt:span tts:color='blue' begin="3s" end="4s">X</tt:span>
> <tt:span tts:color='red' begin="4s" end="5s">X</tt:span>
> <tt:span tts:color='blue' begin="5s" end="6s">-</tt:span>
> ...
> </tt:p>
> 
> [0,1): -     ; T(k+0)
> [1,2): X     ; T(k+1)
> [2,3): X     ; T(k+2)
> [3,4): X     ; T(k+3)
> [4,5): X     ; T(k+4)
> [5,6): -     ; T(k+5)
> [6,-):       ; T(k+6)
> 
> T(k+0) <> T(k+1) : TRUE
> T(k+1) <> T(k+2) : FALSE
> T(k+2) <> T(k+3) : FALSE
> T(k+3) <> T(k+4) : FALSE
> T(k+4) <> T(k+5) : TRUE
> T(k+5) <> T(k+6) : TRUE
> 
> By your proposed new definition the flow buffer is only changed at times:
>  {0, p, q, r} where 0 < p <= 1, 4 < q <= 5, 5 < r <= 6

[GA] if by this, you mean:

0: T(k-1) and T(k+0) boundary, i.e., @ t=0s
p: T(k+0) and T(k+1) boundary, i.e., @ t=1s
q: T(k+4) and T(k+5) boundary, i.e., @ t=5s
r: T(k+5) and T(k+6) boundary, i.e., @ t=6s

then yes, those are the times where by my definition there would be a
'significant change';
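
A small sketch of how those boundary times fall out of the example, assuming each synchronic slice is reduced to its begin time and its text, and that nothing is presented before the first slice. The function name significant_boundaries is hypothetical; only the text is compared, so a change of colour alone is not counted.

```python
def significant_boundaries(slices):
    """Return the times at which consecutive slices differ in glyph content.

    `slices` is a list of (begin_time, text) pairs in time order.
    """
    boundaries = []
    previous = ""  # assumption: the region is empty before the first slice
    for begin, text in slices:
        if text != previous:
            boundaries.append(begin)
        previous = text
    return boundaries


slices = [(0, "-"), (1, "X"), (2, "X"), (3, "X"), (4, "X"), (5, "-"), (6, "")]
print(significant_boundaries(slices))  # [0, 1, 5, 6]
```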

> (PS. you misunderstood my question about what times are defined, 9.3.2 does
> not apply. Are times p, q, r potentially somewhere between the set of times
> defined by the synchronic slices, or do they snap to boundaries i.e. {0, 1, 5,
> 6} )

[GA] they are on the boundaries only; i'm not sure what you mean by "9.3.2
does not apply": since 9.3.2 is what is generating the synchronic
intermediate documents at each T(k), then it does indeed apply; otherwise,
there is only T(k) and no T(k+1);

> So we dynamicFlow from a red hyphen, to a blue X, to a blue hyphen to empty.
> The red X's being skipped. Thus 'difference' may in fact need to include all
> visual changes, or we need to update the flow buffer in some manner at some
> 'equivalent' Tk to accommodate these other changes.

[GA] no, not all visual changes, only those that affect fill and clear
operations, of which changing foreground color is not included; that is
handled by a separate layer;

> If this is the case, might it not be easier to define that the buffer is in
> fact changed at all Tk? Removing the issue of defining difference. (Which
> again pushes the issue onto step 4 how we transition between the FBs).

[GA] somehow we need to define which synchronic changes are to be ignored
and which are not; i don't see how to do this without defining 'difference';

> To further illustrate the complexity of defining 'difference':
> 
> <tt:p>
> <tt:span fontSize="2c" begin="0s" end="1s">-</tt:span>
> <tt:span fontSize="30px" begin="1s" end="2s">X</tt:span>
> <tt:span fontSize="2c" begin="2s" end="3s">X</tt:span>
> <tt:span fontSize="30px" begin="3s" end="4s">X</tt:span>
> <tt:span fontSize="2c" begin="4s" end="5s">X</tt:span>
> <tt:span fontSize="30px" begin="5s" end="6s">-</tt:span>
> ...
> </tt:p>
> 
> It may be the case that <span fontSize="2c">X</span> and <span
> fontSize="30px">X</span> generate glyphs which have identical areas, it may
> not. Even between span 3 and span 5, it's possible the definition of c may
> have changed. It is clear that the X's are not coming from the same logical
> string in the input. This seems inconsistent with your concept of:

[GA] if there is no change in glyph size, i.e., it has the same layout
position and same font size, then this would not constitute a 'significant
difference';
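
One way to picture this test, as a sketch only: compare the glyph areas of two slices on glyph, position, and resolved font size, and leave foreground colour out of the comparison. The GlyphArea record and its fields are assumptions made for illustration (in particular, that sizes such as "2c" and "30px" have already been resolved to a common unit); they are not the spec's area traits.

```python
from typing import NamedTuple


class GlyphArea(NamedTuple):
    glyph: str        # character rendered by this area
    x: float          # layout position (assumed already resolved)
    y: float
    font_size: float  # resolved size in a common unit (assumption)
    color: str        # foreground colour, not part of the comparison


def layout_key(area):
    """Traits assumed to matter for fill and clear operations."""
    return (area.glyph, area.x, area.y, area.font_size)


def significantly_different(areas_a, areas_b):
    """True if the two slices differ in any layout-affecting trait."""
    return [layout_key(a) for a in areas_a] != [layout_key(b) for b in areas_b]
```

Under these assumptions, a blue X and a red X at the same position and size compare as not significantly different, while a change of resolved size does count.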

> "a numbered, logical sequence of content characters that correspond to the
> glyphs associated with
> glyph areas; the logical position is based on the input character string
> order, and not on the resulting (visual) glyph (display) order"

[GA] you notice I entered an issue 63 to formally define 'logical content
position' (for the purpose of interpreting 'prior' and 'subsequent'); i have
done that already in my editor's copy, which i may be able to update today;

i have defined logical content position as a tuple:

[
  active duration begin time,
  active duration end time,
  character information item index
]

where ordering is defined as:

// is p1 prior to p0?
// (ordered by begin time, then by index; end time is not compared)
LCPIsPrior(LCP p0, LCP p1)
{
  if ( p1.begin < p0.begin )
    return true;
  else if ( p1.begin == p0.begin )
    return ( p1.index < p0.index );
  else
    return false;
}

// is p1 subsequent to p0?
LCPIsSubsequent(LCP p0, LCP p1)
{
  if ( p1.begin > p0.begin )
    return true;
  else if ( p1.begin == p0.begin )
    return ( p1.index > p0.index );
  else
    return false;
}

// global ordering of p1 w.r.t. p0:
// return -1 if p1 precedes p0, 1 if p1 follows p0; otherwise, 0
LCPCompare(LCP p0, LCP p1)
{
  if ( LCPIsPrior ( p0,  p1 ) )
    return -1;
  else if ( LCPIsSubsequent ( p0, p1 ) )
    return 1;
  else
    return 0;
}
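
For anyone who wants to try the ordering out, here is a minimal runnable sketch. Python is used only for illustration; the LCP record and lcp_compare are hypothetical names that mirror the pseudocode above, with the same argument order and the same rule that the end time does not take part in the comparison.

```python
from typing import NamedTuple


class LCP(NamedTuple):
    begin: float  # active duration begin time
    end: float    # active duration end time
    index: int    # character information item index


def lcp_compare(p0, p1):
    """Return -1 if p1 precedes p0, 1 if p1 follows p0, otherwise 0."""
    key0 = (p0.begin, p0.index)
    key1 = (p1.begin, p1.index)
    return (key1 > key0) - (key1 < key0)


def logical_order(positions):
    """Sort positions into logical order (begin time, then index)."""
    return sorted(positions, key=lambda p: (p.begin, p.index))
```

With this, lcp_compare(LCP(10, 12, 10), LCP(10, 13, 20)) gives 1, since the second position follows the first, and two positions that differ only in end time compare as 0.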

> We appear to need therefore to layout Tk and then keep that layout in memory
> (in some logical & measurable, as opposed to pixel based, manner) and compute
> the layout for each Ti until we find a measurable difference. -- this is way
> too complicated in my opinion.

[GA] I expect that most user agent implementations will keep the following
state over a document's lifetime if they wish to implement DFXP animation
and/or dynamic flow:

* DOM representation of source document tree, augmented by computed style
properties;

* (optionally) DOM representation of current and previous synchronous
intermediate document instance;

* Area tree representation of formatted content (not all of which needs to
be visible at any given time); this includes line areas and glyph areas with
their 'traits' as derived from computed style properties and the layout
process;

The XFSI DFXP viewer maintains the first and last of the above three states
(but not the intermediate document instances), and uses it for animation and
dynamic flow.

The current DFXP spec does not mandate a particular implementation for a
user agent that supports animation or dynamic flow, it merely defines a
processing model with certain semantics that can be satisfied by many
alternative implementations.

I can imagine an implementation that would not keep the area tree in memory.
Rather, it would have to recompute the area tree for T(k) prior to applying
changes at T(k+1). So it is not a logical necessity to maintain such state.

> Your second example is very instructive, I have tried to elaborate it further
> and it seems the crux of my difficulty lies in the definition of step 4 in
> B.2.
> 
> ttp:dynamicFlow="in(character) out(character) intra(auto) inter(auto)"
> 
> <tt:p begin="10s">
>   <tt:span begin="0s" end="2s">ABCDE GHIJK MNOPQ ST</tt:span>
>   <tt:span begin="1s" end="3s">abcde ghijk mnopq st</tt:span>
> </tt:p>
> 
> It seems to me in this example we have:
> [10,11): ABCDE GHIJK MNOPQ ST                            ; T(k+0)
> 
> Layout:
> ABCDE
> GHIJK
> MNOPQ
> ST

[GA] By "layout" here I guess you mean what I have been describing as "the
set of glyph areas that would be produced by...". If that is so, and if we
ignore the line breaking aspect you show (and also insert a whitespace
between 'T' and 'a'), then I'll accept this as a reasonable result of
layout for the purpose of determining 'significant difference' between
synchronic time slices.

> 
> [11,12): ABCDE GHIJK MNOPQ STabcde ghijk mnopq st        ; T(k+1)
> Layout:
> ABCDE
> GHIJK
> MNOPQ
> STabcde
> ghijk
> mnopq
> st
> 
> [12,13): abcde ghijk mnopq st                            ; T(k+2)
> Layout:
> abcde
> ghijk
> mnopq
> st
> 
> [13,-):                                                  ; T(k+3)
> 
> So we have layout differences
> at T(k+0) T(k+1) T(k+2) T(k+3)
> 
> Thus compute the flow buffers as:
> FB @ T(10s) := { 'A', ..., 'T' }
> FB @ T(11s) := { 'A', ..., 'T', 'a', ..., 't' }
> FB @ T(12s) := { 'a', ..., 't' }
> FB @ T(13s) := { }
> 
> @10s
> CFI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per fill unit)
> CCI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per clear unit)
> 
> FB @ T(10.1s) := { 'B', ..., 'T'}
> 
> Display [10-10.1):
> A
> 
> FB @ T(10.2s) := { 'C', ..., 'T'}
> Display [10.1-10.2):
> B
> ...
> 
> FB @ T(10.9s) := { 'J', ..., 'T'}
> Display [10.8-10.9):
> I
> 
> now we need to change the buffer:
> "such that the position of the content that corresponds with the start of the
> flow buffer does not change with respect to the corresponding position of the
> previous content."
> 
> Which is not very clear, but I now take it is intended to mean that, rather than:
> FB @ T(11s) := { 'A', ..., 'T', 'a', ..., 't' }
>
> We get:
> FB @ T(11s) := { 'K', ..., 'T', 'a', ..., 't' }

[GA] I agree it is not clear enough, but I see you have indeed interpreted
it as I intended.

I have been re-reading this text a number of times over the past few days
attempting to figure out the best way to improve it. What I had done earlier
today is to insert the word 'affected' as follows:

"If condition 2d applies, then instantaneously replace the affected content
of the flow buffer..."

If we elaborate in detail the contents of the "content produced for use in
the region between time T(k) and T(k+1)", then we have the following
(showing the full logical content position of each character information
item):

AVAILABLE CONTENT @ T(10s)

[10,12, 0]: 'A'
[10,12, 1]: 'B'
[10,12, 2]: 'C'
[10,12, 3]: 'D'
[10,12, 4]: 'E'
[10,12, 5]: ' '
[10,12, 6]: 'G'
[10,12, 7]: 'H'
[10,12, 8]: 'I'
[10,12, 9]: 'J'
[10,12,10]: 'K'
[10,12,11]: ' '
[10,12,12]: 'M'
[10,12,13]: 'N'
[10,12,14]: 'O'
[10,12,15]: 'P'
[10,12,16]: 'Q'
[10,12,17]: ' '
[10,12,18]: 'S'
[10,12,19]: 'T'

AVAILABLE CONTENT @ T(11s)

[10,12, 0]: 'A'
[10,12, 1]: 'B'
[10,12, 2]: 'C'
[10,12, 3]: 'D'
[10,12, 4]: 'E'
[10,12, 5]: ' '
[10,12, 6]: 'G'
[10,12, 7]: 'H'
[10,12, 8]: 'I'
[10,12, 9]: 'J'
[10,12,10]: 'K'
[10,12,11]: ' '
[10,12,12]: 'M'
[10,12,13]: 'N'
[10,12,14]: 'O'
[10,12,15]: 'P'
[10,12,16]: 'Q'
[10,12,17]: ' '
[10,12,18]: 'S'
[10,12,19]: 'T'
[10,13,20]: ' '
[11,13,21]: 'a'
[11,13,22]: 'b'
[11,13,23]: 'c'
[11,13,24]: 'd'
[11,13,25]: 'e'
[11,13,26]: ' '
[11,13,27]: 'g'
[11,13,28]: 'h'
[11,13,29]: 'i'
[11,13,30]: 'j'
[11,13,31]: 'k'
[11,13,32]: ' '
[11,13,33]: 'm'
[11,13,34]: 'n'
[11,13,35]: 'o'
[11,13,36]: 'p'
[11,13,37]: 'q'
[11,13,38]: ' '
[11,13,39]: 's'
[11,13,40]: 't'

Now, at T(11s), we have just cleared out 'J' and have filled 'K', so the
visible region contains only 'K', which has a logical content position of
[10,12,10]. Therefore, "the logical content position that corresponds with
the most logically subsequent content presently visible in the region" is
[10,12,10].

Now, we can see that the first 20 items of available content listed above
[at T(10s) and T(11s)] are equal and there is no other significant
difference that affects these first 20 items. Therefore, since the
significant differences appear only after [10,12,10], namely, starting at
[10,13,20], B.2 2d and B.2 4 apply.

At T(11s) we have the following contents in the flow buffer just prior to
effecting B.2 4.

FLOW BUFFER @ T(11s - epsilon), i.e., after clearing 'J', and filling 'K',
but immediately prior to updating FB due to synchronic content change:

[10,12,10]: 'K'
[10,12,11]: ' '
[10,12,12]: 'M'
[10,12,13]: 'N'
[10,12,14]: 'O'
[10,12,15]: 'P'
[10,12,16]: 'Q'
[10,12,17]: ' '
[10,12,18]: 'S'
[10,12,19]: 'T'

So, interpreting B.2 4, we need to replace the 'affected content', and we
need to do this in such a manner that the logical content position of the
content that corresponds with the start of the flow buffer does not change
with respect to the corresponding logical content position of the previous
content.

Now, the "logical content position of the content that corresponds with the
start of the flow buffer" is [10,12,10]. So we need to ensure that the newly
added content does not change this. The only way to do that is to replace
the content of the flow buffer with the new available content [at T(11s)]
starting at the same logical content position, which amounts to using only
the last 31 items of the new available content. That gives us the new flow
buffer content as follows:

FLOW BUFFER @ T(11s), i.e., after updating FB due to synchronic content
change:

[10,12,10]: 'K'
[10,12,11]: ' '
[10,12,12]: 'M'
[10,12,13]: 'N'
[10,12,14]: 'O'
[10,12,15]: 'P'
[10,12,16]: 'Q'
[10,12,17]: ' '
[10,12,18]: 'S'
[10,12,19]: 'T'
[10,13,20]: ' '
[11,13,21]: 'a'
[11,13,22]: 'b'
[11,13,23]: 'c'
[11,13,24]: 'd'
[11,13,25]: 'e'
[11,13,26]: ' '
[11,13,27]: 'g'
[11,13,28]: 'h'
[11,13,29]: 'i'
[11,13,30]: 'j'
[11,13,31]: 'k'
[11,13,32]: ' '
[11,13,33]: 'm'
[11,13,34]: 'n'
[11,13,35]: 'o'
[11,13,36]: 'p'
[11,13,37]: 'q'
[11,13,38]: ' '
[11,13,39]: 's'
[11,13,40]: 't'

This appears to correspond closely to what you derived above, except for the
additional:

[10,13,20]: ' '

which comes from the significant XML whitespace between the two spans.
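
The replacement step walked through above can be sketched as follows. This is one reading of B.2 4 under stated assumptions, not normative text: each item is a ((begin, end, index), character) pair, positions are ordered by begin time and then index, and the helper names build_available and update_flow_buffer are hypothetical.

```python
def build_available(first, gap, second):
    """Build the T(11s) available content for the two-span example."""
    items = [((10, 12, i), ch) for i, ch in enumerate(first)]
    n = len(items)
    items.append(((10, 13, n), gap))  # whitespace between the two spans
    items += [((11, 13, n + 1 + i), ch) for i, ch in enumerate(second)]
    return items


def update_flow_buffer(flow_buffer, new_available):
    """Replace the buffer with the new content from the current head onward."""
    if not flow_buffer:
        return list(new_available)
    head_begin, _, head_index = flow_buffer[0][0]
    return [
        item for item in new_available
        if (item[0][0], item[0][2]) >= (head_begin, head_index)
    ]


available = build_available("ABCDE GHIJK MNOPQ ST", " ", "abcde ghijk mnopq st")
old_buffer = [((10, 12, i), ch)
              for i, ch in enumerate("ABCDE GHIJK MNOPQ ST") if i >= 10]
new_buffer = update_flow_buffer(old_buffer, available)
print(len(new_buffer), new_buffer[0])  # 31 ((10, 12, 10), 'K')
```

The head of the buffer stays at [10,12,10] and the result has the same 31 items listed above.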

> In which case: we should define this more clearly somehow. Essentially what
> you seem to be saying is find the logical fill unit at the head of the current
> flow buffer  ('J' in this case), and discard fill units from the front of the
> incoming FB up to and including the one that 'corresponds to' that fill unit.
> If there is no such corresponding unit, just use the incoming buffer.

[GA] Yes; except it would be 'K' rather than 'J', since the effects of clear
and fill events need to be implemented prior to the change due to synchronic
content change.

> If we can define 'corresponds to' here, then we may be able to dispense with
> most of the problematic wording in B.2
> 
> Now we can recompute
> CFI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per fill unit)
> CCI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per clear unit)

[GA] Close, but we need to account for the additional ' ' item, so it would
be (13s - 10s) / 31 = 0.09677s (approximately).
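
The interval arithmetic in this exchange can be checked with a few lines. This is only a sketch of the division used in the thread (duration divided by number of units); flow_interval is a hypothetical helper, not a function defined by the spec.

```python
def flow_interval(begin, end, units):
    """Seconds per fill (or clear) unit over the interval [begin, end)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return (end - begin) / units


print(flow_interval(10, 12, 20))            # 0.1
print(round(flow_interval(10, 13, 31), 5))  # 0.09677
```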

> When we get to:
> FB @ T(11.9s) := { 'T', 'a', ..., 't' }
> Display [10.8-10.9):
> I

[GA] I think you mean:

Display [11.8-11.9):
S

> FB @ T(12s) := { 'a', ..., 't' }

[GA] With the medial ' ', it would be:

FB @ T(12s) := { ' ', 'a', ..., 't' }

> So I think I'm ok with this as it regards 'logical' fill units, I'm still
> struggling with how this all applies to pixel sized fill units.

[GA] Good. I think we are aligned in our understanding at this point.

Regarding "pixel" as a flowUnit, I would have no objection to removing it. I
think it does not presently have sufficient definition to use effectively.
For example, which edge of a glyph area or line area are we talking about
flowing in/out? The flowTransitionStyle does not help us to resolve this
question, since it applies to flowTransition only, which presently only
includes "barWipe".

> Fill Operation:
> From this section it's unclear whether the flow buffer is a logical (i.e.
> "equivalent (in form) to the content of an fo:block-container element")
> structure such as an XML infoset, or a pixel/area level structure. If the
> former, how do I remove a pixel sized fill unit? If the latter, then I don't
> understand the concept of 'equivalence' you are using in the above quote, and
> I'm not seeing how the fill and clear rates can be different for a static
> region area.

[GA] I agree that the present language in B.4 does not sufficiently
distinguish between logical content (character information items) and
formatted content (line areas consisting of glyph areas). I will add an
issue to clarify this language.

> Regarding your closing remarks:
> "you cannot use explicit timing on span to obtain the same
> results, for the simple reason that you (as an author) don't know the actual
> geometry of the region and don't know the actual font metrics or line layout
> algorithm used by a presentation processor"
> 
> For character and word flow, intermediate timing on span seems to me entirely
> equivalent, one does not need to know anything about the geometry or metrics.
> As the normal flow processing will compute everything.
> 
> For line rate, I agree you would not know where the line breaks might occur,
> so you cannot put explicit spans in place to time the presence of lines unless
> you also take ownership of the line breaks with <br> and <p> elements; however
> that seems entirely reasonable to me under the constrained requirements of
> dfxp.

[GA] I agree that you can implement a subset of the current dynamic flow
processing if you only work with logical content units (fo:character,
fo:inline, fo:block).

> But even if we are to keep the dynamicFlow concept; this is starting to seem
> to me to be orthogonal to the issue of defining a smooth scroll for the
> purposes of overflow='scroll'
> 
> We had the following two requirements:
> 
> "The temporal fill mode parameter permits specifying the granularity of
> temporal filling, e.g., line, word, character. The temporal fill direction
> parameter permits specifying the direction of fill (stacking) independently
> from the writing mode. The temporal block clear mode parameter permits
> specifying whether the containing block is cleared or is automatically
> scrolled, and by what extent, when the block is filled. The temporal fill
> interval parameter permits specifying the interval that a filled area should
> remain static before processing the next fill. The temporal inter-fill
> interval parameter permits specifying the interval between the end of a prior
> fill interval and the start of a subsequent fill interval."
> 
> "The TT AF shall be capable of expressing animated scrolling of content, both
> in block and inline progression directions, with independent expression of
> scroll in, scroll out, and scroll repetition."
> 
> The first of these does seem to me satisfied either by the explicit use of
> time selection of content if we require explicit line breaks, or by the flow
> buffer concept if we adopt dynamicFlow [Although IMO a future profile might be
> better suited to this more complex functionality of reflow occurring between
> synchronic slices].

[GA] We discussed the explicit use of line breaks some time ago in a face to
face meeting, perhaps at Kingswood Warren, and ruled out that approach at
the time. We also ruled out the "knife and fork" approach which you have
been discussing (the use of explicit timing on character or words).

> Animation of a canvas origin property once a region is laid out for a
> synchronic slice seems to me entirely suitable for meeting the second
> requirement of scroll (which incidentally does not seem to be met by the
> existing text as it fails to deal with selection of direction or repetition).
> Since scroll is more normally associated with this kind of functionality than
> messing with pixel level flow units.

[GA] This would require introducing a clipping region. But it would not
support the current model, which allows content to flow out following lines
into previous lines, e.g., given 2 rows of 2 cells, the current model
supports:

- -
- -

- -
- 0

- -
0 1

- 0
1 2

0 1
2 3

1 2
3 -

2 3
- -

3 -
- -

- -
- -

In general, you can't do this with either region origin animation behind a
clip region or with explicit timing (for some flow units, e.g., glyph).
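
To make the cell sequence above concrete, here is a small sketch that regenerates those frames. It assumes one fill and one clear per step and treats the cells in row-major order; flow_frames is a hypothetical name and the model is deliberately simplified.

```python
def flow_frames(values, rows=2, cols=2):
    """Return successive frames as content flows through a rows x cols grid.

    Content enters at the last cell and leaves from the first, moving one
    cell per step; empty cells are shown as '-'.
    """
    cells = rows * cols
    frames = []
    for step in range(cells + len(values) + 1):
        flat = []
        for i in range(cells):
            v = step - cells + i  # which value (if any) sits in cell i
            flat.append(str(values[v]) if 0 <= v < len(values) else "-")
        frames.append(
            [" ".join(flat[r * cols:(r + 1) * cols]) for r in range(rows)]
        )
    return frames


for frame in flow_frames([0, 1, 2, 3]):
    print("\n".join(frame))
    print()
```

Note how the value 0 moves from the second row into the first row between steps, which is the line-to-line flow that a sliding clip region over a fixed layout would not reproduce.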

> The CSS model defines layout onto an infinite canvas, viewed through a
> viewport. The region in Timed text seems to correspond to the idea of a
> viewport:
> 
> "User agents for continuous media generally offer users a viewport (a window
> or other viewing area on the screen) through which users consult a document.
> User agents may change the document's layout when the viewport is resized (see
> the initial containing block). When the viewport is smaller than the
> document's initial containing block, the user agent should offer a scrolling
> mechanism. There is at most one viewport per canvas, but user agents may
> render to more than one canvas (i.e., provide different views of the same
> document)."
> 
> To formally specify this, we need to make that mapping explicit in the spec (a
> good idea anyway IMO) and define a property scrollMode thus:
> 
> scrollMode:
> Values: <digit>+ ((<length> | auto) (<length> | auto) (<duration> | auto)
> (<duration> | auto))?
> Initial:        1 auto auto auto auto
> Applies to:     region
> Inherited:      no
> Percentages: relative to width and height of region
> 
> where the lengths are the distance of scroll in the block and inline direction
> respectively over the duration, auto in this context meaning the difference in
> the dimension of the initial containing block and the dimension of the
> viewport. And the durations are either explicit times or auto, meaning Tk+1 -
> Tk. At Time Tk + i the offset of the position of the initial containing block
> with respect to the viewport is defined by : (length / duration) * i. The
> number of steps of i over the duration is defined by the initial <digit>+.
> 
> I believe the above is neither horribly underspecified, nor incompatible with
> a simplified version of dynamic flow which concentrates solely on the inter
> slice reflow issue.

[GA] I'm opposed to an entire redesign of the feature now, particularly in a
manner that discards previous agreements about the feature's semantics:
including ability to not have to resort to knife and fork timing with
explicit line breaking.

I am prepared to fix the specification language of the feature as already
defined, with minor improvements as needed to ensure the semantics are
sufficiently clear to obtain interoperable implementations.

> -----Original Message-----
> From: Glenn A. Adams [mailto:gadams@xfsi.com]
> Sent: 23 April 2009 5:40 AM
> To: Sean Hayes; Public TTWG List
> Subject: Re: Issues with dynamicFlow
> 
> 
> see inline [GA]
> 
> On 4/20/09 11:19 PM, "Sean Hayes" <Sean.Hayes@microsoft.com> wrote:
> 
>> I believe dynamicFlow and section B are not well specified, and do not fit
>> with the model in that it seems to require state to be maintained between two
>> synchronic slices.
> 
> [GA] In double checking the current algorithm's specification language, I
> agree that it is not currently well defined. There is a small, but
> significant error in the flow timing calculation algorithm when specifying a
> definite rate for flow interval functions. Also, there is need for a few
> minor (but normative) clarifications.
> 
> I agree that it requires keeping state between synchronic document
> construction time boundaries, and that this requirement was assumed in the
> design of this feature. In particular, a region's flow buffer and the
> presented (flowed) content in the region must be maintained across these
> boundaries. This is not unreasonable, since regions are statically
> declared and are distinct, even in circumstances where their temporal extent
> brings them in and out of active use. Therefore, the properties of a region,
> including its flow buffer and current presented (flowed) content can be
> maintained over the whole timeline of the DFXP document instance.
> 
>> Some issues I have:
>> 
>> How is content comparison for the purposes of B2 step 2a-e determined?
> 
> [GA] my intention here is that the content of flow buffer at T(k) be
> considered "different" from the content at T(k+1) if the set of glyph areas
> that would be produced by the content at T(k) are different in any
> significant way from the set of glyph areas that would be produced by the
> content at T(k+1), which includes (exclusively):
> 
> * the addition, removal, or change of a content character so as to produce a
> different set of resulting glyphs;
> * the addition, removal, or change of a whitespace content character so as
> to produce a difference in the position of a glyph area; e.g., inserting or
> removing whitespace that causes differences in line layout or glyph area
> placement;
> * the change of some presentation style so as to produce a difference in the
> position of a glyph area; e.g., a change to the font family, font size, font
> style, font weight, may produce a layout difference;
> 
> changes to other styles, such as background color, foreground color, text
> decoration, etc., do not result in a change of glyph or position of a glyph,
> and, therefore, have no semantic impact on the dynamic flow process;
> 
> to make this clear, I will add normative text that spells out the above
> intentions;
> 
>> If it
>> is logical tree comparison, then how is difference/equality defined for two
>> arbitrary trees which may or may not share a common subtree - what happens
>> for
>> example to style attributes in the tree; or can no style animation happen for
>> scrolling text?
>> 
>> How do you determine which node in tree B should correspond to the node(s)
>> which generated the "most logically prior content presently visible" - maybe
>> bidi processing is required to reorder the due to an element being elided,
>> does the scroll go back to that element?
>> 
>> What indeed is 'before' in this context, and what is 'logically prior'.
>> 
> 
> [GA] by "logical content position" I had in mind a numbered, logical
> sequence of content characters that correspond to the glyphs associated with
> glyph areas; the logical position is based on the input character string
> order, and not on the resulting (visual) glyph (display) order, so this
> sequence is determined prior to bidi processing; I will add normative text
> that makes this definition clear; N.B., the notion of logical order versus
> visual (or presentation) order is discussed in some detail in [1];
> 
> [1] http://www.w3.org/TR/2005/REC-charmod-20050215/#sec-LogicalOrder
> 
>> How do pixel units fit into a logical tree comparison?
> 
> [GA] see my description of "different" content above;
> 
>> Is case 2a even possible? why would there be two synchronic slices for a
>> region that have no differences? If it is possible, what happens for runs of
>> 3
>> or more such slices?
> 
> [GA] sure:
> 
> <tt:p>
> <tt:span begin="0s" end="1s">-</tt:span>
> <tt:span begin="1s" end="2s">X</tt:span>
> <tt:span begin="2s" end="3s">X</tt:span>
> <tt:span begin="3s" end="4s">X</tt:span>
> <tt:span begin="4s" end="5s">X</tt:span>
> <tt:span begin="5s" end="6s">-</tt:span>
> ...
> </tt:p>
> 
> [0,1): -     ; T(k+0)
> [1,2): X     ; T(k+1)
> [2,3): X     ; T(k+2)
> [3,4): X     ; T(k+3)
> [4,5): X     ; T(k+4)
> [5,6): -     ; T(k+5)
> 
> T(k+0) <> T(k+1) : TRUE
> T(k+1) <> T(k+2) : FALSE
> T(k+2) <> T(k+3) : FALSE
> T(k+3) <> T(k+4) : FALSE
> T(k+4) <> T(k+5) : TRUE
> 
>> I'm also not clear what times n where Tk < n < Tk+1 are actually defined
>> (leaving aside the issue that no intermediate states are defined in
>> discontinuous smpte mode, or how they would be rounded to smpte frames).
> 
> [GA] see first paragraph of 9.3.2:
> 
> "For the purposes of performing presentation processing, the active time
> duration of a document instance is divided into a sequence of time
> coordinates where at each time coordinate, some element becomes temporally
> active or inactive..."
> 
> each such time coordinate constitutes a distinct 'k' value;
> 
>> Since the amount of content in the flow buffer seems to alter the fill rate.
>> What happens when this changes the fill interval?
> 
> [GA] the new fill interval applies the next time the fill timer is started;
> 
>> Is it possible for the fill
>> to effectively 'go back in time'?
> 
> [GA] no; the language of B.2 (4) and B.2 (5) is carefully worded so that
> the effects of changes of content in the flow buffer due to differences
> between synchronic intermediate documents are limited to those that occur
> subsequently to the most logically subsequent content currently presented in
> the presentation region undergoing dynamic flow;
> 
> in particular, B.2 (5) says "ignore first part" which refers to "that part
> [of logical content] that wholly precedes the [logical] position that
> corresponds with the most logically subsequent content presently visible in
> the region";
> 
> this language is defined as is to prevent changes that arise from synchronic
> document construction triggering changes in the currently presented (already
> flowed) content;
> 
>> e.g. Let's say we are computing the scroll state between two synchronic slices
>> Tk = 10s and Tk+1 = 11s with a given flow rate of 2.
>> 
>> if at point Tk flow buffer contains 20 fill units; there are 10 logical
>> steps.
>> If at point TK+1 flow buffer contains 30 fill units; there are 15 logical
>> steps.
>> 
>> Say we are at step 5, and we discover that one of the conditions in B4
>> applies, the flow buffer changes, and we went from 5/10 (i.e. 10.5s - half
>> way
>> through the scroll) to 5/15 (10.333s - less than halfway); is the 'next' fill
>> 'tick' 10.4s? if so then since this is less than 10.5 should not the flow
>> buffer revert back to 20 units?. Or perhaps we skip to 10.533 - missing out a
>> couple of steps half way through the second scroll? If on the other hand the
>> flow buffer changed to be 10 units, would we be done?
>> If we go backwards, should previously scrolled content reappear?
> 
> [GA] I see that the current spec does not indicate what timeline to use for
> the purpose of interpreting time expressions in <flowIntervalFunction>,
> namely, a <duration> argument to intra() or inter() flow interval functions;
> it would seem appropriate that this should be defined to be the same
> timeline as used for DFXP content, namely, as determined by ttp:timeBase; in
> the case of SMPTE discontinuous (marker) mode, it probably should be real
> time, since duration is not well defined on the discontinuous SMPTE
> timeline; i will add normative text that calls out these timeline semantics;
> 
> regarding your example above, the flow timers control the times at which
> content is flowed into and cleared out of the region; the durations of these
> timers are based on the computed {fill,clear} intervals at the time they are
> reset, as defined by B.5; since the value of a computed flow interval is
> always non-negative, and since changes in synchronic content in the flow
> buffer only have an effect if they are subsequent to the last presented
> content, then all temporal effects are either at the current time or in the
> future;
> 
> before elaborating the example, however, I note that there is a small but
> semantically significant error in B.3 as pertains to the use of a "definite
> rate", specifically B.3.1 (2) and B.3.2 (2);
> 
> at present, B.3.1 (2) and B.3.2 (2) state:
> 
> "if the value of the {fill,clear} interval parameter is a definite rate,
> then the computed {fill,clear} interval is equal to the number of
> {fill,clear} units currently available in the flow buffer divided by
> specified rate (in {fill,clear} units per second)"
> 
> as you can see from your example, this language would result in a CFI
> (computed fill interval) and CCI (computed clear interval) of 10 seconds at
> T(10s), i.e., 20 {fill,clear} units / 2 {fill,clear} units per second = 10
> seconds; clearly this is not what is desired;
> 
> instead, this language should be modified to read as follows:
> 
> "if the value of the {fill,clear} interval parameter is a definite rate,
> then the computed {fill,clear} interval is equal to the inverse of the
> specified rate (in {fill,clear} units per second)"
> 
> with this correction, and assuming flow interval functions of intra(2) and
> inter(2) per your example, then the following holds:
> 
> FB @ T(10s) := { 1, ..., 20 }
> CFI         := 1 / 2 fill units per second = 0.5s (per fill unit)
> CCI         := 1 / 2 clear units per second = 0.5s (per clear unit)
> 
> FB @ T(11s) := { 3, ..., 32 }
> CFI         := 1 / 2 fill units per second = 0.5s (per fill unit)
> CCI         := 1 / 2 clear units per second = 0.5s (per clear unit)
> 
> now, in your example, you don't say where the changes occur between T(10s)
> and T(11s); so i am assuming (for example's sake) that all changes take the
> form of appendations of content; this means that at T(11s), 12 new flow
> units were added, since 2 flow units have been filled and cleared from the
> original 20 by the end of the interval [10s,11s]; i.e., at T=11s, the second
> of two fill/clear timer interval periods will expire, leaving 18 flow units
> in the flow buffer; at this same time, i.e., T=11s, 12 new flow units would
> be appended to the end of the flow buffer in order to make 30 flow units
> present;
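
The arithmetic above can be sketched as follows. This is only an illustration under the stated appendation assumption; the function names are hypothetical, and the flow buffer is modelled as a plain list of numbered flow units:

```python
def interval_current_wording(units_in_buffer, rate):
    """Computed interval per the current B.3.1 (2) / B.3.2 (2) text."""
    return units_in_buffer / rate


def interval_corrected(rate):
    """Computed interval per the proposed correction: inverse of the rate."""
    return 1.0 / rate


def replay(buffer, interval, start, end, appended):
    """Fill and clear one unit per timer expiry in (start, end], then
    append the new units that arrive at `end`."""
    expiries = int(round((end - start) / interval))
    return buffer[expiries:] + appended


old_cfi = interval_current_wording(20, 2)   # 10 seconds: not what is desired
new_cfi = interval_corrected(2)             # 0.5 seconds per unit

fb_10s = list(range(1, 21))                 # { 1, ..., 20 }
fb_11s = replay(fb_10s, new_cfi, 10.0, 11.0, list(range(21, 33)))
```

With the corrected interval, two units are filled and cleared during [10s,11s], leaving 18, and the 12 appended units bring the buffer to { 3, ..., 32 }.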
> 
> note that specifying a definite rate is identical to specifying the inverse
> value as a definite duration; which argues for a possible simplification to
> the syntax; namely, removing definite rate while leaving definite duration
> (or vice-versa); however, i need to go back and look at earlier notes to
> determine if there was some semantic distinction being made here that I no
> longer recall;
> 
> if your example had instead left inter() and intra() unspecified, or if you
> had specified inter(auto) and intra(auto), then the following would apply,
> that is, if one assumes specific active durations for the content (which you
> only obliquely implied in your example); let's assume that the original
> content was as follows:
> 
> <tt:p begin="10s">
>   <tt:span begin="0s" end="2s">ABCDEFGHIJKLMNOPQRST</tt:span>
>   <tt:span begin="1s" end="3s">abcdefghijklmnopqrst</tt:span>
> </tt:p>
> 
> we also need to further assume a fill and clear unit, which for this
> augmented example, i will assume is 'character'; therefore, the fully
> specified dynamic flow property would read as:
> 
> tts:dynamicFlow="in(character) out(character) intra(auto) inter(auto)"
> 
> with this in mind, the following would hold:
> 
> FB @ T(10s) := { 'A', ..., 'T' }
> CFI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per fill unit)
> CCI         := ( 12s - 10s ) / 20 = 2s / 20 = 0.1s (per clear unit)
> 
> FB @ T(11s) := { 'K', ..., 'T', 'a', ..., 't' }
> CFI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per fill unit)
> CCI         := ( 13s - 10s ) / 30 = 3s / 30 = 0.1s (per clear unit)
> 
> notice here that CFI and CCI do not actually change at T(11s) due to the
> specific active durations of the augmented example; of course, if we
> lengthened the duration of the second span from 2 to 3 seconds, then we
> would have:
> 
> <tt:p begin="10s">
>   <tt:span begin="0s" end="2s">ABCDEFGHIJKLMNOPQRST</tt:span>
>   <tt:span begin="1s" end="4s">abcdefghijklmnopqrst</tt:span>
> </tt:p>
> 
> FB @ T(11s) := { 'K', ..., 'T', 'a', ..., 't' }
> CFI         := ( 14s - 10s ) / 30 = 4s / 30 = 0.133s (per fill unit)
> CCI         := ( 14s - 10s ) / 30 = 4s / 30 = 0.133s (per clear unit)
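
The three 'auto' computations above follow one pattern, sketched here. The function name is hypothetical, and the choice of interval endpoints (the paragraph's begin time and the latest end time of the content contributing to the flow buffer) is inferred from the numbers in the examples rather than quoted from the specification:

```python
def auto_interval(active_begin, active_end, units_in_buffer):
    """Spread the active duration evenly over the units in the flow buffer.

    Returns seconds per fill (or clear) unit.
    """
    return (active_end - active_begin) / units_in_buffer


cfi_10s = auto_interval(10, 12, 20)          # first span only: 2s / 20
cfi_11s = auto_interval(10, 13, 30)          # both spans:      3s / 30
cfi_11s_longer = auto_interval(10, 14, 30)   # second span ends at 4s: 4s / 30
```

The first two values coincide, which is why CFI and CCI do not change at T(11s) in the original example; lengthening the second span yields roughly 0.133s per unit.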
> 
>> There are even more interactions if the clear timing interval is also
>> changing with respect to the fill buffer.
> 
> [GA] this is accounted for above, although i did not show an example where
> content is cleared slower than it is filled; even in such a case, i believe
> the current algorithm (with minor modifications described above) is well
> defined;
> 
>> The more I look at it, the more I feel that this mechanism is not a good fit
>> with the SMIL/Timed text model.
> 
> [GA] since there is no alternative proposed model that is as formal as the
> one currently defined, then i think we have no alternative if we want to
> include this feature *and* want to specify it as formally as possible;
> rather than throw the baby out with the bathwater, let's critique the
> language and the algorithm itself in order to improve it if needed; we saw
> above, that some minor corrections and clarifications are needed; let's
> continue in this path, since no other path is open before us other than
> simply throwing it out or defining something that is horribly
> underspecified;
> 
>> IMO pixel level smooth scrolling aspects of dynamicFlow would be best modeled
>> by animation of a canvas origin property on region (& perhaps introducing
>> SMIL's <animate>). The character/word/line inflow/outflow is adequately
>> modeled already by <span>.
> 
> [GA] i disagree; you cannot use explicit timing on span to obtain the same
> results, for the simple reason that you (as an author) don't know the actual
> geometry of the region and don't know the actual font metrics or line layout
> algorithm used by a presentation processor; in many (most?) cases you will
> end up specifying the geometry of a region in relative terms (percentages)
> of an unknown external root container, i.e., similar to what is used in
> determining safe area of television displays; you will also not be able to
> ensure that font metrics will produce identical line layouts across
> implementations;
> 
> one of the key design requirements of the dynamic flow feature was to have
> the implementation compute the timing based on definite knowledge known only
> at presentation time, and not at authoring time, in order to achieve
> specific or constrained flow rates; you just can't do this with explicit
> timing on TT elements directly;
> 
>> 
>> Sean Hayes
>> Media Accessibility Strategist
>> Accessibility Business Unit
>> Microsoft
>> 
>> -----Original Message-----
>> From: Glenn A. Adams [mailto:gadams@xfsi.com]
>> Sent: 19 April 2009 1:39 PM
>> To: Sean Hayes; Public TTWG List
>> Subject: Re: ISSUE-58 (showBackground animateable): shouBackground should not
>> be animateable [DFXP 1.0]
>> 
>> 
>> i will go ahead and make all style properties animatable; tts:dynamicFlow
>> can be easily handled by defining that a change in its value causes a reset
>> of the fill and clear flow timers; regarding dynamic flow having state
>> across significant synchronic intermediate documents, i believe i have dealt
>> with that previously in Section B.2;
>> 
>> g.
>> 
>> On 4/19/09 5:26 PM, "Sean Hayes" <Sean.Hayes@microsoft.com> wrote:
>> 
>>> Interesting you should say that, I had exactly the same thought last night.
>>> One of the original design principles was that timed text display should be
>>> a function of time, i.e. without state. The reasoning behind having attributes
>>> non-animateable was that it might be too expensive in terms of re-flow etc.,
>>> but if at each moment in time the entire tree is effectively made anew, then
>>> this reasoning seems unsound.
>>> 
>>> So I support the motion.
>>> 
>>> The only one I have some doubts about is dynamicFlow, because it seems to
>>> operate somewhat outside the same timeline, and thus have state across time
>>> ticks. Which is also why I think dynamicFlow should be dropped, or
>>> substantially reworked in order to fit with the above model.
>>> 
>>> Sean Hayes
>>> Media Accessibility Strategist
>>> Accessibility Business Unit
>>> Microsoft
>>> 
>>> -----Original Message-----
>>> From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
>>> Of Glenn A. Adams
>>> Sent: 19 April 2009 7:11 AM
>>> To: Public TTWG List
>>> Subject: Re: ISSUE-58 (showBackground animateable): shouBackground should
>>> not be animateable [DFXP 1.0]
>>> 
>>> i propose we take a different approach: make all styles animatable
>>> 
>>> i note that at present, the following are defined as being non-animatable:
>>> 
>>> tts:direction
>>> tts:displayAlign
>>> tts:dynamicFlow
>>> tts:extent
>>> tts:origin
>>> tts:overflow
>>> tts:unicodeBidi
>>> tts:writingMode
>>> 
>>> in contrast, all of the (remaining) following properties are defined as
>>> animatable:
>>> 
>>> tts:backgroundColor
>>> tts:color
>>> tts:display
>>> tts:fontFamily
>>> tts:fontSize
>>> tts:fontStyle
>>> tts:fontWeight
>>> tts:lineHeight
>>> tts:opacity
>>> tts:padding
>>> tts:showBackground
>>> tts:textAlign
>>> tts:textDecoration
>>> tts:textOutline
>>> tts:visibility
>>> tts:wrapOption
>>> tts:zIndex
>>> 
>>> there doesn't seem to be any principled reason for making any of the above
>>> properties non-animatable; in fact, we have recently assumed that tts:origin
>>> (and perhaps tts:extent) is animatable in order to move a region to a new
>>> location; also, the following seem to be inconsistent on the surface:
>>> 
>>> * tts:textAlign is animatable, but tts:displayAlign is not
>>> * tts:wrapOption is animatable, but tts:overflow is not
>>> 
>>> if one supports animation for one property, then it should be fairly trivial
>>> to support animation on any other property;
>>> 
>>> therefore, i propose we make all the style properties animatable, which will
>>> make usage and authoring less subject to special case exceptions;
>>> 
>>> glenn
>>> 
>>> On 4/18/09 3:03 AM, "Timed Text Working Group Issue Tracker"
>>> <sysbot+tracker@w3.org> wrote:
>>> 
>>>> 
>>>> ISSUE-58 (showBackground animateable): shouBackground should not be
>>>> animateable [DFXP 1.0]
>>>> 
>>>> http://www.w3.org/AudioVideo/TT/tracker/issues/58
>>>> 
>>>> Raised by: Sean Hayes
>>>> On product: DFXP 1.0
>>>> 
>>>> tts:showBackground is listed in the specification as animateable. I can't see
>>>> why this is necessary. Unless we have a use case for this I propose it be set
>>>> to animateable: none
>>>> 
Received on Friday, 24 April 2009 10:21:27 UTC