- From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Date: Wed, 24 Sep 2014 17:50:23 +0200
- To: public-tt@w3.org
On 24/09/2014 17:30, Nigel Megitt wrote:
> Sorry, forgot the links:
>
> EBU-TT, EBU Tech 3350: https://tech.ebu.ch/docs/tech/tech3350.pdf
> CARRIAGE OF EBU-TT-D IN ISOBMFF, EBU Tech3381: https://tech.ebu.ch/docs/tech/tech3381.pdf
>
> There's no straight link to ISO/IEC14496-12:2012 as you have to buy it from a shop :-(

This is not correct. ISO/IEC 14496-12:2012, together with its first corrigendum and its two amendments (including the one that is useful for the carriage of timed text), is freely available from http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html. Unfortunately, the spec for the carriage of timed text itself (14496-30:2014) is not freely available, but I'm working on making it publicly available.

Cyril

> From: Nigel Megitt <nigel.megitt@bbc.co.uk>
> Date: Wednesday, 24 September 2014 17:24
> To: Glenn Adams <glenn@skynav.com>
> Cc: Timed Text Working Group <public-tt@w3.org>
> Subject: Re: Issue-270 and Issue-335
> Resent-From: <public-tt@w3.org>
> Resent-Date: Wednesday, 24 September 2014 17:24
>
> Glenn Adams <glenn@skynav.com>, Tuesday, 23 September 2014 20:25 wrote:
>
> On Tue, Sep 23, 2014 at 4:15 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
>
> Glenn Adams <glenn@skynav.com>, Monday, 22 September 2014 22:14 wrote:
>
> On Mon, Sep 22, 2014 at 8:38 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
>
> Glenn, Courtney, all,
>
> The edit to TTML2 ascribed to issue-270 and issue-335 (https://dvcs.w3.org/hg/ttml/rev/3cbc109b90bd) is causing me some concern. I have added notes to both those issues, and additionally I have a number of queries to raise for discussion:
>
> _Concerns_
>
> 1. 
it appears to define an addition/subtraction > operation on SMPTE time values even if they're > discontinuous. The processing of these seems to be > undefined, so they should be disallowed, shouldn't > they? > > > I had intended to add material to deal with the > discontinuous smpte mode, but it didn't get into the > edit. Will add. > > > 2. It blurs the layers of interpretation of time > values from documents up into any external > context. For example it opens up the ambiguity > that, when a sequence of TTML documents is wrapped > e.g. in ISOBMFF, there are media time offsets > available both in TTML and in the wrapper, and > authors may be unclear whether they are intended > as independent (additive) offsets or as duplicate > offsets in which one may be considered not for > processing, i.e. metadata. > > > Since TTML doesn't know anything about external > wrapper metadata, it isn't the right place to deal > with such possible ambiguity (e.g., in different > offset values internal and external). The correct > place to deal with this is in the external spec. > > > Since those external specs already exist we should work in > sympathy with them rather than redefining what's already > there and creating confusion. Can we avoid redefining TTML > so that it invalidates external wrappers that should be > independent? > > > It depends on specifics. I need to know the exact text in an > external spec that may intersect with this feature. It may > also require that external spec to add a note to avoid > confusion. In any case, I have not seen a worked out example > of how this proposed feature would invalidate an external spec. 
> > > I suggest reviewing the definitions of MPEG 4, for example ISOBMFF > in ISO/IEC 14496-12:2012, which specifies a range of generic > timing constructs for aligning media in different formats, > including composition time of samples (8.6.1.3), and specific > mappings of the presentation time-line to the media time-line > using the Edit List Box as defined in 8.6.6. This is referenced by > EBU-TT-D in Tech3381 where the only additional constraint required > is to define the behaviour when the contents of a document extend > outside the sample period. > > These constructs effectively define a media timeline, so that the > only requirement on a processor is to map the time expressions in > a document to the timeline defined in the wrapper. No further > offsets are required in the document because they're in the wrapper. > > > > 3. It is actually the opposite proposal to the one > I made in Issue-335: I've added a note there and > re-opened it. > > 4. If clock time is prohibited from using media > offset because the discontinuityOffset can not be > derived in the absence of a date, then I would > certainly be happy to propose the addition of a > date value. A use case for this is when a TTML > document is created as an archive artefact by a > processor that observes some real world timed > events and converts them into TTML. > > > My reason for excluding clock mode is because it > doesn't have a related media object. > > > Ah, right. There may in fact be a related media object, > but the temporal relationship would be indirect, and > mediated by the clock rather than some other time embedded > in the media. > > > Yes, that is a better way of saying what I intended. > > > > 5. It does nothing to address the scenario where > the media time corresponding to the beginning of > the related media object is known at authoring > time, and is non-zero. This media begin time is > distinct from, and possibly earlier than, the > beginning of the contents of the TTML document. 
> > > I don't understand this statement, since this is > precisely what ttp:mediaOffset does: allow the > beginning of the root temporal extent to be offset > either before or after the beginning of the related > media object. > > > ttp:mediaOffset doesn't do that though: it merely allows > for times in the document to be offset prior to > processing. It doesn't extend the root temporal extent > beyond the document's contents. > > > Correct, since it isn't intended to do that. > > > Okay, so it doesn't extend the root temporal extent but it offsets > it. Looking back once more at the wording in the TTML2 draft spec > it appears to specify the period between BEGIN(media) in > TIME(document) and BEGIN(document). And nowhere in the spec is it > required that a processor perform time calculations using it. I'm > struggling to see what the utility of this is – can you explain > the use case more? > > As you've suggested there appears to be a simple relationship > between the mediaBegin that I've proposed and your mediaOffset: > > mediaOffset = BEGIN(document) - mediaBegin > > where mediaBegin is in TIME(document). > > There are a couple of limitations on this: > 1. You can only perform the calculation when the time base permits > it, i.e. excluding SMPTE discontinuous. > 2. The maximum size of mediaOffset is limited to BEGIN(document) > unless you permit mediaBegin to be negative. > > However on the positive side, mediaBegin could be used as the > starting point in your algorithm for mapping SMPTE discontinuous > markers into continuous times. Similarly, if you replace > mediaDuration with mediaEnd, then: > > mediaDuration = mediaEnd – mediaBegin > > or if mediaEnd is not specified or is indefinite then > mediaDuration resolves to indefinite as currently defined. > > And mediaEnd would be usable as the end marker for the mapping of > SMPTE discontinuous markers into continuous times. 
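A minimal Python sketch of the two relationships proposed just above, assuming every value is a number of seconds in TIME(document) on a continuous time base. The function names and the sample values are hypothetical illustrations and are not part of TTML; `None` is used here to model an indefinite mediaEnd.

```python
from typing import Optional


def media_offset(begin_document: float, media_begin: float) -> float:
    """mediaOffset = BEGIN(document) - mediaBegin, both in TIME(document)."""
    return begin_document - media_begin


def media_duration(media_begin: float, media_end: Optional[float]) -> Optional[float]:
    """mediaDuration = mediaEnd - mediaBegin; None stands for 'indefinite'."""
    if media_end is None:
        return None
    return media_end - media_begin


# Illustrative values only: media begins at 36000s, document begins at 36060s,
# and the media ends at 37800s (an assumed figure).
print(media_offset(36060.0, 36000.0))
print(media_duration(36000.0, 37800.0))
print(media_duration(36000.0, None))
```

With these definitions the offset is only computable when subtraction of time values is meaningful, which is why the discontinuous SMPTE case is excluded.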
> > Since mediaBegin and mediaEnd aren't required for general > presentation processing, and appear to have no effect on any > computed time values within the document, it may be appropriate to > make them metadata rather than parameters. There's prior work > here: EBU-TT (Tech3350) and its predecessor binary format STL > both support metadata ebuttm:documentStartOfProgramme as time code. > > > > I'm puzzled by this: in your ISD generation use case, if > the TTML document were untimed but you knew > ttp:mediaOffset then how would you derive the begin time > of the first ISD? > > > SMIL semantics dictates that an unspecified begin time resolve > to 0, for both par and seq parents. ttp:mediaOffset doesn't > have any role in resolving active begin/end for document > elements. It only comes into play when synchronizing document > time coordinates with media time coordinates. > > > I'm unclear from the current spec wording what exactly a > presentation processor should do with the value, even when it does > come into play. > > ttp:mediaBegin would define the begin time of the first > possible ISD without further calculation, unless you also > want to map the times into another time base. > > > But that isn't something I'm trying to do here. Indeed, I'm > saying we can't do that without changing SMIL semantics. "the > begin time of the first possible ISD without further > calculation [in the document time base]" is always 0. > > > Surely we're free to define an extra constraint in the special > case that the author has extra knowledge about the media, to set > the first possible ISD begin time to a later point. It's extremely > similar in principle to permitting: > > <body begin="100s" timeContainer="par" …> > <div begin="5s" …> … </div> > </body> > > to generate an empty ISD from 100s to 105s, which would be an > additional feature compared to now, when if there's no content > flowed into such an ISD then it would not exist. 
> > There are two distinct one-dimensional temporal > coordinate spaces here that are potentially related: > > * document's temporal coordinate space, call this > TIME(document) > o origin is at ORIGIN(document), which is always > ZERO (0) > o has begin time BEGIN(document) > o has explicit or implied duration of DUR(document) > o so root temporal extent is always the open > interval: > + [ 0, DUR(document) ) > * related media object's temporal coordinate space, > call this TIME(media) > o origin is at ORIGIN(media) > o has begin time BEGIN(media) > o has explicit or implied duration of DUR(media) > o so media temporal extent is always the open > interval: > + [ BEGIN(media), BEGIN(media) + DUR(media) ) > > Since TIME(media) may have a different play rate or frame > rate to TIME(document) I think we need to introduce the > concept of evaluation time of this parameter, since > conversion between the document time base and the media > time base may only be achievable by a simple addition at > one instant. > > > I agree that the play rate of TIME(media) and TIME(document) > could be different, a point mentioned in a few notes in the > current spec text: > > > Yes, the current spec text was my reference for the term play rate. > > > *6.2.1 ttp:timeBase* > > *Note:* > > When using a media time base, if that time base is paused or > scaled positively or negatively, i.e., the media play rate is > not unity, then it is expected that the presentation of > associated Timed Text content will be similarly paused, > accelerated, or decelerated, respectively. The means for > controlling an external media time base is outside the scope > of this specification. 
> > *Appendix N Time Expression Semantics* > > *Note:* > > The phrase /play rate/ as used below is intended to model a > (possibly variable) parameter in the document processing > context wherein the rate of playback (or interpretation) of > time may be artificially dilated or narrowed, for example, when > slowing down or speeding up the rate of playback of a related > media object. > Without loss of generality, the following discussion assumes a > fixed play(back) rate. In the case of variable play rates, > appropriate adjustments may need to be made to the resulting > computations. > > *Appendix N.1 Clock Time Base* > > *Note:* > > That is to say, timing is disconnected from (not necessarily > proportional to) media time when the |clock| time base is > used. For example, if the media play rate is zero (0), media > playback is suspended; however, timing coordinates will > continue to advance according to the natural progression of > clock time in direct proportion to the reference clock base. > Furthermore, if the media play rate changes during playback, > presentation timing is not affected. > > However, at present, this text basically states > (informatively) or in the smpte case assumes: > > * for clock time base, RATE(TIME(document)) is fixed as 1X > real time, independently of RATE(TIME(media)) > * for media time base, RATE(TIME(document)) = RATE(TIME(media)) > * for smpte time base, it doesn't say anything special, but > one can infer that the same interpretation applies as for > media time base (in either continuous or discontinuous modes) > > In any case, I don't want the interpretation of the proposed > parameter to depend upon differences in play rates. > > > Agreed. 
> > However the play rate of the media may not be known, so > I've assumed that any time base mapping must be external > to the document, and that what we need to do to ensure > that BEGIN(document) aligns with the right point in the > media's temporal coordinate space is to define a known > fixed datum in the media, in the document's time base, and > require the processor to map the temporal coordinate spaces. > > The intent of ttp:mediaOffset is to express the delta > between BEGIN(document) and BEGIN(media): > > > That's not what I expect from a parameter called > mediaOffset – I'd certainly been reading it as > ORIGIN(document) - ORIGIN(media). > > > The problem with this is that BEGIN(media) - ORIGIN(media) is > unknown and arbitrary, and, further, shouldn't affect > synchronization IMO. It certainly wouldn't affect > synchronization in clock time base, media time base, or > continuous smpte time base. However, in the case of > discontinuous smpte time base, special treatment is needed for > using/interpreting ttp:mediaOffset, the same special treatment > that is required for converting a discontinuous smpte time > base document to an ISD sequence, something I have not yet > documented in the spec, for which the basic approach I am > thinking of is as follows: > > *Convert Discontinuous SMPTE Time Base Document to Media Time > Base Document* > > (1) reset MEDIATIMER to 0; initialize MAPPINGS to empty set; > (2) simultaneously start playback of related media object at > 1X play rate and start MEDIATIMER at 1X real time; > > > As an alternative, start playback of related media object, and > start MEDIATIMER when the mediaBegin marker is observed in the > related media's timecode. This allows for material such as clock, > bars etc that are likely to be present in the media to be ignored > reliably. 
> > (3) when encountering a SMPTE time label in related media > object, record the current value of MEDIATIMER and save the > pair <SMPTE time label, MEDIATIMER value> in MAPPINGS; > (4) if playback is not complete, go to (3); > > > Or if the mediaEnd marker has not been observed and there's more > media remaining, go to (3). > > (5) visit each time expression T in document, performing > following steps: > (6) if T is in MAPPINGS, then rewrite T (in document) to > MAPPINGS.get(T) and continue at (5); > (7) otherwise (T has no mapping), either abort due to mapping > error or use a fallback mapping (TBD), e.g., mapping of > "closest" label that does map; > > * if ttp:mediaOffset > 0, then BEGIN(document) > temporally follows BEGIN(media) > * if ttp:mediaOffset < 0, then BEGIN(document) > temporally precedes BEGIN(media) > > Note that this definition is arbitrary: we could > invert the meaning if we wish. In any case, the > current language decodes as follows: > > Given ttp:mediaOffset = +10s, then <body begin="5s"/> > means that body starts at 15s after BEGIN(media). > > > That seems to be an offset of ORIGIN(document) - ORIGIN(media) > > > Let's work out the example using *your* interpretation of > "offset" where we choose an arbitrary BEGIN(media) of 7s in > TIME(media), */and further assuming that media and document > play rates match/*: > > Given > > (1) BEGIN(media) = 7s in TIME(media) > (2) BEGIN(body) = 5s in TIME(document) > (3) mediaOffset = 10s = *ORIGIN(document) - ORIGIN(media)* > > Yields > > BEGIN(body) in TIME(media) = ORIGIN(media) + mediaOffset + > BEGIN(body) = 0s + 10s + 5s = 15s in TIME(media), which is the > same as *BEGIN(media) + 8s* > > and, now, let's *change BEGIN(media) to another value, say > 13s*, so we end up with 0s + 10s + 5s = 15s in TIME(media), > which is the same as *BEGIN(media) + 2s* > > > Hmmm possibly we're interpreting BEGIN(media) differently. 
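The conversion steps (1) to (7) quoted above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: playback is simulated by a list of (elapsed seconds, SMPTE label) pairs observed at 1X play rate, the optional `media_begin` and `media_end` arguments model the mediaBegin/mediaEnd variant suggested in the interleaved comments, a repeated label keeps its first mapping, and an unmapped label aborts rather than using a fallback. All names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_mappings(
    observed: Iterable[Tuple[float, str]],
    media_begin: Optional[str] = None,
    media_end: Optional[str] = None,
) -> Dict[str, float]:
    """Steps (1)-(4): record a MEDIATIMER value for each SMPTE label seen."""
    mappings: Dict[str, float] = {}
    # Without a begin marker the timer starts with playback (step 2).
    timer_zero: Optional[float] = None if media_begin is not None else 0.0
    for elapsed, label in observed:
        if timer_zero is None:
            if label != media_begin:
                continue  # ignore clock, bars etc. before the begin marker
            timer_zero = elapsed
        # Assumption: the first occurrence of a repeated label wins.
        mappings.setdefault(label, elapsed - timer_zero)
        if media_end is not None and label == media_end:
            break
    return mappings


def rewrite_times(times: List[str], mappings: Dict[str, float]) -> List[float]:
    """Steps (5)-(7): rewrite each time expression, aborting if unmapped."""
    rewritten = []
    for t in times:
        if t not in mappings:
            raise KeyError(f"no mapping for SMPTE label {t!r}")
        rewritten.append(mappings[t])
    return rewritten


# Simulated playback with a timecode jump after 10:00:01:00.
observed = [
    (0.0, "09:59:58:00"),
    (2.0, "10:00:00:00"),
    (3.0, "10:00:01:00"),
    (4.0, "10:07:30:00"),
]
mappings = build_mappings(observed, media_begin="10:00:00:00")
print(rewrite_times(["10:00:01:00", "10:07:30:00"], mappings))
```

In this simulated run the label after the discontinuity maps to 2.0 seconds of media time, one second after the preceding label, regardless of the size of the timecode jump.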
I had > thought you meant the time of start of media playback in > TIME(media) but on thinking it through more I now think you mean > the time of start of media regardless of start of playback. So > when you say 'let's change BEGIN(media) to another value' am I > right in thinking you mean 'consider another piece of media whose > BEGIN(media) is another value' rather than 'consider playing back > the same media from a different start point'? > > If you mean start of playback of the same media, then this is > exactly the right behaviour: you started the media 6s later and > the body therefore began 6s earlier, relatively. But it's a bit > contrived, since the normal workflow is to start with the media > and create the captions/subtitles, and it's common in delivery > standards for different media to start with a similar timecode > (e.g. "10:00:00" as per my example) and for the captions to start > at different times relative to that, dependent on when the > dialogue commences. > > It would be more likely that BEGIN(body) varies for different > media assets even if BEGIN(media) is the same for each of those > media assets. Anyhow… > > > However, using *my* interpretation of "offset" using the same > info, we have: > > Given > > (1) BEGIN(media) = 7s in TIME(media) > (2) BEGIN(body) = 5s in TIME(document) > (3) mediaOffset = 10s = *BEGIN(document) - BEGIN(media)* > > > Just to check I've understood step 3, you mean: > *BEGIN(document)* = BEGIN(body) in TIME(document), and > *BEGIN(media)* = BEGIN(media) in TIME(media)? > > > > Yields > > BEGIN(body) in TIME(media) = BEGIN(media) + mediaOffset + > BEGIN(body) = 7s + 10s + 5s = 22s in TIME(media), which is the > same as *BEGIN(media) + 15s* > > and, now, let's *change BEGIN(media) to another value, say > 13s*, we end up with 13s + 10s + 5s = 28s, which is the same > as *BEGIN(media) + 15s* (still) > > > This "change BEGIN(media) to another value" has given me pause for > thought. 
I can think of three possible meanings: > > 1. If you mean start playback at another time [and call it > BEGIN(media)], this would be highly undesirable: the consequence > of starting playback at a different place in the media is that the > timings of all the captions/subtitles move, and presumably are no > longer aligned. So whereas the first one matched up at authoring > time it is now 6s later relative to the media. > > 2. If you mean 'there's another piece of media with a different > start time in TIME(media)' then okay, you've ended up with the > same value, but what's the advantage of that? Is it important to > have this consistent across different media and documents that > have the same mediaOffset? > > 3. If you mean that a new rendition of the same media is created, > but with a different BEGIN(media) time, and this way the same TTML > document can be used to play back captions for it, without any > change in the TTML document, but with an externally specified > BEGIN(media) that may vary? In this use case it's not meaningful > for BEGIN(document) to be later than BEGIN(media) (comparing in > the same time base) so the offset is always positive or zero (or > negative or zero if you define it the other way around). In TTML1 > it's assumed to be zero. Again, I'm struggling to see the benefit > of permitting other offset values. Plus, there's no guarantee that > in creating a new rendition the same time base has been used – the > rate of playback and eventual duration may have been tweaked for > example, as would happen if a 30fps video were played back at > 25fps without changing the number of frames. There are more > unknowns than BEGIN(media). > > However I can see the benefit of /omitting/ both mediaOffset and > mediaBegin if you have some external knowledge of BEGIN(media) and > you need the same TTML document to play back against renditions > that have been given different values for BEGIN(media), e.g. 
> because they've been striped with different timecode or the > opening sequence has had some more material prepended to it, or > some unwanted material removed. It would be much easier to keep > the same TTML document and include the media begin/offset > information in a wrapper in this scenario. > > > So, given your interpretation, BEGIN(body) in TIME(media) is > dependent on BEGIN(media), while in my interpretation, > BEGIN(body) remains constant with respect to BEGIN(media). > When reasoning about timing as an author, I would clearly want > to use BEGIN(media) and not ORIGIN(media) as the fixed datum. > But this preference is based on using time expressions related > to BEGIN(media) as opposed to ORIGIN(media), about which see > more below. > > > Agreed, having a fixed known document time that will be related to > BEGIN(media) at playback makes sense. > > > that must be evaluated one time only, at BEGIN(document) > or 5s in the document. > > > Not if play rates match or if you use my interpretation (see > more below). > > > This is still problematic, since it's content dependent. > Consider that two videos Va and Vb both have continuous > timecode where the beginning of the programme is at 10:00:00. > > > I interpret this, for Va and Vb, as BEGIN(media) = 36000s in > TIME(media) > > > Yes, if you convert to s that's what you get. > > Va has dialogue and a corresponding TTML document Ta such > that BEGIN(Ta) = 10:01:00 > > > I interpret this as mediaOffset = -36000s and BEGIN(Ta) = > 36060s in TIME(document), which maps to BEGIN(media) + > mediaOffset + BEGIN(body) = 36060s in TIME(media) > > > How about if we define it as: > > mediaOffset = BEGIN(Va) in TIME(Ta) - BEGIN(Ta) = -60s, and > BEGIN(Ta) = 36060s in TIME(Ta). > > Mapping BEGIN(Ta) to TIME(Va) is BEGIN(Va) in TIME(Va) - > mediaOffset, which happens to be 36000 - -60 = 36060s. 
> If BEGIN(Va) in TIME(Va) happened to be reset to, let's say, 0, > then BEGIN(Ta) in TIME(Va) would be 0 - -60 = 60s, which is still > the expected result. > > and Vb has Tb where BEGIN(Tb)=10:05:00. > > > I interpret this as mediaOffset = -36000s and BEGIN(Ta) = > 36300s in TIME(document), which maps to BEGIN(media) + > mediaOffset + BEGIN(body) = 36300s in TIME(media) > > > As previous example, I'd have expected mediaOffset = -300s. > > I would state that the more useful parameter would be > identical in both documents, i.e. mediaBegin="10:00:00", > so that any processor can start the effective clock (e.g. > a frame counter) ticking at the same point, rather than > having to evaluate at the arbitrary point that is > BEGIN(document). > > > *The problem here is that this document is authored such that > time expressions are not related to BEGIN(media), but rather, > related to ORIGIN(media).* TTML, being based on SMIL, > basically assumes that time is expressed in relation to > BEGIN(related media object), and not ORIGIN(related media > object). This follows from how time expressions on children of > a par time container are relative to the begin time of their > parent par container, and not the origin of the time base of > their parent. > > Our different positions on this issue appear to relate to > which mode we think of as being normal. For me, the example > you describe is abnormal from a SMIL timing perspective, > whereas apparently the converse is true for you. > > > I'm not sure I agree here: I think it's more to do with where we > think the mapping between TIME(media) and TIME(document) should > occur – inside the TTML processor or externally. You seem to want > to be able to do it internally whereas I think that it can (or > maybe should) only be done externally, albeit by a media player > that also happens to contain a TTML processor. 
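The Va/Vb arithmetic above can be checked with a short Python sketch. It assumes the definition given in the interleaved reply, mediaOffset = BEGIN(media) in TIME(document) - BEGIN(document), with identical play rates and whole-second timecodes; the helper names are hypothetical.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss timecode to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def media_offset(begin_media_in_doc: int, begin_doc: int) -> int:
    """mediaOffset = BEGIN(media) in TIME(document) - BEGIN(document)."""
    return begin_media_in_doc - begin_doc


def doc_begin_in_media_time(begin_media_in_media: int, offset: int) -> int:
    """Map BEGIN(document) into TIME(media): BEGIN(media) - mediaOffset."""
    return begin_media_in_media - offset


programme_start = hms_to_seconds("10:00:00")
for name, first_caption in (("Ta", "10:01:00"), ("Tb", "10:05:00")):
    offset = media_offset(programme_start, hms_to_seconds(first_caption))
    # Second value: media timecode left at 10:00:00; third: reset to 0.
    print(name, offset,
          doc_begin_in_media_time(programme_start, offset),
          doc_begin_in_media_time(0, offset))
```

Under this definition the offset differs per document (-60s for Ta, -300s for Tb), while a mediaBegin of 10:00:00 would be identical in both.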
> > Since the document time is expressed on a timeline internal to > TTML/SMIL but the media time may use any other timeline, it's hard > to make any assumption about the mapping other than that we > require the media and document play rate to be identical in real > terms. > > In my mental model, time expressions in TTML stay fixed with > respect to BEGIN(media), while in yours, they apparently stay > fixed with respect to ORIGIN(media). In my model, the use of > timeOffset is independent of BEGIN(media) while in your model > it is dependent on BEGIN(media). > > > In TTML N.3: > > "|S = (countedFrames - droppedFrames + (subFrames / subFrameRate)) > / effectiveFrameRate|" > > This doesn't include any reference begin time other than the > origin. So yes, time expressions are related to ORIGIN(document) > in TTML. > > And I would agree that when mapped to TIME(media) any timings > relative to BEGIN(media) need to stay constant. But the value > mappings for achieving this aren't within our control since we > don't in general know about TIME(media). > > So, to translate between our mental models, we have: > > *From BEGIN(media) relative to ORIGIN(media) relative time > expressions:* > > add BEGIN(media) > > *From ORIGIN(media) relative to BEGIN(media) relative time > expressions:* > > subtract BEGIN(media) > > *Now, however, let's look at the situation when play rates > differ*, i.e., RATE(TIME(document)) != RATE(TIME(media)). As > an example, let's say that RATE(TIME(media)) / > RATE(TIME(document)) is 2, i.e., we run media time at twice > the rate of document time. 
So, going back to my earlier numbers: > > Given > > (1) RATE(TIME(media)) = 2, RATE(TIME(document)) = 1, > EPOCH(TIME(media)) = EPOCH(TIME(document)) > (2) BEGIN(media) = 7s in TIME(media), or 3.5s in TIME(real) > (3) BEGIN(body) = 5s in TIME(document), or 5s in TIME(real) > (4) mediaOffset = 10s = *ORIGIN(document) in TIME(real) - > ORIGIN(media) in TIME(real)* > > Yields > > BEGIN(body) in TIME(real) = ORIGIN(media) + mediaOffset + > BEGIN(body) = 0s + 10s + 5s = 15s in TIME(real), which is the > same as *BEGIN(media) + 11.5s in TIME(real)* > > Now, let's *change BEGIN(media) to another value, say 13s*: > > > As above, I don't know what this change of BEGIN(media) is > supposed to signify, and it seems like it might be important. > > > Given > > (1) RATE(TIME(media)) = 2, RATE(TIME(document)) = 1, > EPOCH(TIME(media)) = EPOCH(TIME(document)) > (2) BEGIN(media) = 13s in TIME(media), or 6.5s in TIME(real) > (3) BEGIN(body) = 5s in TIME(document), or 5s in TIME(real) > (4) mediaOffset = 10s = *ORIGIN(document) in TIME(real) - > ORIGIN(media) in TIME(real)* > > Yields > > BEGIN(body) in TIME(real) = ORIGIN(media) + mediaOffset + > BEGIN(body) = 0s + 10s + 5s = 15s in TIME(real), which is the > same as *BEGIN(media) + 8.5s in TIME(real)* > > > I agree it certainly would not be desirable for the mapped begin > time of the document relative to the media not to scale linearly > but be dependent on the size of the time value used, e.g. if the > difference between 90 and 100 were not equal in real terms to the > difference between 1090 and 1100. 
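A small Python sketch of the play-rate arithmetic in this exchange, assuming a document rate of 1X, a media rate of 2X, and a shared epoch, as in the numbers above. It evaluates BEGIN(body) relative to BEGIN(media) in TIME(real) for both readings of mediaOffset: the ORIGIN-based one used in the example above and the BEGIN-based one from the earlier equal-rate example. The function and parameter names are hypothetical.

```python
def body_begin_after_media_begin(
    begin_media: float,      # BEGIN(media) in TIME(media), seconds
    begin_body: float,       # BEGIN(body) in TIME(document), seconds
    media_offset: float,     # mediaOffset in TIME(real), seconds
    media_rate: float = 2.0, # RATE(TIME(media)); document rate is taken as 1
    interpretation: str = "begin",
) -> float:
    """Return BEGIN(body) - BEGIN(media), both expressed in TIME(real)."""
    begin_media_real = begin_media / media_rate
    if interpretation == "origin":
        # mediaOffset = ORIGIN(document) - ORIGIN(media), with ORIGIN(media) = 0
        body_real = 0.0 + media_offset + begin_body
    elif interpretation == "begin":
        # mediaOffset = BEGIN(document) - BEGIN(media)
        body_real = begin_media_real + media_offset + begin_body
    else:
        raise ValueError(f"unknown interpretation: {interpretation!r}")
    return body_real - begin_media_real


for begin_media in (7.0, 13.0):
    for interpretation in ("origin", "begin"):
        delta = body_begin_after_media_begin(
            begin_media, 5.0, 10.0, interpretation=interpretation)
        print(begin_media, interpretation, delta)
```

The ORIGIN-based reading gives a result that moves with BEGIN(media) (11.5s, then 8.5s), whereas the BEGIN-based reading gives 15s in both cases.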
> > > However, using my interpretation of "offset" using the same > info, we have: > > Given > > (1) RATE(TIME(media)) = 2, RATE(TIME(document)) = 1, > EPOCH(TIME(media)) = EPOCH(TIME(document)) > (2) BEGIN(media) = 7s in TIME(media), or 3.5s in TIME(real) > (3) BEGIN(body) = 5s in TIME(document), or 5s in TIME(real) > (4) mediaOffset = 10s = *BEGIN(document) in TIME(real) - > BEGIN(media) in TIME(real)* > > Yields > > BEGIN(body) in TIME(real) = BEGIN(media) + mediaOffset + > BEGIN(body) = 3.5s + 10s + 5s = 18.5s in TIME(real), which is > the same as *BEGIN(media) + 15s in TIME(real)* > > Now, let's change BEGIN(media) to another value, say 13s: > > Given > > (1) RATE(TIME(media)) = 2, RATE(TIME(document)) = 1, > EPOCH(TIME(media)) = EPOCH(TIME(document)) > (2) BEGIN(media) = 13s in TIME(media), or 6.5s in TIME(real) > (3) BEGIN(body) = 5s in TIME(document), or 5s in TIME(real) > (4) mediaOffset = 10s = *BEGIN(document) in TIME(real) - > BEGIN(media) in TIME(real)* > > Yields > > BEGIN(body) in TIME(real) = BEGIN(media) + mediaOffset + > BEGIN(body) = 6.5s + 10s + 5s = 21.5s in TIME(real), which is > the same as *BEGIN(media) + 15s in TIME(real)* > > Notice that by using my interpretation of mediaOffset, > differing play rates do not affect the relationship between > BEGIN(body) and BEGIN(media), which stays constant. > > > My proposed ttp:mediaBegin would have the value "10:00:00" > > > I agree from your example that an expression of 10:00:00 > describes the delta between BEGIN(media) and ORIGIN(media), > but this is only useful in cases where time expressions are > related to ORIGIN(media) and not BEGIN(media). > > > Since I defined it in TIME(document) not in TIME(media) the time > expressions are related to the origin. > > In the mediaOffset formalism I defined, this value, i.e., > BEGIN(media) - ORIGIN(media), is of no utility. 
Namely, if > mediaOffset is specified as I defined it, i.e., with your > example as mediaOffset="-36000s" or mediaOffset="-10h", then > you don't have to worry about either play rate differences or > changes in actual BEGIN(media), since the result is time > expressions always related to BEGIN(media). > > > I think we're trying to achieve the same end result, i.e. that the > text appears at the right time relative to the media despite some > (as yet unstated) set of transformations. My approach also relates > time expressions to BEGIN(media), except in TIME(document) and > therefore unavoidably with reference to ORIGIN(document). Yours > goes further, including a translation into TIME(media), which I > don't believe we should do. > > in these cases, and not mix in the concept of mapping > between the document's temporal coordinates and the > related media's temporal coordinates. > > > The play rate in the document's time base is well defined > as now. It's reasonable to assume that any media playback > device knows when the related media begins and what its > play rate is. > > Or, given ttp:mediaOffset = -5s, then <body > begin="5s"/> means that body starts at BEGIN(media). > > Given this formalism, we don't really care about > BEGIN(media) - ORIGIN(media). > > > Agreed. What we care about is BEGIN(media) in the temporal > coordinate space of the document, or in your useful > terminology, in TIME(document). > > > Now, if you are suggesting an alternative use case > where ORIGIN(document) != 0 in the TIME(document) > coordinate space, then that is something I haven't > considered, and certainly did not intend to address. > Indeed, doing so would be problematic since SMIL > timing semantics assumes that unspecified begin > defaults to 0s, and further, that 0s corresponds to > ORIGIN(document). > > > I'm not suggesting that ORIGIN(document) != 0 in > TIME(document), since that would as you say create a whole > bunch of other problems. 
> > > My response to such a proposed use case would probably > be: we don't support it, you don't need to do it > anyway, so don't do it. > > Note that the above considerations assume that time > base is media, or that time base is smpte continuous > mode, or that time base is smpte discontinuous mode > and that all smpte time events have been converted to > equivalent smpte continuous mode values, e.g., by > playing back a media object in 1X normal play mode and > recording the PTS time that corresponds with each > frame associated with a smpte time label. > > Just for completeness (at the expense of being > repetitious), did you also assume that the media play rate > is identical to the document's play rate, i.e. that the > only difference between TIME(media) and TIME(document) is > an additive offset? > > > See above. > > > > _Proposals_ > > _ > _ > I would propose a resolution to points 1, 2, 3 and > 5 that is to remove mediaOffset and add a > ttp:mediaBegin parameter, expressed in the same > time base as the document's ttp:timeBase > parameter. This also fits better with > ttp:mediaDuration. > > > Hmmm. I'm not inclined to make this change, because > mentally I see mediaOffset as expressing a > difference/delta/offset between two points in two > different one-dimensional coordinate spaces both > representing linear time (at 1X play rate). Calling it > mediaBegin implies in my mind BEGIN(media), i.e., the > delta between BEGIN(media) and ORIGIN(media), and not > the delta between BEGIN(document) and BEGIN(media). > > > If this is just about the name we choose for the parameter > then we're right to choose carefully, but it shouldn't > prevent us from agreeing the semantics. To my mind > mediaBegin does suggest the delta between BEGIN(document) > and BEGIN(media), both in TIME(document). Whereas to me > mediaOffset suggests the delta between ORIGIN(document) in > TIME(document) and ORIGIN(media) in TIME(??? 
- this is not > clear), which if I understand correctly isn't what you > intend. Or if it is what you intend it doesn't seem to be > a complete solution for the problem. > > I would additionally propose allowing dates to be > specified to use in relation to clock times to > resolve point 4, perhaps with a ttp:date > parameter, valid only when ttp:timeBase="clock". > Note that this does not resolve any time > comparison issues caused by documents whose times > cross midnight and wrap back round to a smaller > number of hours. > > > Again, I'm wondering what is the related media object? > To my recollection, ttp:timeBase="clock" was added to > TTML to handle timed text cases that don't have a > related media object. > > > It would be a media object that had also been captured > with reference to a clock. > > > > Are there other related use cases or requirements > not met by these proposals? > > Kind regards, > > Nigel > > > -- Cyril Concolato Multimedia Group / Telecom ParisTech http://concolato.wp.mines-telecom.fr/ @cconcolato
Received on Wednesday, 24 September 2014 15:50:57 UTC