RE: Annotation of Timed Text from Johnb@screen.subtitling.com on 2003-06-30 (public-tt@w3.org from June 2003)

From: <Johnb@screen.subtitling.com>
Date: Mon, 30 Jun 2003 10:19:10 +0100
To: singer@apple.com
Cc: j.hanna@snet.net, glenn@xfsi.com, public-tt@w3.org
Message-ID: <11E58A66B922D511AFB600A0244A722E094046@NTMAIL>
Hi Dave, Glenn, John et al,

Been thinking about this issue for a while..... Thought I'd drop my oar in
too!

For me, metadata has a fairly clear and limited meaning. It is information
that allows a user (or user agent) to make more informed choices about the
'content' of an element. Thus metadata is xml:lang, it is
class="SoundEffect", it is class="fred". An annotation might be "The
character fred was played by the actor Joe Bloggs, who replaced the original
cast member David Smith."

From the original discussion about storyboards and 'video/audio research',
it would seem to me that the annotations discussed are themselves examples
of timed text. Granted they do not fall into the category of dialog, or
traditional concepts of description - but they are primarily text content
fragments (accessibility issues aside) that have a greater validity during a
particular time epoch than at another. I see no reason why a TT document may
not support multiple 'timelines', each of which has text content that is
contemporary, but of a different character/type. This allows for different
content fragments (or spans) to co-exist and overlap without creating very
complex timing element hierarchies.

i.e. 

In the scenarios you mention it probably more useful (easier) to implement
the document with divisions between annotation and dialog and description at
a high level in the document tree:

Note - in the following the element/attribute names are completely arbitrary
and not intended to relate to any existing 'standard' element.

<root>
   <!-- period to encompass all material -->
   <period begin="0">
	<stream role="annotation">
		<period begin="xyz" end="xyz">
			<text>
			....
		</period>
		<period begin="xyz" end="xyz">
			<text>
		...
	</stream>
	<stream role="dialog">
		<period begin="abc" end="wer">
			<text>
			....
		</period>
		...
	</stream>
	<stream role="description">
		<period begin="123" end="345">
			<text>
			....
		</period>
		...
	</stream>
   </period>
</root>

This contrasts with a possibly more traditional perspective:

<root>
   <period begin="xyz" end="xyz">
	<text class="annotation">
	   ....
	</text>
	<text class="dialog">
	   ....
	</text>
	...   
   </period>
   <period begin="xyz" end="xyz">
	<text class="description">
	   ....
	</text>
   </period>
   <period begin="xyz" end="xyz">
	<text class="dialog">
	   ....
	</text>
   </period>
</root>

I suspect that the TT format will be used in many different ways, and
consequently I hope that it is flexible enough to allow adopters of the
format to structure their documents according to their requirements.

> but given that there wouldn't be reverse pointers, would it be
> possible to display that external annotation material simultaneously in a
timed manner?

If TT format supports XPointer as a means of identifying text content, then
both the source document and the annotation document may be TT format. This
neatly solves the reverse pointer issue, since both documents may stand
alone and if desired reference one another. If multiple documents are
preferred then I personally favour a separate annotation document that
'looks in' to the source document cf a source document that references out -
YMMV. The annotation document can include Xpointer references to the source
document - becoming a <super> document, in the sense that it may contain
both the source material (albeit by reference) and the annotation.

regards John Birch.


> -----Original Message-----
> From: Dave Singer [mailto:singer@apple.com]
> Sent: 27 June 2003 18:10
> To: John Hanna; Glenn A. Adams; public-tt@w3.org
> Subject: Re: Annotation of Timed Text
> 
> 
> 
> At 11:20 -0400 6/27/03, John Hanna wrote:
> >It seems that "annotation" and "meta-data" have quite a 
> conceptual overlap,
> >but are not the same in current usage.
> >
> >Annotations are included in XML Schema as full elements. TT 
> Requirements
> >include meta-data elements that could be used for 
> annotation, but it seems
> >that some annotations might not fit that model, for example, if an
> >annotation didn't reference a single element, but needed to 
> say "element 1
> >doesn't agree with element 7". That could be shifted to an 
> "external" form
> >of annotation document that just points to the multiple TT 
> document contents
> >elements, but given that there wouldn't be reverse pointers, 
> would it be
> >possible to display that external annotation material 
> simultaneously in a
> >timed manner?
> 
> I guess I don't see why the meta-data structure shouldn't be able to 
> say things about multiple elements either, so I'm not sure I'm seeing 
> the semantic distinction still.  Can you say more?  I'm probably 
> being dense.
> 
> >If there was some sort of dynamic formation of a combined
> >document for timed display, wouldn't the structure need to support
> >distinctions between content and annotation anyway?
> >
> >I note that MPEG-7 is mentioned in an editorial comment in 
> TT Requirements.
> >Doesn't MPEG-7 do much of what Timed Text will do? It seems 
> to address
> >annotation also. Possibly there should be more connection 
> between the two.
> 
> I think the purpose and function of MPEG-7 was intended to be very 
> different from timed text, though I do not doubt that there is 
> overlap -- and that, for example, an mpeg-7 timed meta-data document 
> which contains caption information might be convertable into a 
> timed-text document and styled for display.  But the functions are 
> quite different.
> 
> >
> >Regards,
> >John Hanna
> >
> >----- Original Message -----
> >From: "Dave Singer" <singer@apple.com>
> >Sent: Thursday, June 26, 2003 12:13 PM
> >Subject: Re: Annotation of Timed Text
> >
> >
> >>  At 12:06 -0400 6/25/03, John Hanna wrote:
> >>  >On Tuesday, June 24, 2003 4:02 PM Glenn A. Adams said...
> >>  >>  We haven't discussed annotating. Could you give some concrete
> >>  >>  use case scenarios?
> >>
> >>  Isn't "annotation" another word for "meta-data" or do you 
> see these
> >>  as distinct?  It seems that many domains may need domain-specific
> >>  data associated with the timed-text.
> >>
> >>  >
> >>  >Consider the use of digital video (including audio) for 
> research and
> >>  >education. A video is observed and interpreted from a particular
> >>  >perspective, and a Timed Text record made of the 
> interpretation. The
> >record
> >>  >could be a complex structure, akin to descriptive 
> captioning, with
> >>  >transcriptions of dialog, descriptions of movement and 
> events, etc. This
> >>  >would be displayed in a browser or media viewer panel 
> synchronized with
> >the
> >>  >video. For research, review comments about the 
> interpretation would
> >annotate
> >>  >the record and be presented in the timed display. For Education,
> >>  >supplementary hints about how to view the scenes and pick out the
> >referenced
> >>  >aspects, or even instructor notes, would annotate the record and
> >optionally
> >>  >be presented in the timed display. Of course the record 
> and annotations
> >>  >could also be viewed in a non-timed manner. To emphasize 
> the video
> >aspects
> >>  >associated with the timed text, an overlay of tracking 
> highlights could
> >mark
> >>  >video objects and be keyed (e.g., by color) to the text.
> >>  >
> >>  >Another example would be for an animated storyboard or 
> movie, with the
> >>  >screenplay presented in timed text (although there may not be a
> >screenplay
> >>  >DTD/schema yet). Annotations would pertain to hints or 
> comments for all
> >the
> >>  >different production roles, or for cinema studies.
> >>  >
> >>  >From a human factors perspective, having labels and 
> descriptions closer
> >and
> >>  >more directly associated with the video objects would be 
> best. Possibly
> >the
> >>  >timed text dialog should be displayable in a 
> talk-balloon that moves with
> >>  >the speaker. Is that within the scope of ways timed text can be
> >presented,
> >>  >or is it moving into animation?
> 
> 
> -- 
> David Singer
> Apple Computer/QuickTime
>
Received on Monday, 30 June 2003 05:12:36 UTC