- From: John Glauert <J.Glauert@sys.uea.ac.uk>
- Date: Fri, 7 Feb 2003 12:10:27 +0000
- To: <public-tt@w3.org>
I doubt if I will be adding much to the debate, which seems to be converging: * Some semantics should be built in. - Language - Is it a representation of the spoken audio? If so, "compression level" from verbatim down to short precis. Also, reading level. (Not necessarily the same thing.) - Or is it a representation of "stage directions" and other sounds? * But the format needs to be extensible. * Where there is a bitmap of text, there should also be (at least the option of) a plain text equivalent. Compression level might be handled by marking sections with a priority so that people could select the rate at which captions/subtitles are thrown at them. I would like something like this to be incorporated in the standard at a high level so we could use it with signing (see below). Equally, stage directions could be tagged as such in a single stream. I would like to add a plea to consider non-text! I realise this is strange given the nature of the group, but the following (at least) fit the pattern of captions/subtitles: * Audio description. - Could be transmitted as audio stream. - Could be transmitted as text or phoneme stream directed to a speech synthesiser. * Signing. - Could be transmitted as video stream. - Could be transmitted as notation stream for rendering in receiver. Such content can be simply represented by XML inserts into the TT format, with the timing and presentation information conforming to the agreed TT standards. (Lets use a small number of SMIL modules.) The language is (for the signing) ASL, BSL, or whatever. John -- Prof. John Glauert Tel: +44 1603 592603 eSIGN Project at UEA Fax: +44 1603 593345 School of Information Systems Home Office: +44 1603 462679 UEA Norwich, NR4 7TJ, UK mailto:J.Glauert@uea.ac.uk http://www.visicast.sys.uea.ac.uk Secretary: Rachel Buckenham mailto:rb@sys.uea.ac.uk Tel: +44 1603 593528
Received on Friday, 7 February 2003 07:30:34 UTC