- From: <Johnb@screen.subtitling.com>
- Date: Tue, 20 Jan 2004 09:55:37 -0000
- To: luke-jr@cox.net
- Cc: public-tt@w3.org
- Message-ID: <11E58A66B922D511AFB600A0244A722E9EE6F2@NTMAIL>
Hi Luke,

Sorry - I should have made it clear that I work for a broadcast subtitle equipment company and that my comments were intended to represent the **basic** requirements for captions/subtitles. Are you from a 'fansub' background?

Comments inline.

> On Monday 19 January 2004 02:56 pm, Johnb@screen.subtitling.com wrote:
> > Text content.
> >
> > Timing accurate to frame/field. (including synchronisation to video frames)
> > Note this is not the same as duration or offset from start - because
> > captioned video material may be discontinuous (e.g. Ad breaks)
> In this case, the video file that goes with the timed text will also be
> paused. Nothing says the video/TT player cannot in reality have 3 minutes or
> so at a certain point.
> Not sure, but I think that SMIL may be what you want to use
> for stuff like this.

No... I definitely don't want to use SMIL for this. SMIL does not work well for what I want to use TT AF for.

Basically you say "the video file that goes with the timed text will also be paused". In a broadcast environment this is not possible. You CANNOT stop the video... E.g. it is coming off a server, through an MPEG encoder and up to a satellite. You just sit on the wire and watch the timecode incrementing... An automation system tells you what the current program is. When you see a timecode that matches a subtitle in your file, then you insert it onto the wire in the appropriate format. If the timecode jumps to one outside the program (by convention programs are timecoded starting at 10:00:00:00 (HH:MM:SS:FF) and adverts are timecoded from 00:00:00:00), then you just go dumb until the program comes back.

> > Basic colours (for text - 16 colour model is sufficient)
> 16 colours per dialogue may be sufficient, but overall it is
> unlikely to be enough. Usually when I subtitle something, I use different
> colour schemes to represent who is speaking.
> Timed text (not just subtitles) also would involve displaying, for example,
> the title of a movie or similar which could very well need complex effects
> including colour fading.

Broadcast subtitling also uses colour for dialogue, e.g. UK Teletext. Teletext is limited to a basic colour set. Speakers are allocated a colour, but often colours are re-used if a character no longer appears in the program. Extra colours would be useful for more creative purposes, but they are not IMHO essential. There is also an issue about how many colours could be easily distinguished from each other. Certain colours do not show well on video, and it is more difficult to distinguish quickly between colours that vary by intensity but not hue.

> > Font selection (Fonts for captions are quite restricted due to resolution
> > and interlace issues)
> CSS, at least, seems to have font classes which could fit a number of
> different fonts (times, fantasy, etc).

I like the CSS font mechanism; it is focussed on the intention of the author.

> > Background colour and box styles <SNIP>
> There are also usages where one would only want a combination of an outline, a
> shadow, and/or a box. I often use a combination of custom outline and text
> colours to indicate who is speaking.

Broadcast subtitling is very restricted in the features available since it is rarely burnt-in. DVD subtitling is much less restricted (though it tends to follow broadcast conventions).

> > Italics selection (e.g. italics are used to represent lyrics, other stress
> > or intonation)
> Same for bold, font sizes, under/over/strikeout lines.

Again in broadcast, changing font size on a line is uncommon (in most cases impossible). Underline, overline or strikeout are never (in my experience) used.

> > Typically underlining and blinking are NOT used.
> But should be fairly simple to support, in case they were to
> ever be desired.

Oh...
don't get me wrong, I'm not saying these shouldn't be part of TTAF, simply that they would not be necessary for most captioning / subtitling.

> > Transparency is used for background. I have never seen reversed text (i.e.
> > background solid with transparent text).
> I have on occasion seen semi-transparent foregrounds and rarely (but it does
> exist) seen completely transparent foregrounds. One such example of
> semitransparent foregrounds can be seen in the openings of most (all?) of the
> .hack//LIMINALITY anime series by Bandai.

The use of transparency for foregrounds seems contrary to one aspect of a caption or subtitle, which is to remain readable :-) I guess it's not so important for the title or credits :-)

> > Positioning can be quite complex. Captions can steal into the safe area...
> > often non speech characters (speaker change marks '-' or music marks '#')
> > will be positioned outside of the 'safe area', this gives more space for the caption text.
> > Captions may be centred, left or right aligned, and this may vary from
> > caption to caption often to match the speakers on-screen position.
> Timed text may also be very position specific such as to overlap a visual
> area, such as might be the case for a movie subtitled in another language.

This is an excellent point; I have occasionally seen this used in broadcast captioning. There is another wrinkle on this one. There are cases where subtitles (no text - just a background) are used for censorship. By careful positioning they are used to cover 'offending' body parts! This allows a program to be broadcast 'in the clear' to, for example, a cable head end, and for a local subtitle insertion to be used to apply the censorship patches.

> > Vertical text has yet more rigorous demands, but is typically produced as
> > graphics that are burnt over video.
> A complete TT format should remove any need for text to be burnt into video.
> If that is not to be the goal, the format would be better suited as only a
> captioning/subtitling format, and not timed text. Timed text is very broad
> and covers much more than just captions.

TT AF format does not remove the need or desire for burnt-in subtitling. How do you retro-fit a TTAF decoder to 100 million TV sets? TTAF is not primarily a distribution format (though it could be). In truth, I suspect TTAF may be too top heavy for captions/subtitles, though I am reserving judgement until I see the Specification draft. But this is a difficult balance to achieve. TTAF needs to be applicable to broadcast captioning / subtitling (surely a major target area), multimedia, and generic text over time (e.g. scrolling text displays, timetables, teletext magazine services etc.)

> > Finally - although a distinction is (rightly) made concerning captions and
> > subtitles, in terms of the system requirements for their display there is very
> > little difference. Captions may use more features for display than subtitles,
> > as captions carry non speech information as text and this
> > may be rendered using colour or styles that would not normally be used for
> > speech related text display.
> Though not perhaps the technical definition of subtitles, they are usually
> considered to include overlaying translations of visual items making them
> much more complex than captions.

Actually that is more of a definition of 'description'. According to SMPTE:

subtitles are translation of dialogue

captions are dialogue and sound effects in the same language (no translation) (i.e. for hearing challenged)

description is text substitute for the visual items (i.e. for visually challenged)

Quite where this leaves a text service that is both a translation and intended for the hearing challenged I don't know... it falls between two SMPTE stools :-) E.g. Opera.

In the UK, Europe and Asia... subtitle is used as a generic term for any text appearing over video.
The term subtitle is then qualified if necessary to make it clear if the service is intended for the hearing challenged. Description services are very rare in broadcast - though description is sometimes included in services primarily intended for the hearing challenged. Far more common (though still infrequent) is the use of Audio description. This is an additional soundtrack that may be selected for use by the visually challenged.

> > One hopes that the TT AF is simple enough to not need modules or
> > optional parts...
> Timed text is hardly simple. There are many effects that can be applied to
> text, such as fading, stretching, and dissolving. To handle any kind of
> effect, there would need to be some part of the format allowing people to
> define any new effects that might be used in the future.

Albert Einstein had it right. "Things should be as simple as possible, but not simpler." I think you are in a way right... core TTAF should be really simple, with an extension mechanism to handle more complex concepts.

> > > This requirement only restricts the element and attribute names of the
> > > TT AF to ASCII, since R100 (use of XML) already ensured that all text
> > > content can be written in ASCII. So why not say explicitly
> > > that this item is about element and attribute names?
> > I read this as meaning that any character can be represented, but by using
> > only the ASCII characters for that representation. E.g. Cyrillic characters may be edited
> > into TTAF by typing in a Unicode codepoint in an ASCII form.
> > However - I don't read this as meaning that this method is the only form of
> > representation for characters not in the ASCII set !
> Not sure I understand this part, but I would hope it will be
> simple enough to simply use UTF-8 for everything?

Yes... using UTF-8 or any other valid encoding would work.
The requirements simply say that if you want to write a TTAF document in ASCII, it should still be possible to include non-ASCII characters in that document (but represented presumably by escaped sequences of ASCII characters).
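
As a rough illustration of the ASCII point above, here is a minimal Python sketch. It assumes the escaping mechanism would be XML numeric character references (which XML already provides); the element name `caption` and the helper `to_ascii_xml_text` are made up for the example and are not taken from any TTAF draft.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml_text(text: str) -> str:
    """Replace each non-ASCII character with an XML numeric character
    reference, e.g. 'Ж' becomes '&#1046;'.

    Note: this does not escape '&' or '<'; it only handles the
    non-ASCII characters, which is all this sketch is about.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


cyrillic = "Привет"

# "caption" is a hypothetical element name used only for illustration.
ascii_doc = "<caption>" + to_ascii_xml_text(cyrillic) + "</caption>"

# The document can be encoded as pure ASCII (this would raise otherwise).
ascii_bytes = ascii_doc.encode("ascii")

# An XML parser turns the references back into the original characters.
recovered = ET.fromstring(ascii_bytes).text
```

The same document could equally be saved as UTF-8 with the Cyrillic characters typed directly; the escaped form is just one way of keeping the file itself within the ASCII set.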
Received on Tuesday, 20 January 2004 04:52:54 UTC