- From: <Johnb@screen.subtitling.com>
- Date: Mon, 19 Jan 2004 14:56:09 -0000
- To: bert@w3.org
- Cc: public-tt@w3.org
- Message-ID: <11E58A66B922D511AFB600A0244A722E9EE6ED@NTMAIL>
Bert, (et al)

I read your comments with some interest, having just re-read the requirements document myself following the recent posting of minutes on the list. I would like to add my observations to yours, in anticipation of feedback from the WG for TT AF on what is intended by certain statements in the requirements document.

Bert Bos wrote: (on 15 January 2004 19:10)
> Based on:
>
> http://www.w3.org/TR/2003/WD-tt-af-1-0-req-20030915
>
> * 1.2 System model
>
> How about a model of the timed text itself? processing, timing,
> structure.
>
> * S0000
>
> What does captioning need, precisely? Color, fonts, font size,
> indents, bullets, images, positioning, timing, font styles,
> underlining, blinking, text shadow, background, transparency,
> sections/groups, repeating blocks, tabulation, right alignment,
> centering, vertical text, real-time authoring...
>
> * S001
>
> Ditto

This is not strictly a comment.... I work in the subtitling / captioning industry (see sig) so am perhaps vaguely qualified to answer this......

Captioning (or subtitling) audio involves:

Text content.

Timing accurate to frame/field (including synchronisation to video frames). Note this is not the same as duration or offset from start, because captioned video material may be discontinuous (e.g. ad breaks).

Basic colours (for text, a 16 colour model is sufficient).

Font selection (fonts for captions are quite restricted due to resolution and interlace issues).

Background colour and box styles (boxing refers to how the background fits around the text). E.g. a 'stripe' is a horizontal background across the entire display, a 'box' is background starting just before the first character on a line and ending just after the last character. Other styles are possible, e.g. 'word box'. The 'leading' between lines of a subtitle may or may not be filled with background.

Italics selection (e.g. italics are used to represent lyrics, other stress or intonation).

Typically underlining and blinking are NOT used.
Outline and shadow effects on glyphs are used, but outline is typically a black border around a coloured (filled) glyph, and serves to accentuate the glyph against a varying video background (i.e. used when background is transparent).

Transparency is used for background. I have never seen reversed text (i.e. background solid with transparent text).

Positioning can be quite complex. Captions can steal into the safe area... often non-speech characters (speaker change marks '-' or music marks '#') will be positioned outside of the 'safe area'; this gives more space for the caption text. Captions may be centred, left or right aligned, and this may vary from caption to caption, often to match the speaker's on-screen position.

Vertical text has yet more rigorous demands, but is typically produced as graphics that are burnt over video.

Finally - although a distinction is (rightly) made concerning captions and subtitles, in terms of the system requirements for their display there is very little difference. Captions may use more features for display than subtitles, as captions carry non-speech information as text and this may be rendered using colour or styles that would not normally be used for speech-related text display.

> * S002
> Probably needs speech generation support, such as CSS audio properties
> or another transformation to SSML.

This is true if TTAF is used as a source for the generation of an audio output.

> * S003
>
> Is this also intended to be usable to do "marquee" in HTML (embedded
> in an OBJECT or IMG element)?

My view is that TTAF is intended as a format for the authorial intent of text over time. How TTAF is handled when it is embedded in other protocols/formats is probably outside the scope of TTAF, but I guess if it is considered a likely scenario it should have some bearing on how TTAF is specified.

> * R100
>
> What is meant by "authored using XSL"? Does that mean the TT AF can be
> the result of a transformation from some other XML format?
> In that case, why insist on XSL, why not Perl, e.g.?

I read this as meaning 'The TTAF specification is written in XML / XSL'.

> * R101 - R103
>
> One hopes that the TT AF is simple enough to not need modules or
> optional parts...

Hear, hear!

> * R106
>
> This seems to say that the TT AF should not contain functions that
> serve no purpose, but it says it in a rather verbose way.
> Unless I misunderstand, this seems rather obvious...
>
> * R110
>
> What is an "idealized" streamable format?
>
> * R112
>
> The task of the TT WG is to define a TT AF (and probably a TT
> format), not to define the editor to write that format with.
> (Unless you make a case why you need to do this, and probably
> update the group's charter as well.)

Hopefully this means that TTAF will be specified such that certain accessibility issues will be mandatory within a TTAF document rather than optional. (See R217/218)

> * R204
>
> This requirement only restricts the element and attribute names of the
> TT AF to ASCII, since R100 (use of XML) already ensured that all text
> content can be written in ASCII. So why not say explicitly
> that this item is about element and attribute names?

I read this as meaning that any character can be represented, but by using only the ASCII characters for that representation. E.g. Cyrillic characters may be edited into TTAF by typing in a Unicode codepoint in an ASCII form. However, I don't read this as meaning that this method is the only form of representation for characters not in the ASCII set!

> * R209
>
> This makes sense, but some motivation would be good. How about
> headings and lists?

> * R217, R218
>
> "Embedded" means "in the same file"? Such as a data URL? Or is it an
> external image intended to be displayed simultaneously, while
> "non-embedded" means "intended as hyperlink"?
>
> If the former, is it also permitted to have the TT AF and the image
> together in a file of a third type, such as a "jar" file?
> If so, is it OK if that third format is a generic archive format,
> or should it have a MIME type that indicates that this is an archive
> used as TT AF (though structurally equal to a generic format)?

> * R219, R220
>
> Not by inventing a new font format, I hope...
>
> Any idea yet whether there will be a one or more required font formats
> (TrueType, SVG) or is it OK when a UA supports at least one font
> format, even if it is the only UA to know that format?

I personally would like to see adoption of CSS font selection. Ultimately the display font used for presentation will depend upon the UA (since all fonts may not be present at the UA). Consequently IMHO it is more important to convey the author's intent wrt the font used than the actual font used. That said, it may be important in certain cases for TTAF instances to carry a font (or at least glyphs) as bitmaps or vectors etc. for specific usages of TTAF, e.g. company logos. (This might use SVG?)

> * R221
>
> The sentence is hard to read or maybe even ambiguous. What does
> "appropriate domain of discourse" mean? Is it a modifier of "text
> content" or of "descriptive information"? Is the idea that
> you can embed a TEI file in the TT AF?

I interpreted this as 'an appropriate meta dictionary for describing what the text is', e.g. stage direction, or dialogue, etc. EIA 708 also contains such descriptive categories.

> * R222
>
> This sounds rather ambitious. I thought TT was a mono-media component,
> to be used, e.g., inside SMIL, not a SMIL-replacement.

I agree. I would personally prefer to drop audio. Audio description as source text for re-speaking (by human or machine) would still be TTAF.

> * R223
>
> What does "non-embedded" mean? Does it mean that there is no link to
> the audio in the TT AF itself, but the link is somehow somewhere else
> (such as in a style sheet)?
> Or, which is maybe the same thing, that the TT AF only expresses
> that there is to be audio of a certain kind (e.g., via high-level
> keywords, such as "alert," "warning" and "error"), without pointing
> to actual sound files?

> * R292, R293
>
> No objection to using XLink, XML Schemas or Relax NG, but why is it a
> *requirement* to use them? Why not just an intention? What breaks if
> you use something else?

> * R300
>
> R301 seems to be a more precise statement of R300. It seems that R300
> can be removed.

> * R301
>
> Why do you need attributes on elements for the TT AF? Attributes seem
> redundant, when you also have external styles and even physically
> embedded styles. There is nothing you can do with attributes that you
> cannot also do with style sheets, but style sheets can do more.
>
> The two reasons I can think of for allowing attributes are (1) ease of
> hand authoring for quick & dirty projects (a rather weak argument) and
> (2) ease of processing, since no memory is required to store style
> sheets (but that doesn't hold here, because style sheets have to be
> supported anyway).
>
> Maybe this was intended as a requirement for the TT DF instead?

I think one aspect of TTAF may be that it is not primarily content for direct display, as for example HTML is. Rather, TTAF is an XML standard for conveying text information, together with the styling and timing that apply to that text, between clients that will manipulate that information. Consequently, there is no requirement that the ordering of any text within the TTAF document matches the ordering (temporal and/or physical) of the subsequent presentation of that text content.

So in TTAF you might have the following....

<doc>
"This is displayed last (in time) at the bottom of the screen."
"This is displayed first (in time) at the middle of the screen."
"This is displayed between (in time) at the foot of the screen."
</doc>

If you only had style sheets it might be more difficult to do this?
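To make the ordering point a little more concrete, here is a rough sketch in Python. It is only an illustration under assumptions: the `text` element and the `begin` and `region` attributes are names invented for this example and are not taken from the requirements document or any TTAF draft. It shows a client reading the text in document order and then deriving the presentation order purely from timing carried as attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element and attribute names (doc, text, begin,
# region) are invented for illustration, not taken from any TTAF draft.
SAMPLE = """
<doc>
  <text begin="00:00:09" region="bottom">This is displayed last (in time) at the bottom of the screen.</text>
  <text begin="00:00:01" region="middle">This is displayed first (in time) at the middle of the screen.</text>
  <text begin="00:00:05" region="foot">This is displayed between (in time) at the foot of the screen.</text>
</doc>
"""


def to_seconds(clock):
    """Convert an hh:mm:ss string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def presentation_order(xml_text):
    """Return (begin_seconds, region, text) tuples sorted by begin time."""
    root = ET.fromstring(xml_text)
    cues = [
        (to_seconds(el.get("begin")), el.get("region"), el.text)
        for el in root.findall("text")
    ]
    # Document order is ignored; only the timing attribute decides the order.
    return sorted(cues, key=lambda cue: cue[0])


if __name__ == "__main__":
    for begin, region, text in presentation_order(SAMPLE):
        print(begin, region, text)
```

The document order stays whatever the author chose; the display order falls out of the per-element timing, which is the kind of binding I have in mind when I talk about style expressed as attributes.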
I kinda think of a style sheet as something that is used to dress up a document for display - without radically altering the basic structure of the document, whereas I see style expressed as attributes as being bound more tightly to the context.... YMMV.

> * R305
>
> It might be good to refer to SSML and the upcoming CSS speech module,
> since the aural properties of CSS2 will be deprecated (in CSS 2.1) and
> there will be a new set of properties in CSS3, compatible with SSML.
> They should be very similar to the old ones, but not exactly the same.

> * R307
>
> Not sure if I interpret this correctly. Is this like scrolling text,
> like a "marquee"?

I was the proponent of the temporal styling concept. I could send you the original examples and comments if you wish - they should be upstream somewhere...

> * R390
>
> See R301. It seems to me that hard-coded styles should be avoided
> where possible and only allowed in final-form formats, like a TT DF.
> (The principle of separation of structure and style is a relative
> principle, but it seems to me that it should hold for the TT AF.)

> * R391
>
> It's a good principle to use existing names and definitions where
> possible, but don't deprive yourself of the possibility to use names
> that fit better with the particular model or syntax that you develop.

> Bert Bos ( W 3 C ) http://www.w3.org/
> http://www.w3.org/people/bos/ W3C/ERCIM
> bert@w3.org 2004 Rt des Lucioles / BP 93
> +33 (0)4 92 38 76 92 06902 Sophia Antipolis Cedex, France

regards

John Birch
Senior Software Engineer
Screen Subtitling Systems
The Old Rectory, Church Lane
Claydon, Ipswich, Suffolk
IP6 OEQ
Tel: +44 1473 831700
Fax:+44 1473 830078
www.screen.subtitling.com
Received on Monday, 19 January 2004 09:59:00 UTC