- From: Al Gilman <asgilman@iamdigex.net>
- Date: Mon, 11 Aug 2003 15:10:57 -0400
- To: public-tt@w3.org
1. SSML has an interest in tokenization policy[1]. If TT-AF wants to do selection based on indexing over un-marked tokens the tokenization policies here and there should be aligned. [I think TT-AF wants to avoid this one for one round; I think when the dust settles on SSML 1.0 we will find Voice has, too.] 2. Lines in poetry should be entified, i.e. marked with appropriate container elements, and not separated with separators comparable to html:br. The Scooby theme song example is poetry, IMHO. Outside of poetry and similar domains where the line division is proper content, we do need to have some way to address the post-flow lines in the presentation space. The presentation spaces for [subtitles and captions] are confined enough so that we can't ignore these divisions (as John has said). I don't have a suggestion or precedent for how to approach this 3. Line-break-suppression is still something that may be considered content. [Keep-together is a common theme in the content-borne formatting hints in experience in the device-independent-authoring community. http://lists.w3.org/Archives/Public/www-archive/2003Jun/0036.html ] Again, XML container elements are to be preferred over separators. The infix character has been so abused in HTML practice as to beg to be superceded by markup wrappings for proper names and other compound terms that should resist breaking over lines. At 12:30 PM 2003-08-11, Johnb@screen.subtitling.com wrote: >Glenn, > ><GA> >We would have to extend XPointer to handle ranges that involve words or >lines. At present, ranges demarcate their endpoints using character >offsets or element offsets, e.g., from 2nd to 4th character, from before >element E to after element F, etc. > ><JB> >I'm not sure I see that this is a problem - since we wouldn't be using the >spans in this instance... > ><p id="p1"> >Scooby dooby doo where are you? >we've got some work to do now >Scooby dooby doo, where are you? >we need some help from you now >come on Scooby doo, I see you >pretending we've got a slither >you're not fooling me, cause I can see >the way you shake and shiver ></p> This example is poetry, not streaming prose. So the line divisions are content, not simply presentation. Appropriate content markup could be ><stanza id="s1"> ><line>Scooby dooby doo where are you?</line> ><line>we've got some work to do now</line> ><line>Scooby dooby doo, where are you?</line> ><line>we need some help from you now</line> ><line>come on Scooby doo, I see you</line> ><line>pretending we've got a slither</line> ><line>you're not fooling me, cause I can see</line> ><line>the way you shake and shiver</line> ></stanza> In that case no extensions on XPath are involved. Al [1] In answers to Last Call comments, the SSML Editor has said <quote cite="http://lists.w3.org/Archives/Public/www-voice/2003AprJun/0050.html"> [60] Rejected. This is a tokenization issue. Tokens in SSML are delimited both by white space and by SSML elements. You can write a word as two separate words and it will have a break, you can insert an SSML element, or you can use stress marks externally. For Asian languages with characters without spaces to delimit words, if you insert SSML elements it automatically creates a boundary between words. You can use a similar approach for German, e.g. with "Fussbalweltmeisterschaft". If you insert a <break> in the middle it actually splits the word, but that's probably what you wanted: Fussbal<break>weltmeisterschaft. If you wish to insert prosodic controls, that would be handled better via an external lexicon which can provide stress markers, etc. </quote> On the other hand, I couldn't find any basis for this assertion in the Last Call document itself. <quote cite="http://www.w3.org/TR/speech-synthesis/#S3.3"> The W3C Voice Browser Working Group is developing the Pronunciation Lexicon Markup Language [LEX]. The specification will address the matching process between words and lexicon entries <snip/> </quote> PS: Character offsets may prove problematical unless and until the early normalization provisions of the Character Model for the World Wide Web are generally honored in practice. http://www.w3.org/TR/charmod/#sec-Normalization But it probably makes more sense for the TT-AF using community to work on implementing early uniform normalization per the above than to invent workarounds to cure the effects of its absence. > >so presumably: > ><cue select="#xpointer(p1/range(1.0, 1.31))" use="a2" dur="1"/> ><cue select="#xpointer(p1/range(1.32, 1.61))" use="a2" dur="1"/> ><cue select="#xpointer(p1/range(1.62, 1.91))" use="a2" dur="1"/> > >and so on > > - would select lines from the paragraph. >But agreed - this is not quite as elegant as a word / line / character / >paragraph selector. > ><GA> >Doing a line selector is problematic unless it is based strictly on forced >line breaks in content, which would require the authoring system to >predetermine line breaks and which would not work well if UA or device >could change fonts or layout region. It might be possible to introduce a >form of "pseudo" selector that makes selection based on units that are not >determinable by lexical content. CSS provides such selectors. > ><JB> I'd definitely prefer to avoid hard line breaks - they would tie the >content to a specific layout. Within subtitling / captioning, a line break >has a greater significance than within 'bulk text' - since there may be an >inferred change of speaker etc..... > >I think the issues that arise when the "UA or device could change fonts or >layout region" are within what I am calling the "temporal flow " model. >That is - I see two distinct modes here: > >1) an explicit 'knife and forked' model - where content is without inline >**style** markup - but perhaps uses some form of line markup to support >selection. This mode is 'author driven' i.e. the author is making choices >based on expected delivery interfaces. > >2) a relaxed model - where content is without **any** inline markup. In >this model - the "temporal flow" attributes control how much content is in >the region and what happens if it overflows. This supports device >independence much more flexibly. So this example might produce pop on >subtitles.... > ><style> > p { display : none; color: blue; temporal-overflow: auto; add-interval: > 1s; read-interval: 3s; region-full-clear: all; region-fill-mode: all} ></style> > ><cue select="id(p1)" dur="50s"/> > ><p id="p1"> >Scooby dooby doo where are you? >we've got some work to do now >Scooby dooby doo, where are you? >we need some help from you now >come on Scooby doo, I see you >pretending we've got a slither >you're not fooling me, cause I can see >the way you shake and shiver ></p> > >regards >John Birch > >The views and opinions expressed are the author's own and do not necessarily >reflect the views and opinions of Screen Subtitling Systems Limited.
Received on Monday, 11 August 2003 15:32:45 UTC