- From: Dominic Mazzoni <dmazzoni@google.com>
- Date: Mon, 1 Oct 2012 23:07:04 -0700
- To: Glen Shires <gshires@google.com>
- Cc: public-speech-api@w3.org
On Mon, Oct 1, 2012 at 4:14 PM, Glen Shires <gshires@google.com> wrote:

> - Should markerName be changed from "DOMString" to "type ( enumerated string
> ["word", "sentence", "marker"] )".

I'd be fine with an enum, as long as it's clear that we have the option to expand on this in the future - for example, an engine might be able to do a callback for each phoneme or syllable.

> If so, how are named markers returned?
> (We could add a "DOMString namedMarker" parameter or add an "onnamedmarker()
> event".

I'd prefer the DOMString namedMarker over a separate event. I think one event for all progress updates is simpler for both server and client.

> - What should be the format for elapsedTime?

Perhaps this should be analogous to currentTime in an HTMLMediaElement? That'd be the time since speech on this utterance began, in seconds. Double-precision float.

> I propose the following definition:
>
> SpeechSynthesisMarkerCallback parameters
>
> charIndex parameter
> The zero-based character index into the original utterance string of the
> word, sentence or marker about to be spoken.

I'd word this slightly differently. In my experience, some engines support callbacks *before* a word, others support callbacks *after* a word. Practically they're almost the same thing, but not quite - the time is slightly different due to pauses, and the charIndex is either before the first letter or after the last letter in the word.

My suggestion:

The zero-based character index into the original utterance string that most closely approximates the current speaking position of the speech engine. No guarantee is given as to where charIndex will be with respect to word boundaries (such as at the end of the previous word or the beginning of the next word), only that all text before charIndex has already been spoken, and all text after charIndex has not yet been spoken.

What do you think?
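
To make the intent concrete, here is a rough TypeScript sketch of how a client might consume such a callback under the wording above. The `MarkerUpdate` shape and the `splitAtPosition` helper are hypothetical names for illustration only, not an agreed API; the field names just mirror the ones discussed in this thread.

```typescript
// Hypothetical payload for a progress callback. Field names follow this
// thread's discussion and are assumptions, not a finalized interface.
interface MarkerUpdate {
  type: string;          // e.g. "word", "sentence", "marker"; the set may grow later
  namedMarker?: string;  // present only when a named marker is reported
  charIndex: number;     // approximate speaking position in the utterance string
  elapsedTime: number;   // seconds since speech on this utterance began
}

// Split the utterance at charIndex without assuming a word boundary:
// everything before the index is treated as spoken, everything after as pending.
function splitAtPosition(
  text: string,
  update: MarkerUpdate
): { spoken: string; pending: string } {
  // Clamp defensively in case an engine reports an out-of-range index.
  const i = Math.min(Math.max(0, Math.floor(update.charIndex)), text.length);
  return { spoken: text.slice(0, i), pending: text.slice(i) };
}

const text = "Hello world";

// An engine that calls back *after* a word might report index 5 here,
// while one that calls back *before* the next word might report index 6.
const after = splitAtPosition(text, { type: "word", charIndex: 5, elapsedTime: 0.4 });
const before = splitAtPosition(text, { type: "word", charIndex: 6, elapsedTime: 0.45 });
```

Either index gives the client a usable split, which is the point: the client relies only on the spoken/not-yet-spoken guarantee, not on where the index falls relative to the word.
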
Feel free to edit / refine. I'd just like to word it in such a way that we can support a wide variety of existing speech engines and that clients don't make assumptions about the callbacks that won't always be true.

- Dominic
Received on Tuesday, 2 October 2012 06:07:31 UTC