- From: Charles Pritchard <chuck@jumis.com>
- Date: Sat, 14 Apr 2012 12:05:10 -0700
- To: Adam Sobieski <adamsobieski@hotmail.com>
- CC: hwennborg@google.com, public-speech-api@w3.org, public-speech-api-contrib@w3.org
- Message-ID: <4F89CA66.7070600@jumis.com>
Yes, the TTS interface does seem a bit lean on features. And the object name "TTS" is short compared to other standards.

I do like the option of allowing SSML, but I don't know that we'll see much support of it in the short term.

I've worked a little bit with these extensions which simply pass on to the OS APIs:
http://code.google.com/chrome/extensions/tts.html
http://code.google.com/chrome/extensions/ttsEngine.html

It'd be nice to be able to specify some kind of synthesis from IPA without fully requiring SSML.

-Charles

On 4/14/2012 8:30 AM, Adam Sobieski wrote:
> Speech API Community Group,
>
> Greetings. Regarding
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html, I wanted
> to also provide some comments and suggestions for discussion:
>
> (1) The interface 'TTS' can be refactored to 'SpeechSynthesis',
> 'SpeechSynthesizer' or 'SpeechSynth'.
>
> (2) The synthesis interface can include, in addition to text string
> input, XML string input and document element input for HTML5 and SSML.
>
> (3) During the synthesis of document element inputs, UA's can
> process substructural elements, as they are synthesized, with options
> resembling http://wam.inrialpes.fr/timesheets/docs/timeAction.html .
>
> (4) For XML string and document element input formats, PLS references,
> CSS speech styling, as well as EPUB3-style SSML-like attributes
> (http://idpf.org/epub/30/spec/epub30-contentdocs.html#sec-xhtml-ssml-attrib)
> can be recognized by synthesis processors.
>
> (5) With regard to <math> elements, <annotation-xml
> encoding="application/ssml+xml"> can be recognized by synthesis
> processors.
>
> (6) <input> types and speech recognition
> (http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html),
> extending HTMLInputElement.
>
> (7) Runtime dynamic grammars.
>
> (8) SRGS/SISR object model.
>
> The synthesis and recognition of speech containing mathematical and
> scientific formulas are interesting topics. In the comments and
> suggestions above, the synthesis of mathematical and scientific
> formulas is broached and also interesting is how grammars can
> be described such that speech recognition transcripts can include XML,
> hypertext, or MathML mathematical and scientific notation.
>
> Kind regards,
>
> Adam Sobieski
>
> > From: hwennborg@google.com
> > Date: Thu, 12 Apr 2012 10:30:03 +0100
> > To: public-speech-api-contrib@w3.org; public-webapps@w3.org; public-xg-htmlspeech@w3.org
> > CC: satish@google.com; gshires@google.com
> > Subject: Speech API: first editor's draft posted
> >
> > In December, Google proposed [1] to public-webapps a Speech JavaScript
> > API that subset supports the majority of the use-cases in the Speech
> > Incubator Group's Final Report. This proposal provides a programmatic
> > API that enables web-pages to synthesize speech output and to use
> > speech recognition as an input for forms, continuous dictation and
> > control.
> >
> > We have now posted in the Speech-API Community Group's repository, a
> > slightly updated proposal [2], the differences include:
> >
> > - Document is now self-contained, rather than having multiple
> > references to the XG Final Report.
> > - Renamed SpeechReco interface to SpeechRecognition
> > - Renamed interfaces and attributes beginning SpeechInput* to
> > SpeechRecognition*
> > - Moved EventTarget to constructor of SpeechRecognition
> > - Clarified that grammars and lang are attributes of SpeechRecognition
> > - Clarified that if index is greater than or equal to length, returns null
> >
> > We welcome discussion and feedback on this editor's draft. Please send
> > your comments to the public-speech-api-contrib@w3.org mailing list.
> >
> > Glen Shires
> > Hans Wennborg
> >
> > [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1696.html
> > [2] http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
Received on Saturday, 14 April 2012 19:05:30 UTC