
Re: Speech API: first editor's draft posted

From: Charles Pritchard <chuck@jumis.com>
Date: Sat, 14 Apr 2012 12:05:10 -0700
Message-ID: <4F89CA66.7070600@jumis.com>
To: Adam Sobieski <adamsobieski@hotmail.com>
CC: hwennborg@google.com, public-speech-api@w3.org, public-speech-api-contrib@w3.org

Yes, the TTS interface does seem a bit lean on features.
And the object name "TTS" is short compared to other standards.

I do like the option of allowing SSML, but I don't know that we'll see 
much support for it in the short term.

I've worked a little bit with these Chrome extension APIs, which simply 
pass calls through to the OS APIs:
http://code.google.com/chrome/extensions/tts.html
http://code.google.com/chrome/extensions/ttsEngine.html

It'd be nice to be able to specify some kind of synthesis from IPA 
without fully requiring SSML.
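
For comparison, here is a rough sketch of what the SSML route looks like today: the IPA string has to be wrapped in a <phoneme> element inside a full <speak> document before a synthesizer can use it. The helper name ipa_to_ssml and its parameters are hypothetical, made up for illustration; they are not part of the draft or of any existing API. Only the <speak> and <phoneme alphabet="ipa" ph="..."> markup comes from SSML itself.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the SSML recommendation for the <speak> root element.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def ipa_to_ssml(text: str, ipa: str, lang: str = "en-US") -> str:
    """Hypothetical helper: wrap `text` in an SSML <phoneme> element.

    `text` is the fallback orthography, `ipa` is the IPA pronunciation,
    and `lang` is the xml:lang value (an assumed default of "en-US").
    """
    # quoteattr() escapes the value and adds the surrounding quotes.
    phoneme = '<phoneme alphabet="ipa" ph={ph}>{body}</phoneme>'.format(
        ph=quoteattr(ipa), body=escape(text)
    )
    return '<speak version="1.0" xmlns="{ns}" xml:lang={lang}>{p}</speak>'.format(
        ns=SSML_NS, lang=quoteattr(lang), p=phoneme
    )


if __name__ == "__main__":
    print(ipa_to_ssml("tomato", "təˈmeɪtoʊ"))
```

A lighter-weight option would let a page hand over just the (text, IPA) pair and skip the wrapper document, which is the kind of thing I have in mind.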

-Charles

On 4/14/2012 8:30 AM, Adam Sobieski wrote:
> Speech API Community Group,
>
> Greetings.  Regarding 
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html, I wanted 
> to also provide some comments and suggestions for discussion:
> (1) The interface 'TTS' can be refactored to 'SpeechSynthesis', 
> 'SpeechSynthesizer' or 'SpeechSynth'.
> (2) The synthesis interface can include, in addition to text string 
> input, XML string input and document element input for HTML5 and SSML.
> (3) During the synthesis of document element inputs, UAs can 
> process substructural elements, as they are synthesized, with options 
> resembling http://wam.inrialpes.fr/timesheets/docs/timeAction.html .
> (4) For XML string and document element input formats, PLS references, 
> CSS speech styling, as well as EPUB3-style SSML-like attributes 
> (http://idpf.org/epub/30/spec/epub30-contentdocs.html#sec-xhtml-ssml-attrib) 
> can be recognized by synthesis processors.
> (5) With regard to <math> elements, <annotation-xml 
> encoding="application/ssml+xml"> can be recognized by synthesis 
> processors.
> (6) <input> types and speech recognition 
> (http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html), 
> extending HTMLInputElement.
> (7) Runtime dynamic grammars.
> (8) SRGS/SISR object model.
> The synthesis and recognition of speech containing mathematical and 
> scientific formulas are interesting topics.  The comments and 
> suggestions above broach the synthesis of mathematical and scientific 
> formulas.  Also interesting is how grammars can be described so that 
> speech recognition transcripts can include XML, hypertext, or MathML 
> mathematical and scientific notation.
>
> Kind regards,
>
> Adam Sobieski
>
> > From: hwennborg@google.com
> > Date: Thu, 12 Apr 2012 10:30:03 +0100
> > To: public-speech-api-contrib@w3.org; public-webapps@w3.org;
> > public-xg-htmlspeech@w3.org
> > CC: satish@google.com; gshires@google.com
> > Subject: Speech API: first editor's draft posted
> >
> > In December, Google proposed [1] to public-webapps a Speech JavaScript
> > API subset that supports the majority of the use-cases in the Speech
> > Incubator Group's Final Report. This proposal provides a programmatic
> > API that enables web-pages to synthesize speech output and to use
> > speech recognition as an input for forms, continuous dictation and
> > control.
> >
> > We have now posted a slightly updated proposal [2] in the Speech-API
> > Community Group's repository. The differences include:
> >
> > - Document is now self-contained, rather than having multiple
> > references to the XG Final Report.
> > - Renamed SpeechReco interface to SpeechRecognition
> > - Renamed interfaces and attributes beginning SpeechInput* to
> > SpeechRecognition*
> > - Moved EventTarget to constructor of SpeechRecognition
> > - Clarified that grammars and lang are attributes of SpeechRecognition
> > - Clarified that if index is greater than or equal to length,
> > returns null
> >
> > We welcome discussion and feedback on this editor's draft. Please send
> > your comments to the public-speech-api-contrib@w3.org mailing list.
> >
> > Glen Shires
> > Hans Wennborg
> >
> > [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1696.html
> > [2] http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> >
Received on Saturday, 14 April 2012 19:05:31 GMT
