- From: Adam Sobieski <adamsobieski@hotmail.com>
- Date: Tue, 17 Apr 2012 17:14:40 +0000
- To: <chuck@jumis.com>
- CC: <public-speech-api@w3.org>, <public-speech-api-contrib@w3.org>, <hwennborg@google.com>
- Message-ID: <SNT138-W344C3322830A082BFC8862C53F0@phx.gbl>
Speech API Community Group,
With regard to XML output (e.g. hypertext, RDFa, MathML) and speech recognition output, techniques include the use of SISR (http://www.w3.org/TR/semantic-interpretation/#SI7, http://www.w3.org/TR/semantic-interpretation/#SI7.1, http://www.w3.org/TR/semantic-interpretation/#SI7.2, http://www.w3.org/TR/semantic-interpretation/#SI7.3), NLSML (http://www.w3.org/TR/nl-spec/) or EMMA (http://www.w3.org/TR/emma/).
Kind regards,
Adam
Date: Sat, 14 Apr 2012 12:05:10 -0700
From: chuck@jumis.com
To: adamsobieski@hotmail.com
CC: hwennborg@google.com; public-speech-api@w3.org; public-speech-api-contrib@w3.org
Subject: Re: Speech API: first editor's draft posted
Yes, the TTS interface does seem a bit lean on features.
And the object name "TTS" is short compared to other standards.
I do like the option of allowing SSML, but I don't know that we'll
see much support of it in the short term.
I've worked a little bit with these extensions which simply pass on
to the OS APIs:
http://code.google.com/chrome/extensions/tts.html
http://code.google.com/chrome/extensions/ttsEngine.html
It'd be nice to be able to specify some kind of synthesis from IPA
without fully requiring SSML.
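
For context, a minimal sketch of what the SSML route involves for a single IPA pronunciation. The helper name `ipa_to_ssml` and its parameters are hypothetical, not part of the draft or of the Chrome extension APIs; only the SSML `<speak>` and `<phoneme alphabet="ipa" ph="...">` markup is taken from the SSML specification.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def ipa_to_ssml(text: str, ipa: str, lang: str = "en-US") -> str:
    """Hypothetical helper: wrap one word and its IPA transcription in SSML."""
    # Serialize SSML elements without a namespace prefix.
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    # <phoneme> carries the IPA string; its text is the written fallback.
    phoneme = ET.SubElement(
        speak, f"{{{SSML_NS}}}phoneme", {"alphabet": "ipa", "ph": ipa}
    )
    phoneme.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(ipa_to_ssml("tomato", "təˈmeɪtoʊ"))
```

A lighter-weight IPA option in the API would let a page skip this wrapping step, assuming the engine accepts IPA at all.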
-Charles
On 4/14/2012 8:30 AM, Adam Sobieski wrote:
Speech API Community Group,
Greetings. Regarding http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html, I
wanted to also provide some comments and suggestions for
discussion:
(1) The interface 'TTS' can be refactored
to 'SpeechSynthesis', 'SpeechSynthesizer' or 'SpeechSynth'.
(2) The synthesis interface can include, in
addition to text string input, XML string input and document
element input for HTML5 and SSML.
(3) During the synthesis of document element
inputs, UAs can process substructural elements, as they are
synthesized, with options resembling http://wam.inrialpes.fr/timesheets/docs/timeAction.html.
(4) For XML string and document element input
formats, PLS references, CSS speech styling, as well as
EPUB3-style SSML-like attributes (http://idpf.org/epub/30/spec/epub30-contentdocs.html#sec-xhtml-ssml-attrib)
can be recognized by synthesis processors.
(5) With regard to <math> elements,
<annotation-xml encoding="application/ssml+xml"> can be
recognized by synthesis processors.
(6) <input> types and speech recognition (http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html),
extending HTMLInputElement.
(7) Runtime dynamic grammars.
(8) SRGS/SISR object model.
The synthesis and recognition of speech
containing mathematical and scientific formulas are
interesting topics. The comments and suggestions above
broach the synthesis of such formulas. Also interesting
is how grammars can be described so that speech recognition
transcripts can include XML, hypertext, or MathML
mathematical and scientific notation.
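
As an illustration of suggestion (5), here is a minimal sketch of how a synthesis processor might look for an SSML annotation inside a <math> element. The function name `find_ssml_annotation` and the sample document are assumptions made for this example; the draft does not define such a lookup.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SSML_NS = "http://www.w3.org/2001/10/synthesis"
SSML_MIME = "application/ssml+xml"

# Sample input (an assumption for illustration): MathML with a spoken form.
SAMPLE = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <msup><mi>x</mi><mn>2</mn></msup>
    <annotation-xml encoding="application/ssml+xml">
      <speak xmlns="http://www.w3.org/2001/10/synthesis"
             version="1.0" xml:lang="en-US">x squared</speak>
    </annotation-xml>
  </semantics>
</math>
"""


def find_ssml_annotation(math_xml: str) -> Optional[str]:
    """Return the spoken text of the first SSML annotation, or None."""
    root = ET.fromstring(math_xml)
    for node in root.iter(f"{{{MATHML_NS}}}annotation-xml"):
        if node.get("encoding") != SSML_MIME:
            continue  # some other annotation, e.g. Content MathML
        speak = node.find(f"{{{SSML_NS}}}speak")
        if speak is not None:
            # Flatten the SSML to plain text; a real engine would
            # hand the <speak> element to the synthesizer instead.
            return "".join(speak.itertext()).strip()
    return None


if __name__ == "__main__":
    print(find_ssml_annotation(SAMPLE))
```

When no annotation is present, the function returns None and a processor would presumably fall back to its own reading of the MathML.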
Kind regards,
Adam Sobieski
> From: hwennborg@google.com
> Date: Thu, 12 Apr 2012 10:30:03 +0100
> To: public-speech-api-contrib@w3.org; public-webapps@w3.org; public-xg-htmlspeech@w3.org
> CC: satish@google.com; gshires@google.com
> Subject: Speech API: first editor's draft posted
>
> In December, Google proposed [1] to public-webapps a Speech
> JavaScript API subset that supports the majority of the use-cases
> in the Speech Incubator Group's Final Report. This proposal
> provides a programmatic API that enables web-pages to synthesize
> speech output and to use speech recognition as an input for forms,
> continuous dictation and control.
>
> We have now posted a slightly updated proposal [2] in the
> Speech-API Community Group's repository. The differences include:
>
> - Document is now self-contained, rather than having multiple
>   references to the XG Final Report.
> - Renamed SpeechReco interface to SpeechRecognition
> - Renamed interfaces and attributes beginning SpeechInput* to
>   SpeechRecognition*
> - Moved EventTarget to constructor of SpeechRecognition
> - Clarified that grammars and lang are attributes of
>   SpeechRecognition
> - Clarified that if index is greater than or equal to length,
>   returns null
>
> We welcome discussion and feedback on this editor's draft. Please
> send your comments to the public-speech-api-contrib@w3.org mailing
> list.
>
> Glen Shires
> Hans Wennborg
>
> [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1696.html
> [2] http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>
Received on Tuesday, 17 April 2012 17:15:16 UTC