Speech Synthesis and Recognition of Mathematical and Scientific Content

Math Working Group,

Greetings. In the new Speech API Community Group, I indicated some synthesis and recognition topics pertaining to mathematical and scientific notation (http://lists.w3.org/Archives/Public/public-speech-api/2012Apr/0004.html):

EPUB3-style (http://idpf.org/epub/30/spec/epub30-contentdocs.html#sec-xhtml-ssml-attrib) SSML attributes:

<math ssml:ph="..."> ... </math>

SSML in <annotation-xml>:
 
<math>
<semantics>
...
<annotation-xml encoding="application/ssml+xml"> ... </annotation-xml>
</semantics>
</math>

Some other related topics include referencing audio in <annotation>, interoperability with media fragment URIs:

<math>
<semantics>
...
<annotation encoding="audio/..." src="..." />
</semantics>
</math>

and speech synthesis interoperability with SMIL-based scenarios.

An interesting speech synthesis feature is the automatic synthesis of mathematical and scientific content. The AudioMath project (http://lpf-esi.fe.up.pt/~audiomath/index_en.html) illustrates converting MathML presentation markup into spoken Portuguese (http://lpf-esi.fe.up.pt/~hfilipe/projecto/mathml.html) (http://lpf-esi.fe.up.pt/~audiomath/links_en.html). Semantic content can also serve as input for such processing. Related topics include extending or annotating content dictionaries with linguistic data, or, conversely, extending or annotating linguistic data formats with content dictionary data, for extensibility in that regard.

I also indicated the possibility of extending or more fully utilizing speech recognition grammar techniques (SRGS/SISR) for recognition output scenarios including XML, hypertext, and/or MathML.

I wanted to apprise the Math Working Group of these developments and to welcome discussion, comments, and suggestions about the synthesis and recognition of speech containing mathematical and scientific formulas.

Kind regards,
Adam Sobieski
 
Received on Monday, 16 April 2012 14:15:48 UTC