
The Semantic Interpretation for Speech Recognition language reaches W3C Candidate Recommendation stage

From: James A. Larson <jim@larson-tech.com>
Date: Fri, 13 Jan 2006 08:52:55 -0800
Message-ID: <43C7DAE7.5050802@larson-tech.com>
To: www-voice@w3.org

Semantic Interpretation for Speech Recognition (SISR), a language used 
in conjunction with the Speech Recognition Grammar Specification (SRGS) 
to develop speech applications, has transitioned to "Candidate 
Recommendation" status at the World Wide Web Consortium (W3C).  SISR is 
a procedural language based on ECMAScript that is used to process the 
results returned from a speech recognition system by performing a 
variety of tasks, including the following:

Normalize speech recognition results.  Users may speak any of several 
equivalent words, which SISR instructions convert to a single textual 
representation.  For example, the words "yes," "ja," "of course," 
"affirmative," and "sure" are all converted to the single text string 
"yes."  This enables users to speak any of several alias terms without 
having to memorize specific words and commands.
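As a sketch of how this works (the grammar and rule names here are illustrative, not taken from the specification), an SRGS grammar can attach a SISR tag that assigns the same semantic result no matter which alias was spoken:

```xml
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="affirm" tag-format="semantics/1.0">
  <rule id="affirm" scope="public">
    <one-of>
      <item>yes</item>
      <item>ja</item>
      <item>of course</item>
      <item>affirmative</item>
      <item>sure</item>
    </one-of>
    <!-- Whichever alias the user spoke, the semantic result is "yes" -->
    <tag>out = "yes";</tag>
  </rule>
</grammar>
```

The tag-format value "semantics/1.0" selects SISR's ECMAScript-based tag syntax, and the "out" variable holds the rule's semantic result.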

Process complex utterances.  SISR instructions might implement natural 
language processing algorithms that extract the meaning from a textual 
phrase.  For example, the spoken utterance "Hit him again" could be 
interpreted as "Hit John" based on previous statements indicating that 
the user is talking about John.  SISR enables developers to specify 
simple natural language processing instructions.
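For instance (again an illustrative sketch, with hypothetical rule names), SISR tags can assemble the meaning of a multi-part utterance such as "I would like a large pepperoni pizza" into a structured object, using the "rules" variable to pull in the semantic results of referenced subrules:

```xml
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="order" tag-format="semantics/1.0">
  <rule id="order" scope="public">
    I would like a
    <ruleref uri="#size"/> <tag>out.size = rules.size;</tag>
    <ruleref uri="#topping"/> <tag>out.topping = rules.topping;</tag>
    pizza
  </rule>
  <rule id="size">
    <one-of>
      <item>small <tag>out = "small";</tag></item>
      <item>large <tag>out = "large";</tag></item>
    </one-of>
  </rule>
  <rule id="topping">
    <one-of>
      <item>mushroom <tag>out = "mushroom";</tag></item>
      <item>pepperoni <tag>out = "pepperoni";</tag></item>
    </one-of>
  </rule>
</grammar>
```

The root rule's result is an object such as {size: "large", topping: "pepperoni"}, which downstream dialog logic can consume directly.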

Convert speech recognition results to a standard format.  Information 
from a speech utterance can be converted into a structure appropriate 
for processing by application-specific algorithms.  For example, if the 
application is a Java application, the speech recognition results can be 
converted to a Java structure.  If the application is an XML 
application, the recognition results can be expressed as an XML 
structure.  If spoken input is to be combined with input expressed in 
other modalities such as keyboard or pen, then results can be expressed 
in Extensible MultiModal Annotation (EMMA) notation.
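A minimal EMMA document carrying the recognition result above might look like the following sketch (the "answer" element name and the confidence value are illustrative, not prescribed by the specification):

```xml
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="interp1"
                       emma:medium="acoustic" emma:mode="voice"
                       emma:confidence="0.92"
                       emma:tokens="yes">
    <answer>yes</answer>
  </emma:interpretation>
</emma:emma>
```

Because EMMA annotates each interpretation with its medium and mode, a multimodal application can merge this spoken result with input arriving from a keyboard or pen in the same uniform container.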

The "Candidate Recommendation" stage of W3C's standardization process 
means that development of the specification is complete.  The language 
now enters a testing phase in which implementations of the language are 
tested to verify that the specification can be implemented correctly. 
After this phase is completed, W3C members will vote on adopting SISR as 
a full W3C Recommendation.

The specification of SISR is available at 
http://www.w3.org/TR/semantic-interpretation/

James A. Larson
Co-chair, W3C Voice Browser Working Group
Received on Friday, 13 January 2006 16:53:23 GMT
