Speech-enabled form filling example

Hi Everyone,

 

On the last call I mentioned that we have an example that shows
speech-enabled form filling.  This is a 60-second Flash demo at
http://www.everspeech.com/demos/FormFillingAnnotated/. 

 

A few quick notes:

* This is only one of many possible approaches for interacting with
  a speech-enabled form.

* This demo addresses a commercial rather than a consumer
  application with the focus on direct, rapid data entry with high accuracy.

* The user's hands are busy operating equipment while entering data.

* This demo is approaching 10 years old and has gone through several
  implementation approaches.

* This started with a JavaScript API and has morphed into a fully
  declarative tag-based API.

* Tags are used for the <input type="text"> boxes to associate the
  appropriate grammars.

* Grammars are activated/deactivated by the User Agent as focus
  changes.

* Most other elements require no modification, but it is possible to
  influence vocabulary and pronunciation as needed.

* I apologize in advance for this demo if you're a vegetarian!
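
As a rough illustration of the focus-driven activation mentioned in the
notes above, here is a minimal JavaScript sketch of what a User Agent might
do internally.  It is not the demo's code: SpeechContext, associate, focus,
and blur are hypothetical names, and the model assumes at most one grammar
is active at a time.

```javascript
// Hypothetical model of a user agent switching grammars on focus changes.
// None of these names come from the demo; they only illustrate the idea.
class SpeechContext {
  constructor() {
    this.grammarsByFieldId = new Map(); // field id -> grammar URI
    this.activeGrammar = null;          // grammar for the focused field
  }

  // Record the grammar declared for a field (what the markup expresses).
  associate(fieldId, grammarSrc) {
    this.grammarsByFieldId.set(fieldId, grammarSrc);
  }

  // A field gained focus: activate its grammar, or none if it has no grammar.
  focus(fieldId) {
    this.activeGrammar = this.grammarsByFieldId.has(fieldId)
      ? this.grammarsByFieldId.get(fieldId)
      : null;
  }

  // Focus left the form: deactivate whatever was active.
  blur() {
    this.activeGrammar = null;
  }
}

const context = new SpeechContext();
context.associate(
  "KIDNEY_FAT_id",
  "evsp:GrmInteger?min=0;max=0;minDec=2;maxDec=2"
);

context.focus("KIDNEY_FAT_id");
console.log(context.activeGrammar); // the Kidney Fat grammar is active

context.blur();
console.log(context.activeGrammar); // nothing is active
```

The page author writes none of this; the sketch only shows why a
declaration per field is enough for the User Agent to do the rest.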

 

The "Day 2 Procedure" page has a good sampling of HTML elements.  The markup
for "Kidney Fat" follows:

 

            <label for="KIDNEY_FAT_id">Kidney Fat</label>
            <evsp:grammar src="evsp:GrmInteger?min=0;max=0;minDec=2;maxDec=2"
                          type="application/srgs+xml">
              <input type="text" size="6" maxlength="4"
                     name="KIDNEY_FAT" id="KIDNEY_FAT_id" value=""
                     onchange="validateField(this, true);" />
            </evsp:grammar>

 

This is isomorphic to the approaches that we've discussed.  The grammar
here is built-in/generated, but it could be any SRGS grammar.  The
JavaScript here is only for text-input validation and could be removed
given HTML5 validation mechanisms.

 

We also support the "for" attribute to relate a grammar to an <input> tag
given an ID.  The wrapping approach gives a direct association instead.
When markup is copied and pasted, wrapping avoids the bug that occurs if a
developer forgets to update the value of the "for" attribute to match the
new ID.  This also keeps symmetry with the existing <label> tag, so
developers know that the "for" attribute is always optional with wrapping.
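
To make the cut-and-paste point concrete, here is a small JavaScript sketch
of how an association might be resolved.  It is an illustration only, not
our implementation: resolveGrammar, the { src, forId, wrappedInputIds }
shape, and the LIVER_FAT_id field are all hypothetical.

```javascript
// Hypothetical resolver: which grammar applies to a given input id?
// Each grammar element is modeled as { src, forId, wrappedInputIds }.
function resolveGrammar(inputId, grammars) {
  for (const grammar of grammars) {
    // Wrapping: the input is nested inside the grammar element.
    if (grammar.wrappedInputIds.includes(inputId)) return grammar.src;
    // "for": the grammar element names the input by ID.
    if (grammar.forId === inputId) return grammar.src;
  }
  return null; // no grammar is associated with this input
}

const GRAMMAR_SRC = "evsp:GrmInteger?min=0;max=0;minDec=2;maxDec=2";

// "Kidney Fat" markup pasted for a new field (hypothetical LIVER_FAT_id),
// where the developer renamed the input but left the old "for" value.
const pastedWithFor = [
  { src: GRAMMAR_SRC, forId: "KIDNEY_FAT_id", wrappedInputIds: [] },
];

// The same paste using wrapping: there is no second ID to keep in step.
const pastedWithWrapping = [
  { src: GRAMMAR_SRC, forId: null, wrappedInputIds: ["LIVER_FAT_id"] },
];

// The stale "for" leaves the new field without a grammar (null).
console.log(resolveGrammar("LIVER_FAT_id", pastedWithFor));
// Wrapping still finds the grammar for the new field.
console.log(resolveGrammar("LIVER_FAT_id", pastedWithWrapping));
```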

 

This is a relatively simple, yet fully useful example.  The point is that
many cases can be covered with a declarative tag.  The most recent proposals
support this approach, but there was a desire to see a use case.

 

We have more complex Web applications that are also fully declarative from a
markup perspective.  The JavaScript used to change visibility and focus for
the GUI naturally affects the current speech context.  No extra code is
needed to track a state model or keep speech in sync with the GUI.

 

Of course more can be done with a JavaScript speech API, but it gets
relatively complex quickly and is needed much less than one might think.

 

Best regards,

Charles

 

Received on Thursday, 27 October 2011 06:24:46 UTC