builtin grammars from Michael Bodell on 2011-10-19 (public-xg-htmlspeech@w3.org from October 2011)

From: Michael Bodell <mbodell@microsoft.com>
Date: Wed, 19 Oct 2011 01:17:13 +0000
To: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <22CD592CCD76414085591204EB19F4E82394B062@TK5EX14MBXC263.redmond.corp.microsoft.>

We've talked in a few different calls about how to allow/extend builtin grammars.  Here's my proposal for an extensible mechanism that should support all the cases we care about.  Note that while this works very well when a <reco> is linked to a corresponding recoable element, that linkage isn't a requirement; the scheme works in the general case as well.

The basic pattern is:

builtin:<tag name|command>[?<attributes>[&<attributes>]*]?

Here the first ? is a literal character and the final ? is the regexp quantifier for zero or one occurrence.  The attributes are extensible, but include the standard HTML5 attributes as well as the filter=noOffensiveWords flag that Glen suggested a few weeks back.
For the tag name this would look like:

builtin:input?type=text

builtin:input?type=search&filter=noOffensiveWords

// note the pattern is [0-9][A-Z]{3}, with each of [, ], {, } URL-escaped
builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D

builtin:input?type=date&max=1979-12-31

builtin:input?type=number&min=1&value=1

builtin:input?type=range&min=0&max=1&step=0.00392156863

builtin:textarea

builtin:textarea?placeholder=text+message&filter=noOffensiveWords

builtin:button?type=submit&value=press+me
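
To make the pattern concrete, here is a minimal Python sketch of how a UA or service might split one of these URIs into its name and attributes. The function name parse_builtin is hypothetical and not part of the proposal, and I'm assuming the attribute part follows ordinary HTML form encoding (%XX escapes, + for a space), as the examples above suggest.

```python
from urllib.parse import parse_qsl


def parse_builtin(uri):
    """Split a builtin: grammar URI into (name, attributes).

    Hypothetical helper: assumes the part after '?' is form-encoded.
    """
    scheme, sep, rest = uri.partition(":")
    if scheme != "builtin" or not sep:
        raise ValueError(f"not a builtin grammar URI: {uri!r}")
    # The first '?' is the literal separator between name and attributes.
    name, _, query = rest.partition("?")
    if not name:
        raise ValueError(f"missing tag name or command: {uri!r}")
    # parse_qsl undoes %XX escapes and turns '+' back into a space.
    attributes = dict(parse_qsl(query, keep_blank_values=True))
    return name, attributes


name, attrs = parse_builtin(
    "builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D"
)
print(name, attrs)
```

A bare name such as builtin:textarea simply comes back with an empty attribute dictionary.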

We had also discussed earlier two recommended default builtins that aren't tied to a tag (the "command" alternative in my pattern at the top):

builtin:dictation
builtin:websearch

both of which could take additional parameter values, like

builtin:websearch?filter=noOffensiveWords
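
Going the other way, a page author or UA could assemble these URIs from a name plus attributes. This is only a sketch under the same form-encoding assumption; build_builtin is a hypothetical name, not something the proposal defines.

```python
from urllib.parse import urlencode


def build_builtin(name, **attributes):
    """Assemble a builtin: grammar URI from a tag name or command.

    Hypothetical helper: attribute values are form-encoded, so '[' becomes
    %5B and a space becomes '+'.
    """
    if not name:
        raise ValueError("a tag name or command is required")
    query = urlencode(attributes)
    # Omit the '?' entirely when there are no attributes.
    return f"builtin:{name}?{query}" if query else f"builtin:{name}"


print(build_builtin("websearch", filter="noOffensiveWords"))
print(build_builtin("input", type="text", pattern="[0-9][A-Z]{3}"))
```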

With respect to who implements the builtin grammars (i.e., is this part of the WebAPI, with the user agent responsible for mapping from the builtin URL to some SRGS or proprietary format to submit to the speech service, or should these named builtin types be available at the service?), I think there are a lot of advantages to saying the service is responsible.  Obviously, for default cases where the UA and service might be one and the same, this doesn't matter, but for cases where they aren't, there are these advantages:


1. These URIs can be used from the middle of other SRGS grammars and be expected to work.  This means you can use number, email, dictation and whatnot as part of a custom grammar that has your top few commands on top of, or overlaid on, the builtin types.

2. For the proprietary non-rule-based grammars (necessary for things like dictation and websearch), the UA doesn't need to own them; they can be part of the evolving recognition service.  Also, the UA doesn't need to transfer a giant statistical language model grammar representing the whole internet whenever you want to do a Bing or Google voice web search.

3. The speech service likely has the right expertise to train and tune the grammars in all the different languages.  Requiring the UA to know how to build an accurate builtin grammar in all languages, even for something relatively simple like number or date, is unrealistic.
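
As an illustration of the first advantage, the following Python sketch generates a small SRGS XML grammar that lists a few custom commands and falls through to a builtin type via a ruleref. The function name grammar_with_builtin and the rule id "top" are made up for the example, and it assumes the speech service resolves builtin: URIs in a ruleref, which is what is being proposed here rather than existing behavior.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def grammar_with_builtin(commands, builtin_uri, lang="en-US"):
    """Return SRGS XML: custom commands overlaid on a builtin grammar.

    Hypothetical helper; assumes the service can resolve builtin_uri.
    """
    ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "root": "top", f"{{{XML_NS}}}lang": lang},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": "top"})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    # The page's own top commands come first.
    for phrase in commands:
        ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = phrase
    # The last alternative hands off to the builtin grammar at the service.
    fallback = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
    ET.SubElement(fallback, f"{{{SRGS_NS}}}ruleref", {"uri": builtin_uri})
    return ET.tostring(grammar, encoding="unicode")


print(grammar_with_builtin(["cancel", "help"], "builtin:input?type=number&min=1"))
```

Note that the & separating attributes has to be written as &amp; once the URI sits inside an XML attribute; the serializer above takes care of that.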

Received on Wednesday, 19 October 2011 01:17:45 UTC