- From: Olli Pettay <Olli.Pettay@helsinki.fi>
- Date: Wed, 19 Oct 2011 12:18:36 +0300
- To: Michael Bodell <mbodell@microsoft.com>
- CC: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
On 10/19/2011 04:17 AM, Michael Bodell wrote:
> We’ve talked in a few different calls about how to allow/extend builtin
> grammars. Here’s my proposal for an extensible mechanism that should
> support all the cases we care about. Note while this works very well for
> the <reco> being linked to corresponding recoable elements, it isn’t a
> requirement and works as a scheme in the general case as well.
>
> The basic pattern is:
>
> builtin:<tag name|command>[?<attributes>[&<attributes>]*]?
>
> Where the first ? is a character and the last is a regexp 0 or 1. The
> attributes are extensible but include standard HTML5 attributes as well
> as the flag we discussed a few weeks back that Glen suggested about
> filter=noOffensiveWords.
>
> For the tag name this would look like:
>
> builtin:input?type=text
>
> builtin:input?type=search&filter=noOffensiveWords
>
> // note the pattern is [0-9][A-Z]{3} but with the various [, ], {, } all
> url escaped
>
> builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D

So, is this really something speech engines can support?
And how do they handle patterns? If the pattern is "Hello", is that interpreted
as the word 'Hello', or as the separate characters 'H', 'e', 'l', 'l', 'o'?

> builtin:input?type=date&max=1979-12-31
>
> builtin:input?type=number&min=1&value=1
>
> builtin:input?type=range&min=0&max=1&step=0.00392156863

Again, how would speech engines support this? What kind of input is expected?
User saying 0.2 would be no-match but 0.00784313726 would be ok?

> builtin:textarea
>
> builtin:textarea?placeholder=text+message&filter=noOffensiveWords

What does placeholder=text+message mean?

> builtin:button?type=submit&value=press+me

What does this mean?
Why do we need input/textarea/button? Why not just builtin:type=number etc.?
>
> We also had earlier discussed two different types of recommended default
> builtins that wouldn’t be tied to a tag (the command in my regexp at the
> top):
>
> builtin:dictation
>
> builtin:websearch
>
> both of which could have additional parameter values like
>
> builtin:websearch?filter=noOffensiveWords
>
> With respect to who implements the builtin grammars (I.e., is this part
> of the WebAPI and the user agent is responsible from mapping from
> builtin URL to some SRGS or proprietary format to submit to the speech
> service, or should these named builtin types be available at the
> service) I think there are a lot of advantages for saying the service is
> responsible.

The Web facing part should be standard. If we say that
builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D
should be supported, then it should be supported the same way in all the
speech services.
I mean,

> Obviously for default cases where the UA and service might
> be one and the same this doesn’t matter, but for cases where it isn’t
> there are these advantages:
>
> 1.These URI can be used from the middle of other SRGS grammars and
> expected to work. This means you can use number, email, dictation and
> what not as part of a custom grammar that may have your top few commands
> on top of or overlaid on the builtin types.
>
> 2.For the proprietary non-rule based grammars (necessary for things like
> dictation and websearch) you don’t need to have the UA own that, they
> can be part of the evolving recognition service. Also, you don’t need to
> worry about the UA needing to transfer a giant statistical language
> model grammar representing the whole internet web search whenever you
> want to do a Bing or Google voice web search.
>
> 3.The speech service likely has the right expertise to train and tune
> the grammars in all the different languages. As requiring the UA to know
> how to build an accurate builtin grammar even for something relatively
> simple like number or date in all languages is unrealistic.
>
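
For reference, the builtin: pattern quoted above (a tag name or command, optionally followed by ? and &-separated attributes) could be split apart roughly as follows. This is only a sketch under assumptions: the function name parse_builtin is hypothetical, the attribute part is assumed to follow ordinary URL query-string escaping (%XX and + for space), and nothing here says how a speech service would turn the attributes into a grammar.

```python
from urllib.parse import parse_qsl


def parse_builtin(uri):
    """Split a builtin: URI into (name, attributes).

    Hypothetical helper: assumes the attributes are encoded like an
    ordinary URL query string, as the examples in the proposal suggest.
    """
    scheme, sep, rest = uri.partition(":")
    if sep != ":" or scheme != "builtin":
        raise ValueError("not a builtin: URI: %r" % uri)

    # Everything before the first '?' is the tag name or command.
    name, _, query = rest.partition("?")
    if not name:
        raise ValueError("missing tag name or command: %r" % uri)

    # parse_qsl undoes percent-escapes and turns '+' into a space.
    attributes = dict(parse_qsl(query, keep_blank_values=True))
    return name, attributes


if __name__ == "__main__":
    for example in (
        "builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D",
        "builtin:textarea?placeholder=text+message&filter=noOffensiveWords",
        "builtin:dictation",
    ):
        print(parse_builtin(example))
```

Note that this only covers the syntax. Whether pattern, step or placeholder can be given a consistent meaning across speech services is exactly the open question raised above.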
Received on Wednesday, 19 October 2011 09:19:19 UTC