- From: Bjorn Bringert <bringert@google.com>
- Date: Fri, 14 Jan 2011 11:19:40 +0000
- To: Olli@pettay.fi
- Cc: "Young, Milan" <Milan.Young@nuance.com>, public-xg-htmlspeech@w3.org
On Fri, Jan 14, 2011 at 11:05 AM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:
> On 01/13/2011 07:59 PM, Young, Milan wrote:
>>
>> Hello Olli,
>>
>> I'd be interested to know what sort of use case you have in mind that
>> uses default speech services.
>
> I was mainly thinking about rather simple speech interfaces created by
> individual web developers who want to try out new features.
> I would assume grammars created by them aren't that big.
> We really want comments about the API from the web developers.
>
> Also things like speech-enabled web search should be doable,
> and message (email/twitter/sms) dictation.
>
> And speech enabled controls in a web page.
> "Go to the next article", "Read this article", etc.
>
>
>> I have been under the impression that
>> most real world apps have grammars that are either too large or too
>> sensitive to be transported over the network.
>
> Well, there are no real world speech enabled web apps commonly available, so
> we don't necessarily know what they will look like ;)
> But sure, for many cases (complex dialogs, specialized search, etc.)
> network engines would be need.

I agree that default speech services are very important. Use cases
that could be handled pretty well include search, messaging,
dictation, translation, simple control.

I think that the majority of speech web apps (as measured in number of
apps) will use default speech services, because the developers do not
have the resources to run their own speech services. I also think that
this will account for the majority of the speech app usage, at least
in the beginning.

If simple speech web apps take off, I think that we will see a gradual
increase in interest in more complex applications and in interest from
organizations large enough to devote resources to running their own
high-quality speech services.

/Bjorn

>> -----Original Message-----
>> From: public-xg-htmlspeech-request@w3.org
>> [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Olli Pettay
>> Sent: Thursday, January 13, 2011 7:14 AM
>> To: public-xg-htmlspeech@w3.org
>> Subject: Some prioritization
>>
>> Hi all,
>>
>> I may not be able to attend conference call today (if we have such).
>> But anyway, I started to prioritize requirements the way I think about
>> them. Or more so, I picked up lower priority requirements and
>> categorized them to 3 groups.
>> I don't know how we're going to prioritize requirements, but I guess it
>> doesn't harm to send this kind of email so that you know what kind of
>> specification proposal I'm expected to see later this year.
>>
>>
>> -------------
>> A bit lower priority:
>> FPR46. Web apps should be able to specify which voice is used for TTS.
>> FPR57. Web applications must be able to request recognition based on
>> previously sent audio.
>>
>>
>> -------------
>> Low priority:
>> FPR28. Speech recognition implementations should be allowed to fire
>> implementation specific events.
>> FPR31. User agents and speech services may agree to use alternate
>> protocols for communication.
>> FPR48. Web application author must be able to specify a domain specific
>> statistical language model.
>> FPR56. Web applications must be able to request NL interpretation based
>> only on text input (no audio sent).
>>
>>
>> -------------
>> Something perhaps for V2 specification
>> These requirements can be important, but to get at least something done
>> soon we could perhaps leave these out from v1 specification.
>> Note, v2 specification could be developed simultaneously with v1.
>>
>> FPR7. Web apps should be able to request speech service different from
>> default.
>> ...and because of that also the following requirements
>> FPR11. If the web apps specify speech services, it should be possible to
>> specify parameters.
>> FPR12. Speech services that can be specified by web apps must include
>> network speech services.
>> FPR27. Speech recognition implementations should be allowed to add
>> implementation specific information to speech recognition results.
>> FPR30. Web applications must be allowed at least one form of
>> communication with a particular speech service that is supported in all
>> UAs
>> FPR33. There should be at least one mandatory-to-support codec that
>> isn't encumbered with IP issues and has sufficient fidelity& low
>> bandwidth requirements.
>> FPR55. Web application must be able to encrypt communications to remote
>> speech service.
>> FPR58. Web application and speech services must have a means of binding
>> session information to communications.
>>
>>
>>
>>
>> -Olli
>>
>>
>
>
>

--
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
Palace Road, London, SW1W 9TQ
Registered in England Number: 3977902
Received on Friday, 14 January 2011 11:21:30 UTC