RE: Proposal categories

I'm not sure about this division. I think we'd all agree that working with other groups, with existing HTML mechanisms, and with proposed future HTML work is a good idea. I also think we'd all agree that we want a simple API that meets most or all of our individual important use cases and requirements. But, as I think Olli and Milan were saying, I'm not sure that automatically means only a "default speech service", nor that it should preclude enabling all of the group's "strong interest" requirements. The proposal that we're working on at Microsoft right now spans all three of your divisions, and possibly some others. I'd categorize it as trying to address as many of the existing requirements as possible, and certainly all of the strong interest ones.

That means that we want a sensible API that fits the HTML programming model and makes the easy things easy, but still meets the group's varied strong interest requirements, such as:

* "Web application must be able to specify domain specific custom grammars" 
* "Web apps should be able to request speech service different from default"
* "It should be possible for user agents to allow hands-free speech input"
* "It should not be required to fill an input field every time there is a recognition result"
* "The API to do recognition should not introduce unneeded latency"
* "Speech recognition implementations should be allowed to add implementation specific information to speech recognition results"
* "Web applications must not be prevented from integrating input from multiple modalities"
* "The web app should be notified when the audio corresponding to a TTS <mark> element is played back"

This is possibly a good discussion topic for a call; however, I suspect that trying to categorize proposals before seeing actual proposals may be putting the cart before the horse. So maybe it makes sense to hold off on a call this week (responding to Dan's earlier email) until we have some more proposals submitted?

In our group timeline we have until the end of February to collect proposals. From this email thread it sounds like Olli has a proposal of his own in mind that he's working on, and Microsoft has one that we are working on. Bjorn, are you planning on iterating on either of your proposals this month in light of the discussion, other groups' work, or the requirements prioritization? Is anyone else planning on submitting a proposal? (From the thread I couldn't tell whether Milan was suggesting he was also working on one.)

-----Original Message-----
From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Bjorn Bringert
Sent: Monday, January 31, 2011 1:14 PM
To: public-xg-htmlspeech@w3.org
Subject: Proposal categories

Here are the things that I would personally like to see proposals for, in my priority order (high to low):

1. Specify simple APIs for speech recognition and speech synthesis using speech service implementations provided by the browser or platform ("default speech services" in our requirements terminology).

2. Work with other groups (e.g. RTC-Web) to add a general mechanism for audio streaming with the features needed for speech recognition.

3. Enhance existing and proposed audio playback APIs (such as HTML <audio> and the proposed JS audio APIs) to work for TTS from web-app specified network speech synthesizers.

What do you think of this division? Who is planning to submit proposals in what categories? Is anyone working a proposal that doesn't fit neatly into exactly one of these categories?

--
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham Palace Road, London, SW1W 9TQ Registered in England Number: 3977902

Received on Wednesday, 2 February 2011 03:48:15 UTC