- From: Robert Brown <Robert.Brown@microsoft.com>
- Date: Thu, 18 Aug 2011 18:55:28 +0000
- To: HTML Speech XG <public-xg-htmlspeech@w3.org>, "Dan Burnett (Voxeo)" <dburnett@voxeo.com>, "Milan Young (Nuance)" <Milan.Young@nuance.com>, Patrick Ehlen <pehlen@attinteractive.com>, "Michael Johnston (AT&T)" <johnston@research.att.com>
- Message-ID: <113BCF28740AF44989BE7D3F84AE18DD1B1EF206@TK5EX14MBXC114.redmond.corp.microsoft.>
We worked through the remaining TODO items, completing our outstanding list of TODOs. There are some follow-ups to do, and once done, I'll compile the 5th draft, which (hopefully) will be the one we then incorporate into the final report.

~~~

12. TODO: What notation should be used? The Media Fragments Draft, "Temporal Dimensions" section has some potentially viable formats, such as the "wall clock" Zulu-time format.

RB> Agreed to use wall-clock zulu time format. RB will add to the document

~~~

13. TODO: does the Waveform-URI return a URI for each input stream, or are all input streams magically encoded into a single stream?

RB> Agreed to use the URI list format we use elsewhere in the document. RB will add to the document

~~~

14. TODO: does the Input-Waveform-URI cause any existing input streams to be ignored?

RB> Agreed we ignore the streams in this case. RB will add to document.

~~~

15. TODO: Write some examples of one-shot and continuous recognition, EMMA documents, partial results, vendor-extensions, grammar/rule activation/deactivation, etc.

RB> We already have the first three. Patrick will write up a partial results example. Milan will look at vendor extensions and write up whatever seems appropriate there. Rob (I) will review the existing grammar examples and add to them if necessary.

~~~

16. TODO: insert more synthesis examples

RB> Rob (I) will email Marc to see if he's interested and available to do this. If not, I'll create some.

~~~

17. DD64. API must have ability to set service-specific parameters using names that clearly identify that they are service-specific, e.g., using an "x-" prefix. Parameter values can be arbitrary Javascript objects.

PE: We have custom vendor resource under 3.2.1 and vendor-listen-mode under 5.3. Presumably other custom params can be set by SET_PARAMS?

MJ: Any issues pushing 'arbitrary javascript objects' over the protocol?

RB: I'm uneasy declaring victory on this one. What exactly is an 'arbitrary javascript object'?
If it can be serialized to something that can be conveyed with a vendor-specific header <http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-24#section-6.2.16> then we're okay. But I'd like to be sure.

RB> This depends on how the API wants to tackle it. Milan's reactivated this thread to try to get clarity: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0022.html

~~~

18. FPR58. Web application and speech services must have a means of binding session information to communications. <http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr58>

MJ: Need to clarify.

RB: This essentially means "supports cookies". The exact requirements for this are IMHO unclear and unconvincing. With headers like user-ID, vendor-specific headers, reco-context-block, etc, and the fact that there's a websockets session that wraps all the requests, it's unclear what a session cookie is needed for. But it could be added if necessary.

RB> Agreed that this is already satisfied.

~~~

Discussed the following questions raised by Milan in this thread: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0020.html

* I don't understand why we would prevent "active/inactive-grammars" from being included in SET-PARAMS. Not so much that there is a strong use case for doing this, but singling this out breaks consistency.

RB> Agreed that the activate/de-activate methods are redundant, and that the function can be accomplished in the LISTEN method. It's an artifact of a previous draft where grammar state could change at any time.

* Should note that the "Stream-ID" synthesis parameter is read-only.

RB> Agreed

* MRCP v2 uses case-insensitive parameter names. Suggest we follow suit with HTML Speech parameters.

RB> Agreed

RB> Also agreed that audio-codec header only applies to synthesize, and should be removed from recognize.

~~~

We also asked for feedback from browser vendors.
This is captured in the thread started by Dan Druta: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0023.html
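
To make two of the parameter decisions above concrete (case-insensitive parameter names, and "x-" prefixed vendor parameters whose values have to fit in a header), here is a rough Python sketch. It is only an illustration under assumptions, not text from the draft: the `ParamSet` class, the `x-acme-tuning` name, and the choice of JSON as the serialization are all hypothetical, since how the API conveys "arbitrary" objects is still an open question in the thread linked under item 17.

```python
import json


class ParamSet:
    """Hypothetical parameter container; names compare case-insensitively."""

    def __init__(self):
        # lower-cased name -> (name as first written, string value)
        self._values = {}

    def set(self, name, value):
        # "Audio-Codec" and "audio-codec" address the same parameter.
        self._values[name.lower()] = (name, str(value))

    def get(self, name, default=None):
        entry = self._values.get(name.lower())
        return default if entry is None else entry[1]

    def set_vendor(self, name, obj):
        # Assumption: vendor parameters are recognised by an "x-" prefix
        # and their values are serialized as compact, single-line JSON.
        if not name.lower().startswith("x-"):
            raise ValueError("vendor parameter names must start with 'x-'")
        self.set(name, json.dumps(obj, separators=(",", ":")))

    def header_lines(self):
        # One "Name: value" line per parameter, in insertion order.
        return [f"{name}: {value}" for name, value in self._values.values()]


params = ParamSet()
params.set("Audio-Codec", "audio/basic")
params.set_vendor("x-acme-tuning", {"beam": 3, "nbest": True})

print(params.get("audio-codec"))
print(params.header_lines()[1])
```

Anything `json.dumps` cannot encode (functions, cyclic structures) raises an error in this sketch, which is one way of reading the "what exactly is an arbitrary javascript object" concern: only values with a well-defined text form can travel in a header.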
Received on Thursday, 18 August 2011 18:56:09 UTC