protocol notes 8/18/2011 from Robert Brown on 2011-08-18 (public-xg-htmlspeech@w3.org from August 2011)

From: Robert Brown <Robert.Brown@microsoft.com>
Date: Thu, 18 Aug 2011 18:55:28 +0000
To: HTML Speech XG <public-xg-htmlspeech@w3.org>, "Dan Burnett (Voxeo)" <dburnett@voxeo.com>, "Milan Young (Nuance)" <Milan.Young@nuance.com>, Patrick Ehlen <pehlen@attinteractive.com>, "Michael Johnston (AT&T)" <johnston@research.att.com>
Message-ID: <113BCF28740AF44989BE7D3F84AE18DD1B1EF206@TK5EX14MBXC114.redmond.corp.microsoft.>

We worked through the remaining TODO items, which completes our outstanding list. There are some follow-ups to do, and once they are done I'll compile the 5th draft, which (hopefully) will be the one we incorporate into the final report.

~~~
12. TODO: What notation should be used? The Media Fragments Draft, "Temporal Dimensions" section has some potentially viable formats, such as the "wall clock" Zulu-time format.

RB> Agreed to use the wall-clock Zulu-time format. RB will add to the document.
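
For illustration only: a minimal Python sketch of producing and parsing a wall-clock Zulu-time value, assuming the ISO 8601 UTC form with a trailing "Z" and whole-second precision. The helper names to_zulu and from_zulu are hypothetical, and these notes do not fix the exact syntax the protocol document will use.

```python
from datetime import datetime, timedelta, timezone

ZULU_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed layout: ISO 8601, UTC, whole seconds


def to_zulu(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp ending in 'Z'."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone before formatting")
    return dt.astimezone(timezone.utc).strftime(ZULU_FORMAT)


def from_zulu(text: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp back into an aware UTC datetime."""
    return datetime.strptime(text, ZULU_FORMAT).replace(tzinfo=timezone.utc)


# A local time seven hours behind UTC is normalized to Zulu time.
local = datetime(2011, 8, 18, 11, 55, 28, tzinfo=timezone(timedelta(hours=-7)))
stamp = to_zulu(local)
```

Normalizing to UTC before formatting means client and service never have to exchange time-zone offsets.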

~~~
13. TODO: does the Waveform-URI return a URI for each input stream, or are all input streams magically encoded into a single stream?

RB> Agreed to use the URI list format we use elsewhere in the document. RB will add to the document.

~~~
14. TODO: does the Input-Waveform-URI cause any existing input streams to be ignored?

RB> Agreed we ignore the streams in this case. RB will add to the document.

~~~
15. TODO: Write some examples of one-shot and continuous recognition, EMMA documents, partial results, vendor-extensions, grammar/rule activation/deactivation, etc.

RB> We already have the first three. Patrick will write up a partial results example. Milan will look at vendor extensions and write up whatever seems appropriate there. Rob (I) will review the existing grammar examples and add to them if necessary.

~~~
16. TODO: insert more synthesis examples

RB> Rob (I) will email Marc to see if he's interested and available to do this. If not, I'll create some.

~~~
17. DD64. API must have ability to set service-specific parameters using names that clearly identify that they are service-specific, e.g., using an "x-" prefix. Parameter values can be arbitrary JavaScript objects. PE: We have custom vendor resource under 3.2.1 and vendor-listen-mode under 5.3. Presumably other custom params can be set by SET_PARAMS? MJ: Any issues pushing 'arbitrary JavaScript objects' over the protocol? RB: I'm uneasy declaring victory on this one. What exactly is an 'arbitrary JavaScript object'? If it can be serialized to something that can be conveyed with a vendor-specific header<http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-24#section-6.2.16> then we're okay. But I'd like to be sure.

RB> This depends on how the API wants to tackle it. Milan's reactivated this thread to try to get clarity: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0022.html
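
For illustration only, and not something the group has agreed: one way an object could be made safe to carry as a name=value pair is to serialize it as JSON and then percent-encode the result, so the value contains no ";", "=", or line breaks. The parameter name, the JSON choice, and the helper names below are all assumptions.

```python
import json
from urllib.parse import quote, unquote


def encode_vendor_param(name: str, value: object) -> str:
    """Serialize a JSON-compatible value into a single name=value pair."""
    # Compact, key-sorted JSON keeps the encoding deterministic.
    payload = json.dumps(value, separators=(",", ":"), sort_keys=True)
    # safe="" percent-encodes every reserved character, including ';' and '='.
    return f"{name}={quote(payload, safe='')}"


def decode_vendor_param(pair: str) -> tuple:
    """Split a name=value pair and restore the original value."""
    name, _, encoded = pair.partition("=")
    return name, json.loads(unquote(encoded))


# "x-example-tuning" is a hypothetical service-specific parameter name.
pair = encode_vendor_param("x-example-tuning", {"mode": "fast", "weights": [1, 2.5]})
name, value = decode_vendor_param(pair)
```

This only covers objects that survive JSON serialization. Functions, cyclic structures, and similar values would still need a rule of their own, which is the open question in the thread above.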

~~~
18. FPR58. Web application and speech services must have a means of binding session information to communications.<http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr58> MJ: Need to clarify. RB: This essentially means "supports cookies". The exact requirements for this are IMHO unclear and unconvincing. With headers like user-ID, vendor-specific headers, reco-context-block, etc, and the fact that there's a websockets session that wraps all the requests, it's unclear what a session cookie is needed for. But it could be added if necessary.

RB> Agreed that this is already satisfied.

~~~

Discussed the following questions raised by Milan in this thread http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0020.html

* I don't understand why we would prevent
"active/inactive-grammars" from being included in SET-PARAMS. Not so
much that there is a strong use case for doing this, but singling this
out breaks consistency.

RB> Agreed that the activate/de-activate methods are redundant, and that the function can be accomplished in the LISTEN method. It's an artifact of a previous draft where grammar state could change at any time.

* Should note that the "Stream-ID" synthesis parameter is read-only.

RB> Agreed

* MRCP v2 uses case-insensitive parameter names. Suggest we follow suit with HTML Speech parameters.

RB> Agreed
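
For illustration only: a minimal sketch of how an implementation might store parameters so that lookups ignore case while the sender's original spelling is preserved. The class name and the sample values are hypothetical.

```python
class CaseInsensitiveParams:
    """Parameter store keyed by lowercased name; keeps the original spelling."""

    def __init__(self):
        self._items = {}  # lowercased name -> (name as last set, value)

    def set(self, name, value):
        # A later set with different casing replaces the earlier entry.
        self._items[name.lower()] = (name, value)

    def get(self, name, default=None):
        entry = self._items.get(name.lower())
        return default if entry is None else entry[1]

    def names(self):
        """Return parameter names as they were last written."""
        return [original for original, _ in self._items.values()]


params = CaseInsensitiveParams()
params.set("Stream-ID", "112233")     # sample value, not from the draft
params.set("audio-codec", "audio/basic")
params.set("AUDIO-CODEC", "audio/x-wav")  # same parameter, different casing
```

Here params.get("stream-id") and params.get("STREAM-ID") both return the value stored under "Stream-ID".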

RB> Also agreed that audio-codec header only applies to synthesize, and should be removed from recognize.

~~~
We also asked for feedback from browser vendors. This is captured in the thread started by Dan Druta: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0023.html

Received on Thursday, 18 August 2011 18:56:09 UTC