- From: Dan Burnett <dburnett@voxeo.com>
- Date: Fri, 21 Oct 2011 15:38:53 -0400
- To: public-xg-htmlspeech@w3.org
Group,

The minutes from yesterday's call are available at
http://www.w3.org/2011/10/20-htmlspeech-minutes.html

For convenience, a text version is embedded below.

Thanks to Patrick Ehlen for taking the minutes.

-- dan

**********************************************************************************

HTML Speech Incubator Group Teleconference

20 Oct 2011

[2]Agenda

[2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0030.html

See also: [3]IRC log

[3] http://www.w3.org/2011/10/20-htmlspeech-irc

Attendees

Present
    Dan_Druta, Dan_Burnett, Michael_Bodell, Debbie_Dahl, Robert_Brown, Glen_Shires, Charles_Hemphill, Patrick_Ehlen, Milan_Young, Olli_Pettay, Satish_Sampath, Michael_Johnston

Regrets

Chair
    Dan_Burnett

Scribe
    Patrick_Ehlen

Contents

* [4]Topics
    1. [5]reco element
    2. [6]Can extract grammar information from input fields; have a method that allows you to extract grammar from an input field?
    3. [7]Method to attribute conversion
    4. [8]casing
    5. [9]grammar URIs with filters on them
* [10]Summary of Action Items
_________________________________________________________

reco element

<burn> Glen's proposal that we're discussing:
[11]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

[11] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

glen: reco element always visible; opacity not possible, to avoid clickjacking
... should we allow dynamically hiding/showing reco element
michael: user agents can decide what permissions models they use, and grant permissions according to UA policy
charles: also important to consider handsfree cases; can't rely on touch for permissions
satish: reco should automatically activate for ppl who can't touch element ??
... there are other ways to "click" reco
michael: UA could use some of these techniques to enable permissions
satish: how exactly would this be implemented?
michael: implement a UI idiom from the browser the user can't control that would notify the user

binding tag for input field

scribe: "speech IME": User agent that can speech-enable any input field
charles: field-specific reco is better for accuracy
michael: allowing developer to bind grammar to a specific field; increases complexity
... if developer is sophisticated enough to do this from an API, making a declarative element makes it more complex
glen: disagree; gives a lot more flexibility and control to both developer and user
charles: a lot of web developers only work w/ HTML
... not everyone can do things in javascript, so a declarative ability is advantageous
glen: keep simple things simple. if we can do something simple w/ reco tag but not UA, then there's a good reason for a reco tag
<smaug> if someone says he "knows HTML but not JS", he probably doesn't know HTML either
satish: how to assoc. an element w/ an input type
glen: isn't it easier to have an automatic binding people can use?
satish: not clear how it would work
michael: need to work through list of things that are reco-able elements
charles: example on website of multiple input fields each bound to a separate grammar
michael will create specific examples of how binding works for different elements

Can extract grammar information from input fields; have a method that allows you to extract grammar from an input field?

<glen> SpeechInputRequest.addGrammarFrom(DomInputElement)
<glen> Retrieves grammar from <input> tag and adds to request.
michael: would UA be responsible for communicating constraints or would it be responsible for generating and sending the grammar itself?
glen: should be reco service that converts into grammar
... this would be a way to extract input field specification and send it to speech engine in scriptable manner
burn: Would it be possible then to change these constraints dynamically?
... how would it work?
... what happens if you do it 2x in a row?
... would the grammar sent before get replaced by the newer one?
michael: should have a way to control the grammar; but how to dynamically remove and change them?
burn: rename method above to "includeGrammarFrom()" ?
... would allow you not to "add" but rather to take a snapshot
glen: there are other methods that cover these kinds of actions
<glen> SpeechInputRequest.addGrammarFrom(DomInputElement, weight, modal)
glen: makes sense to add weight and modal flags as well
... would expect api developer to be able to enable & disable grammar
<glen> SpeechInputRequest.outputToElement(DomElement)
<glen> Valid DomElements are <input> and <textarea>
<glen> UA will automatically fill DomElement with results. This allows the UA to display continuous streaming of results, and properly handle text insertion point.
<glen> Only one DomElement may be active at a time.
<smaug> request.onmatch = function(e) { domElement.value = e.result; }
One DOM element active at a time, since you can't stream to 2 different elements
scribe: sort of like binding to an element
Olli: handling of output depends on element type; how would that work?
glen: UA would implement the tricky things, like where to output text, etc.
<mbodell> For request.onmatch you don't want to just do domElement.value = e.result as it overwrites the content in the continuous case
olli: all that needs to be defined in spec
glen: for insertion point, handle in a way similar to typing text
olli: would need to define so many different cases.
charles: another thing: UA ought to be able to use focus to enable and disable grammars assoc. with input
glen: should at least work at trying to specify it, perhaps at f2f
burn: after tech discussions, there will still be a lot of work on doc, so perhaps doing this at f2f is not realistic
... even if we can't fully specify it, that isn't a fail; it shows some thought in that direction
satish: perhaps choose something simpler to start with
glen: perhaps restrict to just text, date, etc., rather than covering other types?
michael: can start there and see where we end up

Method to attribute conversion

<mbodell> see mail
[12]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0037.html

[12] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0037.html

michael: converting input result params into attributes, but grammar and custom params are more complicated
... want some array of simple structures?
robert: arrays tend to be default way of doing things in JS
michael: array of structure of something like the speech grammar?
Charles: JS can also utilize objects for structures
Robert: array isn't strongly typed
satish: don't need helpers for grammars and speech parameters?
... why have all that when you can do it with one attribute
michael: so leaning toward having these structures but not the methods discussed above?

casing

michael: people expect all caps for objects and interfaces
michael will make those changes for next week

grammar URIs with filters on them

Robert: it's cool if it works
... providing parameters to finite grammars is fine
... skeptical of free dictation filter: doubtful it could be implemented efficiently, and therefore it won't be used
<mbodell> builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D
Robert: specifying pattern on input field
... how to merge n-gram w/ pattern?
<mbodell> (which is really builtin:input?type=text&pattern=[0-9][A-Z]{3} )
Robert: easy to specify, but prob. hard to implement
... will foul up probabilities in n-gram model
michael: such a pattern could be translated into CFG on server
... but this isn't necessarily merged w/ free text model
Robert: but pattern doesn't necessarily represent how people will speak it
... "three four a" vs. "thirty-four a", etc.
... or "three boo"
Milan: so it's up to speech service to be good at handling that
... this is a new way of specifying grammars, that many existing speech services don't do now
Robert: regex doesn't include any kind of normalization
Michael: real question: is it legal for speech engine to ignore such hints (or patterns) and return something that has nothing to do w/ it? (would hope so)
Robert: Looks great on paper, but won't be implemented
Milan: Nothing stopping speech providers from offering this, but reluctant to standardize at this point
glen: HTML already has a lot of this stuff
Milan: builtins should not be hints; they should recognize what's specified or not
Robert: cool idea; but there is work missing here that should cause reluctance about including it in spec
<burn> s/glen: this is a new/Milan: this is a new/
Robert: how would you automate building a CFG off this pattern?
Milan: what about adopting two types: hints and grammars
michael: is it legal for speech engine to return something that doesn't fit the parameter
... for a regex, if the engine returns a result that does not fit the pattern, what should happen?
... provide some user-facing interface for correction?
... nothing wrong w/ a hint that is ignored
Milan: having things that need to be followed exactly, and then just hints
michael: was thinking everything is a hint
Milan: What if you just want a date and don't want to specify a grammar for it?
glen: if speech engine isn't up to the task, that's an issue w/ service
... most engines should be smart enough to know what a date is. but do you say don't use speech if your engine can't do that?
Milan: no, give error back
glen: date is a special case
Milan: but there are lots of those (bool, etc)
<mbodell> [13]http://www.w3.org/TR/html5/the-input-element.html

[13] http://www.w3.org/TR/html5/the-input-element.html

glen: should we bind to every type of input element there is?
... automatic binding is questionable
Robert: if you need to click a mic to do it, what's the point of speech?
Charles: or handsfree cases
Milan: Developer has very complex UI. rather than re-write from scratch, it references a library
glen: take checkboxes. grammar would not be a binary, but the term bound to the box (e.g., "non-stop")
<mbodell> For date, look at
[14]http://www.w3.org/TR/html5/states-of-the-type-attribute.html#date-state

[14] http://www.w3.org/TR/html5/states-of-the-type-attribute.html#date-state

<mbodell> it lists that: If the element is mutable, the user agent should allow the user to change the date represented by its value, as obtained by parsing a date from it. User agents must not allow the user to set the value to a non-empty string that is not a valid date string. If the user agent provides a user interface for selecting a date, then the value must be set to a valid date string representing the user's selection. User agents should allow the user to
johnston: need to keep assistive use cases in mind
Charles: can use UI to highlight these things
<mbodell> so for input=date we should have the same ruling where it can't set it to values that are not valid date strings IMO
Charles: should be careful ruling things out in general
... what is purpose of tag name here?
glen: if we decide to allow only a single type of input, then you don't need tag name. but here distinguishing what element you're associating with
Charles: wouldn't it be redundant to put tag name here?
glen: no, complementary to binding. can use builtin grammars w/out any binding at all
... but if the reco tag is bound to an element, then you'd create those default grammars automatically & assoc using the tag
Charles: how to know which builtin goes w/ which element?
glen: with multiple input fields and only one reco element, you need to specify some grammars yourself
Charles: thought builtin would specify language-specific things, and binding would occur separately
Robert: couple questions on protocol draft
Should start w/ those next week
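
To make the builtin grammar URI discussion above concrete, here is a minimal sketch of the percent-encoding mbodell pasted and of the check michael asked about (does a returned result fit the pattern?). Only the builtin:input?type=...&pattern=... form comes from the minutes; the helper names buildBuiltinGrammarUri, parseBuiltinGrammarUri and resultFitsPattern are assumptions made up for illustration, and nothing here is agreed API.

```javascript
// Sketch only: the "builtin:input" URI form is quoted from the minutes;
// all three helper functions below are hypothetical, not part of any spec.

// Build a builtin grammar URI, percent-encoding the parameter values.
function buildBuiltinGrammarUri(type, pattern) {
  let uri = "builtin:input?type=" + encodeURIComponent(type);
  if (pattern !== undefined) {
    uri += "&pattern=" + encodeURIComponent(pattern);
  }
  return uri;
}

// Recover the decoded parameters from such a URI.
function parseBuiltinGrammarUri(uri) {
  const params = {};
  const queryStart = uri.indexOf("?");
  if (queryStart === -1) {
    return params; // no parameters present
  }
  for (const pair of uri.slice(queryStart + 1).split("&")) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    params[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return params;
}

// Check whether a recognition result fits the pattern as a whole string
// (assumption: the pattern is anchored at both ends, as the HTML pattern
// attribute is).
function resultFitsPattern(result, pattern) {
  return new RegExp("^(?:" + pattern + ")$").test(result);
}

const uri = buildBuiltinGrammarUri("text", "[0-9][A-Z]{3}");
console.log(uri);
console.log(parseBuiltinGrammarUri(uri));
console.log(resultFitsPattern("3ABC", "[0-9][A-Z]{3}"));      // written form fits
console.log(resultFitsPattern("three boo", "[0-9][A-Z]{3}")); // spoken form does not
```

The anchored regex only tests the written form of a result. As Robert pointed out, it says nothing about normalization or about how people would actually speak the value, so whether a service treats the pattern as a hint or as a hard constraint remains the open question from the call.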
Received on Friday, 21 October 2011 19:39:34 UTC