[minutes] 20 October 2011 from Dan Burnett on 2011-10-21 (public-xg-htmlspeech@w3.org from October 2011)

From: Dan Burnett <dburnett@voxeo.com>
Date: Fri, 21 Oct 2011 15:38:53 -0400
To: public-xg-htmlspeech@w3.org
Message-Id: <9C43BCD5-FA67-463F-94C6-13F03E8A00A9@voxeo.com>
Group,

The minutes from yesterday's call are available at http://www.w3.org/2011/10/20-htmlspeech-minutes.html

For convenience, a text version is embedded below.

Thanks to Patrick Ehlen for taking the minutes.

-- dan

**********************************************************************************
              HTML Speech Incubator Group Teleconference

20 Oct 2011

   [2]Agenda

      [2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0030.html

   See also: [3]IRC log

      [3] http://www.w3.org/2011/10/20-htmlspeech-irc

Attendees

   Present
          Dan_Druta, Dan_Burnett, Michael_Bodell, Debbie_Dahl,
          Robert_Brown, Glen_Shires, Charles_Hemphill, Patrick_Ehlen,
          Milan_Young, Olli_Pettay, Satish_Sampath, Michael_Johnston

   Regrets
   Chair
          Dan_Burnett

   Scribe
          Patrick_Ehlen

Contents

     * [4]Topics
         1. [5]reco element
         2. [6]Can extract grammar information from input fields; have
            a method that allows you to extract grammar from an input
            field?
         3. [7]Method to attribute conversion
         4. [8]casing
         5. [9]grammar URIs with filters on them
     * [10]Summary of Action Items
     _________________________________________________________

 reco element

   <burn> Glen's proposal that we're discussing:
   [11]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

     [11] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

   glen: reco element always visible; opacity not possible to avoid
   clickjacking
   ... should we allow dynamically hiding/showing reco element

   michael: user agents can decide what permissions models they use,
   and grant permissions according to UA policy

   charles: also important to consider handsfree cases; can't rely on
   touch for permissions

   satish: reco should automatically activate for ppl who can't touch
   element ??
   ... there are other ways to "click" reco

   michael: UA could use some of these techniques to enable permissions

   satish: how exactly would this be implemented?

   michael: implement a UI idiom from the browser the user can't
   control that would notify the user

   binding tag for input field

   scribe: "speech IME": User agent that can speech-enable any input
   field

   charles: field-specific reco is better for accuracy

   michael: allowing developer to bind grammar to a specific field;
   increases complexity
   ... if developer is sophisticated enough to do this from an API,
   making a declarative element makes it more complex

   glen: disagree; gives a lot more flexibility and control to both
   developer and user

   charles: a lot of web developers only work w/ HTML
   ... not everyone can do things in javascript, so a declarative
   ability is advantageous

   glen: keep simple things simple. if we can do something simple w/
   reco tag but not UA, then there's a good reason for a reco tag

   <smaug> if someone says he "knows HTML but not JS", he probably
   doesn't know HTML either

   satish: how to assoc. an element w/ an input type

   glen: isn't it easier to have an automatic binding people can use?

   satish: not clear how it would work

   michael: need to work through list of things that are reco-able
   elements

   charles: example on website of multiple input fields each bound to a
   separate grammar

   michael will create specific examples of how binding works for
   different elements

Can extract grammar information from input fields; have a method that
allows you to extract grammar from an input field?

   <glen> SpeechInputRequest.addGrammarFrom(DomInputElement)

   <glen> Retrieves grammar from <input> tag and adds to request.

   michael: would UA be responsible for communicating constraints or
   would it be responsible for generating and sending the grammar
   itself?

   glen: should be reco service that converts into grammar
   ... this would be a way to extract input field specification and
   send it to speech engine in scriptable manner

   burn: Would it be possible then to change these constraints
   dynamically?
   ... how would it work?
   ... what happens if you do it 2x in a row? would grammar sent before
   get replaced by newer one?

   michael: should have a way to control the grammar; but how to
   dynamically remove and change them?

   burn: rename method above to "includeGrammarFrom()" ?
   ... would allow you not to "add" but rather to take a snapshot

   glen: there are other methods that cover these kinds of actions

   <glen> SpeechInputRequest.addGrammarFrom(DomInputElement, weight,
   modal)

   glen: makes sense to add weight and modal flags as well
   ... would expect api developer to be able to enable & disable
   grammar
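
   A minimal sketch of one possible answer to burn's question about
   calling the method twice, assuming "snapshot" semantics in which a
   second call for the same element replaces the earlier entry. The
   class below is a hypothetical mock of the proposed
   SpeechInputRequest, not an existing API; setGrammarEnabled, the
   snapshot fields, and the flightNumber object (a plain object standing
   in for an <input> element) are all assumptions for illustration.

```javascript
// Hypothetical mock of the proposed SpeechInputRequest; not a real API.
class SpeechInputRequest {
  constructor() {
    // Keyed by element, so each element has at most one grammar entry.
    this.grammars = new Map();
  }

  // Snapshot the element's constraints at call time.
  addGrammarFrom(inputElement, weight = 1.0, modal = false) {
    const snapshot = {
      type: inputElement.type,
      pattern: inputElement.pattern || null,
      weight,
      modal,
      enabled: true,
    };
    // A second call for the same element replaces the earlier snapshot.
    this.grammars.set(inputElement, snapshot);
    return snapshot;
  }

  // Assumed helper: enable or disable a grammar without removing it.
  setGrammarEnabled(inputElement, enabled) {
    const entry = this.grammars.get(inputElement);
    if (entry) entry.enabled = enabled;
  }
}

// Plain object standing in for a DOM <input> element.
const flightNumber = { type: "text", pattern: "[0-9][A-Z]{3}" };
const request = new SpeechInputRequest();
request.addGrammarFrom(flightNumber, 1.0, false);
flightNumber.pattern = "[0-9]{4}";               // constraints change later
request.addGrammarFrom(flightNumber, 2.0, true); // replaces the first entry
request.setGrammarEnabled(flightNumber, false);  // disabled, still listed
```

   Under "add" semantics the map would instead be a list that grows on
   every call; the minutes leave that choice open.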

   <glen> SpeechInputRequest.outputToElement(DomElement)

   <glen> Valid DomElements are <input> and <textarea>

   <glen> UA will automatically fill DomElement with results. This
   allows the UA to display continuous streaming of results, and
   properly handle text insertion point.

   <glen> Only one DomElement may be active at a time.

   <smaug> request.onmatch = function(e) { domElement.value = e.result;
   }

   One DOM element active at a time, since you can't stream to 2
   different elements

   scribe: sort of like binding to an element

   Olli: handling of output depends on element type; how would that
   work?

   glen: UA would implement the tricky things, like where to output
   text, etc.

   <mbodell> For request.onmatch you don't want to just do
   domElement.value = e.result as it over writes the content in the
   continuous case

   olli: all that needs to be defined in spec

   glen: for insertion point, handle in a way similar to typing text
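
   A small sketch of the "similar to typing text" behaviour, which also
   avoids the overwrite problem mbodell notes for the continuous case.
   It assumes the field exposes value, selectionStart and selectionEnd
   as HTML text fields do; the field here is a plain object standing in
   for an <input> or <textarea>, and insertAtCaret is a hypothetical
   helper name.

```javascript
// Insert text at the caret (or over the selection) the way typing would.
function insertAtCaret(field, text) {
  const start = field.selectionStart;
  const end = field.selectionEnd;
  field.value = field.value.slice(0, start) + text + field.value.slice(end);
  // Collapse the caret to just after the inserted text.
  field.selectionStart = field.selectionEnd = start + text.length;
  return field;
}

// Plain object standing in for a text field with the caret at index 8.
const field = { value: "Meet at  tomorrow", selectionStart: 8, selectionEnd: 8 };

insertAtCaret(field, "noon");
const afterFirst = field.value;  // "Meet at noon tomorrow"

insertAtCaret(field, " sharp");  // a later streamed result continues from the caret
const afterSecond = field.value; // "Meet at noon sharp tomorrow"
```

   This covers plain text only; as olli points out, other element types
   would each need their own defined behaviour.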

   olli: would need to define so many different cases.

   charles: another thing: UA ought to be able to use focus to enable
   and disable grammars assoc. with input

   glen: should at least work at trying to specify it, perhaps at f2f

   burn: after tech discussions, there will still be a lot of work on
   doc, so perhaps doing this at f2f is not realistic
   ... even if we can't fully specify it, that isn't a fail; it shows
   some thought in that direction

   satish: perhaps choose something simpler to start with

   glen: perhaps restrict to just text, date, etc., rather than
   covering other types?

   michael: can start there and see where we end up

Method to attribute conversion

   <mbodell> see mail
   [12]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0037.html

     [12] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0037.html

   michael: converting input result params into attributes, but grammar
   and custom params are more complicated
   ... want some array of simple structures?

   robert: arrays tend to be default way of doing things in JS

   michael: array of structure of something like the speech grammar?

   Charles: JS can also utilize objects for structures

   Robert: array isn't strongly typed

   satish: don't need helpers for grammars and speech parameters?
   ... why have all that when you can do it with one attribute

   michael: so leaning toward having these structures but not the
   methods discussed above?
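
   To illustrate the direction being discussed, here is a sketch in
   which grammars and custom parameters are ordinary arrays of plain
   objects held in attributes, so no helper methods are required. The
   property names (grammars, parameters, src, weight, name, value), the
   example.org URI, the parameter name and the type=date builtin are
   all assumptions for illustration, not agreed API.

```javascript
// Hypothetical shape: attributes holding arrays of simple structures.
const request = {
  grammars: [{ src: "builtin:input?type=date", weight: 1.0 }],
  parameters: [],
};

// Standard array operations cover what add/remove helpers would do.
request.grammars.push({ src: "http://example.org/cities.grxml", weight: 0.5 });
request.grammars = request.grammars.filter((g) => g.weight >= 1.0);
request.parameters.push({ name: "vendor-x-timeout", value: "3000" });
```

   As Robert notes, nothing here is strongly typed: a misspelled
   property name would not be caught by the array itself.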

casing

   michael: people expect all caps for objects and interfaces

   michael will make those changes for next week

grammar URIs with filters on them

   Robert: it's cool if it works
   ... providing parameters to finite grammars is fine
   ... skeptical of free dictation filter: doubt it could be
   implemented efficiently, and if not, it won't be used

   <mbodell> builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D

   Robert: specifying pattern on input field
   ... how to merge n-gram w/ pattern?

   <mbodell> (which is really
   builtin:input?type=text&pattern=[0-9][A-Z]{3} )
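
   The two forms mbodell pasted differ only by URI escaping. A short
   sketch of producing one from the other, using the standard
   encodeURIComponent and decodeURIComponent functions; the
   "builtin:input" form and its type and pattern parameters come from
   the example above, while the helper names builtinGrammarUri and
   patternFromUri are hypothetical.

```javascript
// Build a builtin grammar URI from an input type and optional pattern.
function builtinGrammarUri(type, pattern) {
  let uri = "builtin:input?type=" + encodeURIComponent(type);
  if (pattern !== undefined) {
    uri += "&pattern=" + encodeURIComponent(pattern);
  }
  return uri;
}

// Recover the unescaped pattern from such a URI, or null if absent.
function patternFromUri(uri) {
  const query = uri.slice(uri.indexOf("?") + 1);
  for (const pair of query.split("&")) {
    const [name, value = ""] = pair.split("=");
    if (name === "pattern") return decodeURIComponent(value);
  }
  return null;
}

const uri = builtinGrammarUri("text", "[0-9][A-Z]{3}");
// uri === "builtin:input?type=text&pattern=%5B0-9%5D%5BA-Z%5D%7B3%7D"
const roundTrip = patternFromUri(uri);
// roundTrip === "[0-9][A-Z]{3}"
```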

   Robert: easy to specify, but prob. hard to implement
   ... will foul up probabilities in ngram model

   michael: such a pattern could be translated into CFG on server
   ... but this isn't necessarily merged w/ free text model

   Robert: but pattern doesn't necessarily represent how people will
   speak it
   ... "three four a" vs. "thirty-four a", etc.
   ... or "three boo"

   Milan: so it's up to speech service to be good at handling that
   ... this is a new way of specifying grammars, that many existing
   speech services don't do now

   Robert: regex doesn't include any kind of normalization

   Michael: real question: is it legal for speech engine to ignore such
   hints (or patterns) and return something that has nothing to do w/
   it? (would hope so)

   Robert: Looks great on paper, but won't be implemented

   Milan: Nothing stopping speech providers from offering this, but
   reluctant to standardize at this point

   glen: HTML already has a lot of this stuff

   Milan: builtins should not be hints; they should recognize what's
   specified or not

   Robert: cool idea; but there is work missing here that should cause
   reluctance on including in spec

   Robert: how would you automate building a CFG off this pattern?

   Milan: what about adopting two types: hints and grammars

   michael: is it legal for speech engine to return something that
   doesn't fit the parameter
   ... for a regex, if the engine returns a result that does not fit
   the pattern, what should happen?
   ... provide some user-facing interface for correction?
   ... nothing wrong w/ a hint that is ignored
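
   A sketch of the two outcomes being weighed when a result does not
   fit the pattern. It assumes the whole result must match, as with the
   HTML pattern attribute; the "hint" and "strict" mode names, the
   helper names and the returned object shape are hypothetical. The
   spoken-form example also shows Robert's normalization point: an
   unnormalized transcript fails a pattern that the normalized form
   passes.

```javascript
// True if the whole result matches the HTML-style pattern.
function fitsPattern(result, pattern) {
  return new RegExp("^(?:" + pattern + ")$").test(result);
}

// "hint": keep a mismatching result but flag it for user correction.
// "strict": reject a mismatching result outright.
function applyResult(result, pattern, mode) {
  if (fitsPattern(result, pattern)) {
    return { accepted: true, needsCorrection: false, value: result };
  }
  if (mode === "strict") {
    return { accepted: false, needsCorrection: false, value: null };
  }
  return { accepted: true, needsCorrection: true, value: result };
}

const pattern = "[0-9][A-Z]{3}";
const normalized = applyResult("3ABC", pattern, "strict");     // accepted
const spoken = applyResult("three a b c", pattern, "hint");    // kept, flagged
const rejected = applyResult("three a b c", pattern, "strict"); // rejected
```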

   Milan: having things that need to be followed exactly, and then just
   hints

   michael: was thinking everything is a hint

   Milan: What if you just want a date and don't want to specify a
   grammar for it?

   glen: if speech engine isn't up to the task, that's an issue w/
   service
   ... most engines should be smart enough to know what a date is. but
   do you say don't use speech if your engine can't do that?

   Milan: no, give error back

   glen: date is a special case

   Milan: but there are lots of those (bool, etc)

   <mbodell> [13]http://www.w3.org/TR/html5/the-input-element.html

     [13] http://www.w3.org/TR/html5/the-input-element.html

   glen: should we bind to every type of input element there is?
   automatic binding is questionable

   Robert: if you need to click a mic to do it, what's the point of
   speech?

   Charles: or handsfree cases

   Milan: Developer has very complex UI. rather than re-write from
   scratch, it references a library

   glen: take checkboxes. grammar would not be a binary, but the term
   bound to the box (e.g., "non-stop")

   <mbodell> For date, look at
   [14]http://www.w3.org/TR/html5/states-of-the-type-attribute.html#date-state

     [14] http://www.w3.org/TR/html5/states-of-the-type-attribute.html#date-state

   <mbodell> it lists that: If the element is mutable, the user agent
   should allow the user to change the date represented by its value,
   as obtained by parsing a date from it. User agents must not allow
   the user to set the value to a non-empty string that is not a valid
   date string. If the user agent provides a user interface for
   selecting a date, then the value must be set to a valid date string
   representing the user's selection. User agents should allow the user
   to

   johnston: need to keep assistive use cases in mind

   Charles: can use UI to highlight these things

   <mbodell> so for input=date we should have the same ruling where it
   can't set it to values that are not valid date strings IMO

   Charles: should be careful ruling things out in general
   ... what is purpose of tag name here?

   glen: if we decide to allow only a single type of input, then you
   don't need tag name. but here distinguishing what element you're
   associating with

   Charles: wouldn't it be redundant to put tag name here?

   glen: no, complementary to binding. can use builtin grammars w/out
   any binding at all
   ... but if the reco tag is bound to an element, then you'd create
   those default grammars automatically & assoc using the tag

   Charles: how to know which builtin goes w/ which element?

   glen: for multiple input fields and only one reco element, then you
   need to specify some grammars yourself

   Charles: thought builtin would specify language-specific things, and
   binding would occur separately

   Robert: couple questions on protocol draft

   Should start w/ those next week
Received on Friday, 21 October 2011 19:39:34 UTC