[minutes] 13 October 2011

From: Dan Burnett <dburnett@voxeo.com>
Date: Fri, 14 Oct 2011 11:35:38 -0400
Message-Id: <217B9035-A2AE-4230-9384-EBBC591371ED@voxeo.com>
To: public-xg-htmlspeech@w3.org
Group,

The minutes from yesterday's call are available at http://www.w3.org/2011/10/13-htmlspeech-minutes.html

For convenience, a text version is embedded below.

Thanks to Charles Hemphill for taking the minutes.

-- dan

**********************************************************************************
              HTML Speech Incubator Group Teleconference

13 Oct 2011

   [2]Agenda

      [2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0016.html

   See also: [3]IRC log

      [3] http://www.w3.org/2011/10/13-htmlspeech-irc

Attendees

   Present
          Dan_Burnett, Michael_Bodell, Olli_Pettay, Dan_Druta,
          Debbie_Dahl, Milan_Young, Satish_Sampath, Charles_Hemphill,
          Michael_Johnston, Robert_Brown, Glen_Shires

   Regrets
   Chair
          Dan_Burnett

   Scribe
          Charles_Hemphill

Contents

     * [4]Topics
         1. [5]Quick API review, particularly the continuous case
         2. [6]continuous recognition -- alternates
         3. [7]Glen's proposal.
     _________________________________________________________

Quick API review, particularly the continuous case

   Michael: sent out updated API
   ... some editorial tasks remain

   <burn> Updated API email is at
   [10]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0017.html

     [10] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0017.html

   Michael: didn't renumber yet to avoid confusion.

   <burn> Updated API document is at
   [11]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/att-0017/speechwepapi.html

     [11] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/att-0017/speechwepapi.html

   Michael: some minor questions
   ... default way to get language
   ... end event
   ... 7.1 set language
   ... default values with an attribute - what about language for the
   default value

   Satish: pick up from the document

   Michael: how to phrase?

   Satish: take language mentioned in the markup - if not markup, take
   off the body tag
   ... reco element, can specify lang attribute

   Michael: user agent sets if not otherwise set?

   Satish: can mandate something in the API - when connect to the
   speech service
   ... page specified, for example.
   ... optional in the API - can pick up in the UI

   Michael: When UA communicates with speech service, needs to
   communicate language.
   ... can go off HTML body language
   ... value assigned to that attribute in JavaScript?
   ... assigned at open? Can JavaScript get it?
   ... only for user setting?
   ... only for communicating with the speech service.

   Olli: Assume null unless set.
   ... UA would send default

   Satish: agree with that also - gets sent in the protocol

   Debbie: Script could change?

   Satish: yes, could change.

   Dan: But script can't read the default value.
   ... seems like a failure

   Satish: quite common
   ... e.g., width of element.

   Dan: try to get value before set?

   Satish: empty object or null string or empty string.

   Dan: Go with it if that's how it works.

   Charles: Should check that behavior.

   Satish: can check

   Michael: Section 7.3, end of it
   ... from incorporating idea to move away from binding connection - 4
   methods
   ... and 3 event handlers
   ... open start and end events.
   ... what is end for?
   ... disconnect as a result of abort?

   Olli: Need progress event
   ... load end event after load
   ... always an end event

   Michael: get when end of reco is done
   ... no matter what the cause.

   Satish: lang attribute - don't assign, then get empty string. Assign
   and get the same value.

   Michael: what about width?

   Satish: Thought lang more appropriate.

   Charles: Try a known attribute to avoid custom attribute behavior.

   Satish: will check.

   Michael: end event - fired at the end of the connection no matter
   what the cause.

   Dan: when end of reco is done, no matter what the cause.
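
   The "end fires no matter what the cause" behavior agreed above can be
   sketched as follows. This is only an illustration under stated
   assumptions: the class name RecoSession and the finish() helper are
   hypothetical and do not come from the draft API; only the handler
   names (onstart, onresult, onerror, onend) echo the discussion.

```javascript
// Hypothetical sketch: RecoSession and finish() are illustrative names,
// not part of the draft API discussed in these minutes.
class RecoSession {
  constructor() {
    this.onstart = null;
    this.onresult = null;
    this.onerror = null;
    this.onend = null;
    this.active = false;
  }

  start() {
    this.active = true;
    if (this.onstart) this.onstart();
  }

  // Every way a recognition can finish goes through this one method,
  // so onend fires exactly once per session regardless of the cause.
  finish(cause, payload) {
    if (!this.active) return; // already ended: do not fire onend twice
    this.active = false;
    if (cause === "result" && this.onresult) this.onresult(payload);
    if (cause === "error" && this.onerror) this.onerror(payload);
    // An abort fires no cause-specific event in this sketch.
    if (this.onend) this.onend();
  }

  abort() {
    this.finish("abort");
  }
}

// Usage: three sessions ending for three different reasons.
const log = [];
for (const cause of ["result", "error", "abort"]) {
  const session = new RecoSession();
  session.onend = () => log.push(cause + ":end");
  session.start();
  session.finish(cause, null);
  session.finish(cause, null); // ignored, the session has already ended
}
console.log(log);
```

   Funnelling every exit path through one method is one way to keep the
   guarantee simple for authors: cleanup code can live in onend alone.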

continuous recognition -- alternates

   Satish: alternates
   ... what was missing in the first proposal?
   ... can add what was missing or add alternates in the most recent
   proposal.

   Michael: Talk about the current proposal - missing something?
   ... Did talk about alternates.

   Satish: When results not finalized yet. Get when finalized? Don't
   see that.

   Michael: Still have alternates. Contained in the triple.
   ... have n-best alternate list, even with final results.

   Satish: Alts can span word boundaries?

   Michael: Yes - no word boundaries.

   Satish: Can't have one alternate that spans more than one result?

   Michael: Can't change the number of results.
   ... easiest to write up - Milan had longer example.
   ... select or highlight word or phrase - might want alternates to
   pop up for correction - this is supported.
   ... different orders would be more difficult.
   ... to change boundaries, need giant correct
   ... when with n-best correction mechanism.

   Satish: OK if we have alternates for final results.
   ... seems fine
   ... have one example that shows alternates for final results.

   Milan: Questions: how to represent finalized elements?
   ... every item in the array has an attribute for final (boolean)?

   Michael: Yes.

   Milan: want to index from 0?

   Michael: Yes, from 0.

   Milan: will send out new example.
   ... finals with boolean flag, plus alternates.
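
   Pending Milan's example, the shape discussed above (an array indexed
   from 0, a boolean final flag per item, and an n-best alternate list
   even for final results) might look roughly like this. All field and
   function names here are assumptions made for illustration; the
   minutes do not fix them.

```javascript
// Assumed result shape: "final", "alternates", "text" and "confidence"
// are illustrative field names, not taken from the draft.
const results = [
  { final: true,  alternates: [{ text: "call", confidence: 0.9 },
                               { text: "tall", confidence: 0.4 }] },
  { final: true,  alternates: [{ text: "home", confidence: 0.8 },
                               { text: "Rome", confidence: 0.3 }] },
  { final: false, alternates: [{ text: "now",  confidence: 0.5 }] },
];

// Join the top alternate of each result, optionally skipping interim ones.
function topTranscript(items, finalOnly) {
  return items
    .filter((r) => !finalOnly || r.final)
    .map((r) => r.alternates[0].text)
    .join(" ");
}

// Correction within one result: promote a chosen alternate to the top.
// The number of results never changes, matching the constraint that an
// alternate cannot span more than one result.
function promoteAlternate(items, index, altIndex) {
  return items.map((r, i) => {
    if (i !== index) return r;
    const chosen = r.alternates[altIndex];
    const rest = r.alternates.filter((_, j) => j !== altIndex);
    return { ...r, alternates: [chosen, ...rest] };
  });
}

console.log(topTranscript(results, false)); // call home now
console.log(topTranscript(results, true));  // call home
console.log(topTranscript(promoteAlternate(results, 1, 1), false)); // call Rome now
```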

   Dan: Olli brought up Glen's proposal.
   ... didn't discuss so far.
   ... can walk through?
   ... reco from text string?

   Michael: Don't have it - at least not directly.
   ... can reco from URI - can use data URI.
   ... not direct or straightforward.

   <satish> Regarding reading back a lang attribute that was not yet
   set - I tested with the CSS width property, and reading back a width
   attribute that was not set returns an empty string.

   Dan: Obvious and simple to do or not.

   <satish> HTML without width set: <body
   onload='alert(document.getElementsByTagName("div")[0].style.width)'>
   <div>Hello</div></body> - shows a dialog with an empty string

   Debbie: want to set parameter recognized from text.

   <satish> HTML with width set: <body
   onload='alert(document.getElementsByTagName("div")[0].style.width)'>
   <div style='width:100px;'>Hello</div></body> - shows a dialog
   with "100px" as the text

   Michael: May interfere.

   Robert: Can do other things.
   ... call emulation rather than start that uses text property?

   Dan: like that.
   ... important for it to be obvious that coder is not using audio.
   ... needs to be clearly different.
   ... emulate good way to do it.

   Debbie: Don't need to worry about parameters that don't make sense
   such as end-point detection.

   Dan: But doesn't hurt for them to be there.

   Robert: Can ignore them.

   Dan: attributes, not parameters, so it can work.
   ... simple to do? write up 1 new method and description -
   ... certain parameters ignored or have certain values.
   ... e.g., result has confidence of 100?

   Debbie: no - might be doing some parsing, although reco confidence
   could be 100.

   Michael: might not be 100 even if emulated.

   Robert: have 3 different emulation implementations - leave as open
   as possible.

   Debbie: looks like start method.
   ... events that come back are the same
   ... ones that don't make sense don't come back.

   Dan: semantic interpretation - would rather not have onsoundstart,
   etc. come back.
   ... different with emulation.
   ... with substantial emulation, could be parameters for everything.

   Debbie: make as similar as possible and fine tune later.

   Dan: what would be the harm if onsoundstart, etc. came back?
   ... should know what that means if they call the emulate method.
   ... just need to worry about confusion that it was from a start.

   Robert: uneasy about emulation spec

   Dan: MRCP needs semantic interpretation only.

   Robert: not everyone wants to do that.

   Dan: one shot case should not be a problem.
   ... what about custom pronunciations, etc.
   ... but can be useful in common cases.

   Robert: those are the common cases.
   ... people choose unusual names for things.

   Dan: if speaking, then normal recognition.

   Robert: Simulating spoken input.

   Milan: want all methods?

   Robert: Don't - don't think they are meaningful.
   ... wouldn't want to fire onaudiostart unless there is a system that
   can represent them.

   Dan: Would like interpret rather than emulate.
   ... UTF-8 text and interpret it.

   Robert: Find semantic interpretation to be almost useless in the
   vast majority of apps.

   Milan: app with SRGS grammar, want to send text to it.
   ... doesn't help with punctuation, etc.
   ... must match tokens.

   Michael: emulation can do something smart.

   Robert: can do something like that - works perfectly when testing,
   but confusable parts in reality with audio.
   ... won't get n-best, etc.

   Michael: given range of target users - worth having mechanism to
   send a string and get an interpretation.
   ... experimenting for own site.
   ... be careful how we describe the method.
   ... ask for confusable parts.
   ... be careful about what we say we get back.
   ... Sounds like there is rough agreement for emulate or "recognize
   from text" method.

   Dan: not emulate.

   Robert: interpret.

   Dan: clear if pass in as a parameter.

   Michael: do want to trigger event - can get nomatch.

   Dan: difference between result events and progress (audio related)
   events.
   ... would rather not get audio-related events.

   Michael: Can get other result oriented events.
   ... oninterpret?

   Dan: needs to be an end - maybe not oninterpret.

   Michael: oninterpret similar to onstart - make changes while waiting
   for results to come back.
   ... might change the UI while interpreting.

   Dan: Could take time - fair enough. In favor of oninterpret.
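
   A minimal sketch of the "interpret" idea as discussed: text goes in,
   result-oriented events (oninterpret, then onresult or onnomatch, then
   onend) come back, and no audio-related events are fired. The class
   name TextInterpreter and the phrase table standing in for a grammar
   are hypothetical, and the exact-token matching is only one possible
   behavior; a real implementation could "do something smart" instead.

```javascript
// Hypothetical sketch: TextInterpreter and its phrase table are
// illustrative. A real service would match against an actual grammar.

// Split on whitespace and lower-case; punctuation is NOT stripped, which
// shows the "must match tokens" limitation raised in the discussion.
function tokenize(text) {
  return text.trim().toLowerCase().split(/\s+/).filter(Boolean);
}

// Call a handler only if one has been assigned.
function fire(handler, arg) {
  if (typeof handler === "function") handler(arg);
}

class TextInterpreter {
  constructor(phrases) {
    // Map from normalized phrase to its interpretation.
    this.phrases = new Map(
      Object.entries(phrases).map(([p, v]) => [tokenize(p).join(" "), v])
    );
    this.oninterpret = null;
    this.onresult = null;
    this.onnomatch = null;
    this.onend = null;
  }

  interpret(text) {
    fire(this.oninterpret); // like onstart: the page may update its UI
    const key = tokenize(text).join(" ");
    if (this.phrases.has(key)) {
      fire(this.onresult, { text: key, interpretation: this.phrases.get(key) });
    } else {
      fire(this.onnomatch);
    }
    fire(this.onend); // always fired, and no audio events anywhere
  }
}

// Usage: record which events fire for a match and for a near miss.
const events = [];
const interpreter = new TextInterpreter({ "call home": { action: "dial", who: "home" } });
interpreter.oninterpret = () => events.push("interpret");
interpreter.onresult = (r) => events.push("result:" + r.interpretation.action);
interpreter.onnomatch = () => events.push("nomatch");
interpreter.onend = () => events.push("end");

interpreter.interpret("Call Home");
interpreter.interpret("call home, please");
console.log(events);
```

   The second call fails to match because "home," with its comma is a
   different token from "home", which is the punctuation concern raised
   for grammar-based matching.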

   Satish: Don't set a width - get empty string. Set and then get what
   you set.
   ... in meeting notes.

   Michael: that approach should work for language.

   Satish: yes.
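
   The agreed behavior for language (reading lang before it is set
   yields an empty string, reading after assignment yields the assigned
   value, and the UA still sends some default to the speech service)
   could be sketched like this. The class name RecoLang, the
   effectiveLang() helper, and the fallback order (attribute, then
   document language, then UA default) are assumptions pieced together
   from the discussion, not text from the draft.

```javascript
// Hypothetical sketch: RecoLang and effectiveLang() are illustrative.
class RecoLang {
  constructor(documentLang, uaDefaultLang) {
    this.documentLang = documentLang;   // e.g. language declared on the page
    this.uaDefaultLang = uaDefaultLang; // the user agent's own default
    this._lang = "";                    // unset: script reads back ""
  }

  get lang() {
    return this._lang;
  }

  set lang(value) {
    this._lang = String(value);
  }

  // What the UA would communicate to the speech service. Script cannot
  // read this default through the lang attribute itself.
  effectiveLang() {
    return this._lang || this.documentLang || this.uaDefaultLang;
  }
}

const reco = new RecoLang("fr-FR", "en-US");
console.log(JSON.stringify(reco.lang)); // ""
console.log(reco.effectiveLang());      // fr-FR

reco.lang = "de-DE";
console.log(reco.lang);                 // de-DE
console.log(reco.effectiveLang());      // de-DE
```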

   <glen>
   [12]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

     [12] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Oct/0000.html

Glen's proposal.

   <burn> above URI is for Glen's reco tag proposal

   Glen: Be more declarative.
   ... simple for html developers.
   ... support copy and paste.
   ... keep simple things simple.
   ... additional option as a permissions model.
   ... icon that indicates "I'm speech enabled".
   ... user gives permission by clicking.
   ... make it obvious when audio is captured.
   ... says that this is a spot where you can speak into a Web page.
   ... example 1 - no specific binding except in JavaScript except
   onresult.
   ... methods intended to match what's in JavaScript.
   ... example 4 - does appending rather than overwriting.
   ... example 3 - assigns value and submits the form.
   ... apply the same technique for text areas.
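
   The three result-handling behaviors Glen describes (assign the value,
   append rather than overwrite, assign and submit the form) can be
   sketched without a browser as below. The plain objects stand in for
   an input field and its form, and every function name is hypothetical;
   the actual examples are in the linked proposal and may differ.

```javascript
// Stand-ins for an <input> and its <form>; not real DOM objects.
function makeField(initialValue) {
  const form = {
    submitted: false,
    submit() {
      this.submitted = true;
    },
  };
  return { value: initialValue, form };
}

// Overwrite the field with the top recognition result.
function overwrite(field, text) {
  field.value = text;
}

// Append to what is already there, as one might for a text area.
function append(field, text) {
  field.value = field.value ? field.value + " " + text : text;
}

// Assign the value and submit the owning form in one step.
function assignAndSubmit(field, text) {
  field.value = text;
  field.form.submit();
}

const search = makeField("old query");
overwrite(search, "weather in Boston");
console.log(search.value); // weather in Boston

const notes = makeField("buy milk");
append(notes, "and eggs");
console.log(notes.value); // buy milk and eggs

const quick = makeField("");
assignAndSubmit(quick, "flights to Lyon");
console.log(quick.value, quick.form.submitted); // flights to Lyon true
```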

   Charles: JavaScript loses declarative approach.

   Michael: Not as simple as this.value. Have n-best, etc.

   Glen: Can have a simple syntax for the top result.
   ... either way simple cut and paste.
   ... continuous vs. noncontinuous.
   ... interim=true

   Michael: "for" lets you choose grammars.
   ... type number rather than type text.

   Glen: Two reco elements. ... icon that you click on.

   Satish: JavaScript needed for more complex things.
   ... want to submit form, etc.
   ... never purely declarative.

   Charles: Should be some simple declarative cases.

   Michael: compromise by being both.
   ... add various attributes.
   ... parameters and event handlers seems useful.
   ... can do without losing for connection.

   Glen: "for" connection - can eliminate one bit of JavaScript.
   ... don't see as strong argument.
   ... get grammar also.
   ... might want to constrain grammar to zip code or other specific
   things.

   Michael: HTML5 pattern attribute.
   ... get checking and can tie to speech.
   ... ties the reco to the input with "for" just like label.
   ... can also be implicit (by wrapping).
   ... examples would be tied to the input.

   Robert: Simply assigning a value to a text field can be
   accomplished.
   ... what if want to advance the cursor or insert text.
   ... should just be built in.

   Glen: Don't know if can generalize if only inserted at the cursor.

   Satish: Keep the API as an API.
   ... not try to implement text input.

   Michael: Web app author says pattern and that they want speech.
   ... saying UA will speech enable, but we are saying the Web app
   author will specify grammars, patterns.
   ... would like association with markup.
   ... want web app author to have some control.
   ... don't want to lose "for" attribute.
   ... can use both approaches.

   Olli: pattern attribute is a regular expression. How would speech
   services handle that?
   ... thought a grammar would be needed.

   Michael: UA should give information about a relevant grammar.
   ... no grammars in the examples from Glen.

   Olli: if support patterns and automatic binding, need to specify how
   this works.

   Milan: Built-in grammar that supports patterns?

   Michael: VoiceXML has parameterized grammar specifications.
   ... don't match with HTML5, so need to do the mapping.
   ... OK with not making the mapping as long as browsers are able to
   do this.

   Satish: "for" attribute problem - have so many things to support.
   ... for is mostly visual
   ... needs to say which controls for speech.

   Glen: Can get grammar from input field with "for" - big advantage.
   ... otherwise don't see advantage.
   ... "for" could apply to input fields only.

   Michael: button, input, text area, etc. specified so far.
   ... based on HTML5.

   Charles: Model forces user to select the "mic" image.

   Glen: Keep simple things simple.
   ... Show microphone or not?

   Charles: Would be an orthogonal parameter.

   Glen: Visual element to know it's speech enabled and for permission
   model.

   Dan: in Glen's model icon shows up. User clicks on it and goes into
   a different field.
   ... nice to see cursor in field to know where input goes.

   Michael: "for" is very powerful.

   Satish: Could have reverse mapping.
   ... taking result and putting it in there.
   ... list things
   ... do we need a "for".
   ... onfocus - to avoid click.

   Michael: Confusing if one reco element that applies to different
   input element.

   Satish: 5 different input fields - want to do with all. Want
   different UI.

   Dan: "for" is optional.

   Michael: Says what to do in a form.
   ... can use reco outside of a form.

   Satish: what will be use case for multiple inputs and reco tags.

   Michael: Can have associated with each input element. Can be
   appropriate.

   Dan: Good discussion, but not close to resolving.
   ... should be more discussion.
   ... possibly more phone call time next week.
   ... send e-mail on the list.
   ... very important to decide on this.

   Glen: TPAC reminder - deadline tomorrow.

   Michael: deadline extended one week.

   <glen> [13]http://www.w3.org/2011/11/TPAC/

     [13] http://www.w3.org/2011/11/TPAC/

   <glen> Registration fee goes up tomorrow Oct 14

   <smaug> glen: nope

   <smaug> it was extended to Oct 21, IIRC
Received on Friday, 14 October 2011 15:36:18 GMT