Fwd: minutes from Dan Burnett on 2010-11-03 (public-xg-htmlspeech@w3.org from November 2010)

From: Dan Burnett <dburnett@voxeo.com>
Date: Wed, 3 Nov 2010 05:26:31 -0400
To: public-xg-htmlspeech@w3.org
Message-Id: <87830672-555F-4B52-945E-489656C6387A@voxeo.com>
These are the raw minutes from the first session yesterday.
I will update the names when I consolidate all the minutes after  
Thursday's meeting.

-- dan

Begin forwarded message:

> From: "Jim Barnett" <Jim.Barnett@alcatel-lucent.com>
> Date: November 2, 2010 10:24:25 AM EDT
> To: "Dan Burnett" <dburnett@voxeo.com>
> Subject: minutes
>
> Attendees: Paolo Baggia,  Ingmar Kliche, Debbie Dahl,  Michael  
> Bodel,  Dave Burke,  Dan Burnett, Matt Womer,
> Milan Young (observer),  Kazuyuki Ashimura, Rahul Akolkar,  
> Oliver ??,  Dan Druta (??), Robert Brown,
> Satish Sampat, Bjorn Bringert (??), Jeong Park, (???) Chu, Sinja(??)  
> Lee, Jim Barnett (observer)
>
> Initial discussion is a review of the charter and goals of the  
> group. This group is an incubator
> group and will not produce recommendation-track documents, but will  
> produce proposals for
> recommendations.
>
> Requirement R29.  "Web application may only listen in response to  
> user action"
> Bjorn: The point of this is security and privacy, don't want the  
> browser snooping without
> the user's knowledge.  There's been discussion of what sort of user  
> action is needed.
> Debbie: Do we want explicit end user consent instead?
> Dan Druta: How long does the consent last for?  Notification might  
> be better, that lets the
> user know that speech is activated.
> Dave Burke:  That's necessary but not sufficient.
> Michael: R32 covers notification.
> Dave Burke: There is precedent in file access, camera access,  
> microphone access, for how to
> do this.  We should think about how it fits into the web platform.  
> Can't  be too annoying to the user.
> Michael:  We still need to think about what the user action is.  Is  
> it browsing to a page,
> clicking a link?  There are cases where there aren't many visual  
> cues possible.
> Satish: Could we leave it up to the browser what the user action  
> is?  In a handsfree app,
> the user action might be quite different from that in a normal web  
> browser.
> Debbie:  But we do want to keep random websites from recording your  
> speech.  I had assumed
> that this was clicking a button.
> Michael:  That is too specific.
> Dan B:  Is this a requirement on the web application or on the  
> browser?  The web app may
> not begin recording speech without explicit permission from browser,  
> where browser is the
> proxy for the user.
> Debbie:  This is a requirement for the ultimate spec.
> Bjorn: The requirement is on the web app.  It is not allowed to  
> start listening on its own.
> Michael:  It is also a requirement on the user agent.
> Bjorn: A user agent should be able to do whatever it wants.
> Michael:  If my home page is a voice search page, I don't want to  
> get asked for permission
> each time I go there.
> Michael:  We may have agreement on the following:  The user agent  
> may not allow the application
> to record without user consent.   We may want to say more than this,  
> there may be requirements
> on how the user agent gets user consent, but we don't agree on this  
> yet.
> Dave:  file access is a good example.  A web page cannot open a file  
> unless user clicks on button
> to indicate he wants file access.  Don't need explicit permission  
> beyond this.
> Robert: In a medical records application, you will want voice  
> control over everything.  The
> app is built for you to talk to it.
> Bjorn: So the question is whether you have to give consent each  
> time, or is once enough?  How
> long does consent persist?  3 levels:  Have to click for every  
> utterance, have to click once
> each time you load the page, or click once and grant access  
> forever.  Another question is
> whether there is a button in the page, or one in the browser chrome.
> Michael: There are a variety of trust relations between the user and  
> a browser.
> Robert:  Is this different from all the other things we have to set  
> policy on in browsers?
> Michael: Speech is a hybrid:  it's sort of a resource like a cookie,  
> but it's also a form
> of input.
> Debbie: Consider the case where you give a page permission to listen to  
> you; you don't want it
> to listen to you when you're talking to your friend.
> Bjorn: Only page that's in focus should be able to record.  A  
> background tab shouldn't  be
> able to record.  Could have a requirement that user sees recognized  
> input before it is submitted
> to page, but that could get clumsy.
> Dan Druta:  We should have another use case covering initiation of  
> speech including user
> granting permission.
> Dan Burnett: To add another requirement, have to present specific  
> use cases to go with it.
> Robert:  R29 should use "capture" rather than "recognize".
> <end of discussion on this requirement>
>
> R27.  No one believes that we need more than what SSML provides for  
> audio.
> Michael: Are there people who want less than full SSML?
> (It appears that there are not.)
> Milan: Maybe we should say "use standards when available" because  
> there is no standard
> for statistical LMs, and you don't want to be banned from using them.
> Dan B: Yes, we may have to add a "where available" clause in 27a and  
> 27b.
> We may want to say: implementations _must_ support SRGS and SISR and  
> the API
> must not prohibit the use of non-standard formats.
> Robert:  SRGS has two formats: ABNF and XML.  Which one do we mean?
> Bjorn: VXML requires XML format.  Shall we do the same?
> Dan B: 27a: "implementations must support the XML format of SRGS and  
> must support SISR"
> 27b: "implementations must support SSML"  <There is agreement on  
> this wording>
>
Received on Wednesday, 3 November 2010 09:27:11 UTC