- From: Dan Burnett <dburnett@voxeo.com>
- Date: Thu, 12 May 2011 13:40:12 -0400
- To: public-xg-htmlspeech@w3.org
Group,
The minutes from today's call are available at http://www.w3.org/2011/05/12-htmlspeech-minutes.html 
.
For convenience, a text version is embedded below.
Thanks to Dan Druta for taking the minutes!
-- dan
***************************************************************
Attendees
    Present
           Dan_Burnett, Bjorn_Bringert, Milan_Young, Michael_Bodell,
           Robert_Brown, Dan_Druta, Debbie_Dahl, Charles_Hemphill,
           Olli_Pettay, Michael_Johnston, Patrick_Ehlen
    Regrets
           Marc_Schroeder
    Chair
           Dan Burnett
    Scribe
           Dan_Druta
Contents
      * Topics
          1. Updated final report draft
          2. Design Decisions with agreements
          3. Issues discussed in the appendix
          4. Audio Codecs
          5. F2F Logistics
      * Summary of Action Items
      _________________________________________________________
    <burn> trackbot, start telcon
    <trackbot> Date: 12 May 2011
    <bringert_> I'm having connectivity issues
    <bringert_> and it looks like I'm in here twice
    <ddahl> bjorn, we can hear you
    <bringert> ok, I can't hear anyone else
    <bringert> try a different connection
    <bringert> trying
    <burn> Scribe: Dan_Druta
    <burn> ScribeNick: DanD
    <burn> Agenda:
    [11]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011May/0005.html
      [11] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011May/0005.html
Updated final report draft
    burn: Made a few changes
    ... Topic: Design Decisions with agreements
Design Decisions with agreements
    Burn: Any new items we agree on?
    ... No design decisions?
    ... Any other topics to be discussed later?
Issues discussed in the appendix
Audio Codecs
    robert: We don't think we should agree on a codec. We should look at a
    few items: one is bandwidth, IP issues,
    ... there are trade-offs
    ... fidelity is another issue
    burn: We want the ideal codec but there is no such thing
    ... Opus is a combination of codecs and an attempt to provide an
    industry standard
    Milan: RTCWeb is looking at Opus
    burn: The issue is which audio codecs are mandatory to support
    mbodell: The question is if you can recognize an audio file
    Milan: is the synthesizer also part of this?
    bringert: Three items: 1. Codecs used for remote speech engine
    <bringert> 1. codecs used between browser and web app specified
    recognizer
    Milan: 2. Codecs used for file speech
    <bringert> 2. codecs used between web app and browser for
    recognition of existing audio
    <smaug> there is terrible echo now
    <bringert_> 3. codecs used between browser and web app specified
    synthesizer
    mbodell: we should allow other codecs to be used
    Milan: Sounds like requirements
    robert: Microsoft uses SIREN owned by Polycom.
    burn: Voxeo supports all
    <bringert> Google uses Speex, FLAC and AMR
    Milan: Opus has the notion of cutting off audio to save bandwidth
    ... speech has a critical requirement to capture the first part
    burn: There are several codecs in Opus
    ... There was an attempt to merge
    Michael: there is the issue of support in mobile devices (hardware)
    ... for the mobile browsing we can rely on hardware and fall back
    bringert: The one codec that must be supported is Speex
    ... Caution - there's no container format
    burn: another issue is transport (framing)
    Milan: It isn't an IETF standard
    burn: It will require some sort of support for RTP
    ... How much SIP support will be needed?
    ... There's disagreement and not everybody wants a full SIP stack
    bringert: how about OGG?
    <bringert> Speex codec in OGG container
    <burn> s/disarrangement/disagreement/
    burn: It is appropriate not to commit yet and review next week
    Milan: It would be useful to know about streaming
    mbodell: Add a fourth item to the list of elements: support for
    streaming
    Milan: can we agree that the architecture should support streaming?
    bringert: I'm fine if we support streaming before the engine starts
    processing
    Milan: Recognizer should be able to return results before the end of
    speech
    burn: Recognizer should be able to return final result before the
    end of speech
    bringert: This rules out HTTP
    mbodell: You can't get duplex but you can get intermediary responses
    Milan: The client can chunk up responses
    ... Is it a violation if we use web sockets?
    <bringert> I'm muted
    <smaug> burn: we don't seem to have scribe anymore
    burn: We need to be careful not to go in a different direction from
    RTCWeb
    mbodell: different protocol for different use cases
    ... http works well for certain cases
    robert: we don't want to overcomplicate
    ... RTC has a different set of requirements
    burn: you are right
    bringert: We have two choices: we go with http and add RTCweb
    robert: or web sockets
    bringert: is anyone opposing support for HTTP?
    ... for streaming
    ... We support it in Chrome 11
    ... We want to have http used for other interactions between the
    user agent and server
    mbodell: It's not just audio if we understand correctly
    ... different apps would use different approaches
    burn: we can't predict how it will be used
    Milan: there's a continuous response
    robert: I'd like to see a proposal before we agree
    <bringert>
    [12]http://tools.ietf.org/html/draft-zhu-http-fullduplex-02
      [12] http://tools.ietf.org/html/draft-zhu-http-fullduplex-02
    Milan: I agree with a solution that uses HTTP as a basic but not
    the full solution
    robert: I would not call Web Sockets HTTP and I'd like to see a
    proposal
    bringert: We should be able to use HTTP
    burn: We are saying we are mandating HTTP, not eliminating the
    potential support for others
    bringert: the server does not know what's supported on the browser
    robert: we need some discovery capability
    burn: We believe Web Sockets will not be mandated for support
    Milan: I'm not asking for that but a solution for bidirectional
    support
    ... if HTTP can do bidirectional we're fine
    bringert: there's no reason not to support HTTP.
    <burn> bringert: would love bidirectional support if we had a good
    solid candidate for it
    Milan: Instead of saying HTTP is required let's list the elements
    bringert: We should require HTTP
    burn: Agreement - we require http support for all communications and
    allow for others
    mbodell: I'd like to have a solution for bidirectional support but
    we should not block the spec
    burn: other topics around codecs?
    mbodell: some audio codecs that support audio and video
    ... recognize audio from a video+audio stream
    bringert: I would suggest we don't send video to reduce bandwidth
    ... if we don't have strong use cases we should not add it to the
    spec
    ... Should we disallow sending video?
    burn: no agreements and the best way is not to make any other
    statements
    ... add this to the list of topics
    ... nobody is talking about gesture recognition just audio
    ... we will get back to this
    ... Other items related to codecs?
    Milan: are there any other candidates?
    burn: OPUS. Big but with support for different use cases
    <mbodell>
    [13]http://en.wikipedia.org/wiki/Comparison_of_audio_codecs
      [13] http://en.wikipedia.org/wiki/Comparison_of_audio_codecs
F2F Logistics
    bringert: no updates
    ... I will come back with directions from the hotel to the offices
    ... We sent the directions from the airport
    ... everybody should have gotten the email
    burn: it would still be good if we have some directions from the
    hotel to the Google offices
    ... one more call before the f2f
    bringert: There's a statement about the agreement on the user
    interface that is not well captured
    burn: Yes, I somehow dropped the most important decision -- that it
    must NOT be possible to customize the part of the user interface
    that indicates the microphone is open. I will add that in.
Received on Thursday, 12 May 2011 17:56:12 UTC