Re: Split TTS and Speech Recognition? from Doug Schepers on 2013-12-11 (public-speech-api@w3.org from December 2013)

From: Doug Schepers <schepers@w3.org>
Date: Wed, 11 Dec 2013 03:49:21 -0500
To: "Young, Milan" <Milan.Young@nuance.com>
CC: "Raj (Openstream)" <raj@openstream.com>, Glen Shires <gshires@google.com>, Web Speech <public-speech-api@w3.org>
Message-ID: <52A82711.5000806@w3.org>

Hey, Milan–

On 12/9/13 3:41 PM, Young, Milan wrote:
> Please excuse the late response.  I have not been actively monitoring
> this list for some time.
>
> Contrary to Glen's assertion, I believe a unified spec would indeed
> accelerate implementation.

Just to be clear, I was suggesting that we should split the specs; Glen 
favored keeping them unified.

I've heard rumors that TTS is landing in Chrome; I don't know about ASR. 
If TTS is moving faster in implementations, I still think it makes sense 
to split them.

But again, I'm just putting out a trial balloon to see if there's 
support for the notion. I don't have strong suggestions about where such 
a spec would land.

Regards-
-Doug

> Speaking for Nuance, a global leader in
> the field of both recognition and TTS, we would gladly begin
> implementation if the spec were sanctioned under a WG.  Splitting
> recognition from TTS on a temporary or even permanent basis seems
> like a small price to pay for this greater good.
>
> Regards
>
>
>> -----Original Message-----
>> From: Raj (Openstream) [mailto:raj@openstream.com]
>> Sent: Wednesday, October 09, 2013 4:49 AM
>> To: Doug Schepers; Glen Shires
>> Cc: Web Speech
>> Subject: Re: Split TTS and Speech Recognition?
>>
>> Speaking from my vantage position, I find both the arguments
>> plausible, recognizing that more work needs to be done before the
>> current artifacts become SPECs.
>>
>> To GLEN's point, implementors can still implement part of the SPEC
>> ( and it could be just TTS).. and yes, there are plenty of
>> use-cases ( again for a web developer) for just using TTS in the
>> apps.
>>
>> It's not clear to me, how and why keeping them in "SYNCH" would be
>> a better thing to do..( aside from the convenience of reading one
>> spec as opposed to two)...and at the same time, not sure how
>> splitting them into two, would make it more attractive/likely for
>> any other group to absorb...
>>
>> IMHO, implementors can take any portion of any spec and conform to
>> the extent of their capability and desire... and so can WGs..
>>
>> But, yes, it'll continue to be frustrating that we have so many
>> "SPECs" that are not standards from a developers'/implementors'
>> point of view.
>>
>> Raj
>>
>> On Wed, 09 Oct 2013 05:25:10 +0200 Doug Schepers <schepers@w3.org>
>> wrote:
>>> Hi, Glen–
>>>
>>> I'm not trying to be pesky about this, and I'm not going to get
>>> pushy. But I'd like you to reconsider this, and I'd like to hear
>>> from others what they think (especially implementers).
>>>
>>>
>>> On 10/8/13 8:40 PM, Glen Shires wrote:
>>>> A unified spec hasn't slowed implementations, as there are
>>>> currently browsers that implement the ASR portion and not the
>>>> TTS portion, and browsers that implement the TTS portion and
>>>> not the ASR portion.
>>>
>>> This would seem to be an argument for splitting them up, not
>>> keeping them together. They are moving at different rates.
>>>
>>>
>>>> (And speech aside, there are many examples where implementors
>>>> implement a spec in parts.)
>>>
>>> Yes, but this is not good for web developers. It's to be avoided,
>>> if possible. With my web developer hat on, this is really
>>> frustrating. This is why CSS took a more modular approach, which
>>> is working pretty well in terms of consistency and
>>> interoperability.
>>>
>>>
>>>> Also, keeping TTS and ASR together avoids the problem of having
>>>> to sync things up in the future.
>>>
>>> Speaking from a position of ignorance and curiosity, what things
>>> need to be synced up between TTS and ASR? They seem pretty
>>> orthogonal from my reading of the spec.
>>>
>>>
>>>> As the unified spec matures, it may have a better chance of
>>>> finding a unified home in one of the major W3C groups, such as
>>>> HTML.
>>>
>>> I'm not sure I follow your reasoning there. Why would a single
>>> spec have a better chance of being adopted by a WG than 2 smaller
>>> specs?
>>>
>>>
>>> Is there some concern that one would get implemented, and not
>>> the other, so keeping them together might incent implementers to
>>> do both?
>>>
>>>
>>> Finally, I just want to be clear that this request is not me
>>> speaking with my W3C hat on; I'm speaking solely as an interested
>>> web developer who wants his apps to work in as many browsers as
>>> possible, and who's mostly using the TTS stuff.
>>>
>>> Regards-
>>> -Doug
>>>
>>>
>>>> Glen
>>>>
>>>>
>>>> On Tue, Oct 8, 2013 at 9:28 AM, Doug Schepers <schepers@w3.org
>>>> <mailto:schepers@w3.org>> wrote:
>>>>
>>>> Hi, folks–
>>>>
>>>> I'd like to propose that the text-to-speech feature be split
>>>> out from the Web Speech API spec; it's more or less orthogonal
>>>> with the speech recognition aspect of the spec, and while there
>>>> are still open issues that are being discussed, I think it's
>>>> more stable in terms of implementations, and could move forward
>>>> more quickly on its own.
>>>>
>>>> I have been using both TTS and speech recognition in some of
>>>> my recent apps, and I think both are very cool and useful; I
>>>> think both will be great for accessibility, as well. TTS is
>>>> much simpler, though, and I think we could get more
>>>> implementations right away if we split it out. I really want to
>>>> see both succeed, at their own pace.
>>>>
>>>> (As an aside, I made a "talking calculator" back in 2004 using
>>>> SVG and the Microsoft IE TTS API; it no longer works, but it
>>>> hints to me that it wouldn't be too hard for Microsoft to
>>>> implement the more modern TTS functionality in IE, if the path
>>>> ahead were clear for them.)
>>>>
>>>> In light of the recent news that the W3C Web Speech WG is not
>>>> going to be formed [1], I think the work should still be done
>>>> in the Web Speech Community Group, though maybe when it's
>>>> mature enough, it could move to an existing W3C WG to become a
>>>> Recommendation.
>>>>
>>>> (I don't have a strong feeling about which group this might
>>>> fit in, but a few spring to mind: the WebApps WG, the Audio WG,
>>>> or the HTML WG to take advantage of the new CC-BY licensing
>>>> being experimented on there. It could even be its own WG,
>>>> though that seems like overkill to me.)
>>>>
>>>> If any of this resonates with this group, I'm happy to help
>>>> with it unofficially, with my W3C staff experience. (If it
>>>> were ultimately moved into the Audio WG, then I could give my
>>>> official help, since that's one of my working groups. :P)
>>>>
>>>> [1] http://lists.w3.org/Archives/Public/public-new-work/2013Oct/0004.html
>>>>
>>>> Regards-
>>>> -Doug
>>>>
>>>>
>>>
>>>
>>>
>>
>
Received on Wednesday, 11 December 2013 08:49:29 UTC