RE: Split TTS and Speech Recognition? from Young, Milan on 2013-12-11 (public-speech-api@w3.org from December 2013)

From: Young, Milan <Milan.Young@nuance.com>
Date: Wed, 11 Dec 2013 17:50:46 +0000
To: Doug Schepers <schepers@w3.org>
CC: "Raj (Openstream)" <raj@openstream.com>, Glen Shires <gshires@google.com>, Web Speech <public-speech-api@w3.org>, "Bjorn Bringert (bringert@google.com)" <bringert@google.com>
Message-ID: <B236B24082A4094A85003E8FFB8DDC3C20CAEBDF@SOM-EXCH03.nuance.com>
Hello Doug,

Sorry my comment about a "unified spec" was unclear.  I meant a standards-track spec, in contrast to the one currently being developed in this CG.

On the main topic of splitting, I don't have a strong opinion.  If there are advantages to keeping the specs together, they are surely dwarfed by the greater advantages in having all browser vendors participate in a common (I'll avoid "unified") forum.

Much of what defeated the previous WG charter was the presence of this CG.  Browser vendors were, perhaps wisely, not willing to enter into another standards battle.  To them, the notion of competing standards meant the space was not yet ready for investment of their resources.  I believe maintaining this CG's commitment to transition to a WG would be sufficient to attract these missing vendors.  This will surely dilute some of Google's control over the standard, but I believe the enhanced visibility would more than offset these losses.

Regards


> From: Doug Schepers [mailto:schepers@w3.org]
> Hey, Milan–
> 
> On 12/9/13 3:41 PM, Young, Milan wrote:
> > Please excuse the late response.  I have not been actively monitoring
> > this list for some time.
> >
> > Contrary to Glen's assertion, I believe a unified spec would indeed
> > accelerate implementation.
> 
> Just to be clear, I was suggesting that we should split the specs; Glen favored
> keeping them unified.
> 
> I've heard rumors that TTS is landing in Chrome; I don't know about ASR.
> If TTS is moving faster in implementations, I still think it makes sense to split
> them.
> 
> But again, I'm just putting out a trial balloon to see if there's support for the
> notion. I don't have strong suggestions about where such a spec would land.
> 
> Regards-
> -Doug
> 
> > Speaking for Nuance, a global leader in the field of both recognition
> > and TTS, we would gladly begin implementation if the spec were
> > sanctioned under a WG.  Splitting recognition from TTS on a temporary
> > or even permanent basis seems like a small price to pay for this
> > greater good.
> >
> > Regards
> >
> >
> >> -----Original Message-----
> >> From: Raj (Openstream) [mailto:raj@openstream.com]
> >> Sent: Wednesday, October 09, 2013 4:49 AM
> >> To: Doug Schepers; Glen Shires
> >> Cc: Web Speech
> >> Subject: Re: Split TTS and Speech Recognition?
> >>
> >> Speaking from my vantage position, I find both the arguments
> >> plausible, recognizing that more work needs to be done before the
> >> current artifacts become SPECs.
> >>
> >> To GLEN's point, implementors can still implement part of the SPEC (
> >> and it could be just TTS).. and yes, there are plenty of use-cases (
> >> again for a web developer) for just using TTS in the apps.
> >>
> >> It's not clear to me, how and why keeping them in "SYNCH" would be a
> >> better thing to do..( aside from the convenience of reading one spec
> >> as opposed to two)...and at the same time, not sure how splitting
> >> them into two, would make it more attractive/likely for any other
> >> group to absorb...
> >>
> >> IMHO, implementors can take any portion of any spec and conform to
> >> the extent of their capability and desire... and so can WGs..
> >>
> >> But, yes, it'll continue to be frustrating that we have so many
> >> "SPECs" that are not standards from a developers'/implementors'
> >> point of view.
> >>
> >> Raj
> >>
> >> On Wed, 09 Oct 2013 05:25:10 +0200 Doug Schepers <schepers@w3.org>
> >> wrote:
> >>> Hi, Glen–
> >>>
> >>> I'm not trying to be pesky about this, and I'm not going to get
> >>> pushy. But I'd like you to reconsider this, and I'd like to hear
> >>> from others what they think (especially implementers).
> >>>
> >>>
> >>> On 10/8/13 8:40 PM, Glen Shires wrote:
> >>>> A unified spec hasn't slowed implementations, as there are
> >>>> currently browsers that implement the ASR portion and not the TTS
> >>>> portion, and browsers that implement the TTS portion and not the
> >>>> ASR portion.
> >>>
> >>> This would seem to be an argument for splitting them up, not keeping
> >>> them together. They are moving at different rates.
> >>>
> >>>
> >>>> (And speech aside, there are many examples where implementors
> >>>> implement a spec in parts.)
> >>>
> >>> Yes, but this is not good for web developers. It's to be avoided, if
> >>> possible. With my web developer hat on, this is really frustrating.
> >>> This is why CSS took a more modular approach, which is working
> >>> pretty well in terms of consistency and interoperability.
> >>>
> >>>
> >>>> Also, keeping TTS and ASR together avoids the problem of having to
> >>>> sync  things up in the future.
> >>>
> >>> Speaking from a position of ignorance and curiosity, what things
> >>> need to be synced up between TTS and ASR? They seem pretty
> >>> orthogonal from my reading of the spec.
> >>>
> >>>
> >>>> As the unified spec matures, it may have a  better chance of
> >>>> finding a unified home in one of the major W3C groups,  such as
> >>>> HTML.
> >>>
> >>> I'm not sure I follow your reasoning there. Why would a single spec
> >>> have a better chance of being adopted by a WG than 2 smaller specs?
> >>>
> >>>
> >>> Is there some concern that one would get implemented, and not the
> >>> other, so keeping them together might incent implementers to do
> >>> both?
> >>>
> >>>
> >>> Finally, I just want to be clear that this request is not me
> >>> speaking with my W3C hat on; I'm speaking solely as an interested
> >>> web developer who wants his apps to work in as many browsers as
> >>> possible, and who's mostly using the TTS stuff.
> >>>
> >>> Regards-
> >>> -Doug
> >>>
> >>>
> >>>> Glen
> >>>>
> >>>>
> >>>> On Tue, Oct 8, 2013 at 9:28 AM, Doug Schepers <schepers@w3.org
> >>>> <mailto:schepers@w3.org>> wrote:
> >>>>
> >>>> Hi, folks–
> >>>>
> >>>> I'd like to propose that the text-to-speech feature be split out
> >>>> from the Web Speech API spec; it's more or less orthogonal with the
> >>>> speech recognition aspect of the spec, and while there are still
> >>>> open issues that are being discussed, I think it's more stable in
> >>>> terms of implementations, and could move forward more quickly on
> >>>> its own.
> >>>>
> >>>> I have been using both TTS and speech recognition in some of my
> >>>> recent apps, and I think both are very cool and useful; I think
> >>>> both will be great for accessibility, as well. TTS is much simpler,
> >>>> though, and I think we could get more implementations right away if
> >>>> we split it out. I really want to see both succeed, at their own
> >>>> pace.
> >>>>
> >>>> (As an aside, I made a "talking calculator" back in 2004 using SVG
> >>>> and the Microsoft IE TTS API; it no longer works, but it hints to
> >>>> me that it wouldn't be too hard for Microsoft to implement the more
> >>>> modern TTS functionality in IE, if the path ahead were clear for
> >>>> them.)
> >>>>
> >>>> In light of the recent news that the W3C Web Speech WG is not going
> >>>> to be formed [1], I think the work should still be done in the Web
> >>>> Speech Community Group, though maybe when it's mature enough, it
> >>>> could move to an existing W3C WG to become a Recommendation.
> >>>>
> >>>> (I don't have a strong feeling about which group this might fit in,
> >>>> but a few spring to mind: the WebApps WG, the Audio WG, or the HTML
> >>>> WG to take advantage of the new CC-BY licensing being experimented
> >>>> on there. It could even be its own WG, though that seems like
> >>>> overkill to me.)
> >>>>
> >>>> If any of this resonates with this group, I'm happy to help with it
> >>>> unofficially, with my W3C staff experience. (If it were ultimately
> >>>> moved into the Audio WG, then I could give my official help, since
> >>>> that's one of my working groups. :P)
> >>>>
> >>>> [1] http://lists.w3.org/Archives/Public/public-new-work/2013Oct/0004.html
> >>>>
> >>>> Regards-
> >>>> -Doug
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
>
Received on Wednesday, 11 December 2013 17:51:16 UTC