- From: Glen Shires <gshires@google.com>
- Date: Thu, 30 Aug 2012 17:51:13 -0700
- To: Satish S <satish@google.com>
- Cc: "Young, Milan" <Milan.Young@nuance.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
- Message-ID: <CAEE5bcj0KjfDkO42VsMpU-Jb7QE2fZfMyrjLXV+netqL15CB4g@mail.gmail.com>
If this is an optional flag that we add in the future, I strongly believe the default should be true. (That is, until we add this feature, proper whitespace must be inserted by the speech recognizer.)

If a user is searching for consecutive keywords such as "peanut butter", there is no guarantee that they will be returned in the same final result. For example:

result[0].transcript = "I'd like a peanut"
result[1].transcript = "butter sandwich."

While there are various algorithms that might be used to find the consecutive keywords, perhaps the easiest is to concatenate the results with a space inserted between them, and search for "peanut butter". So for this use case, it would be simpler if the speech recognizer had returned the results with proper whitespace.

But frankly, I think the complexity of writing a JavaScript algorithm that knows how to insert proper whitespace, and that works across a wide variety of international languages, far outweighs any minor simplification of scanning for keywords by ignoring leading/trailing whitespace.

I believe there will be many applications that do use dictation to generate emails, documents, product reviews, etc. So I believe we must ensure that authoring a dictation app is no more difficult than it needs to be.

/Glen Shires

On Thu, Aug 30, 2012 at 4:04 PM, Satish S <satish@google.com> wrote:

> Stripping whitespace is something that almost every app that doesn't use
> the API for dictation would need. To me this looks like an optional
> feature, something which gets turned on based on a flag such as
> "SpeechRecognition.autoWhiteSpace" that the developer would set if they
> want it.. and as such it could be added in a future revision of the API if
> we see developers asking for it.
>
> Cheers
> Satish
>
> On Thu, Aug 30, 2012 at 9:48 PM, Glen Shires <gshires@google.com> wrote:
>
>> Inserting whitespace is non-trivial, particularly when considering
>> punctuation and internationalization. Some punctuation is placed before the
>> whitespace, others after. Some languages don't use whitespace. I'd prefer
>> to avoid placing this burden on the JavaScript author. Speech recognition
>> engines already contain this logic.
>>
>> Conversely, stripping leading and trailing whitespace is trivial, as is
>> writing a comparison routine that ignores whitespace.
>>
>> On Thu, Aug 30, 2012 at 1:35 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>>
>>> I prefer Satish’s suggestion. If the web author needs to concatenate,
>>> sandwiching in some whitespace seems like a trivial adjustment.
>>>
>>> From: Satish S [mailto:satish@google.com]
>>> Sent: Thursday, August 30, 2012 1:28 PM
>>> To: Glen Shires
>>> Cc: public-speech-api@w3.org
>>> Subject: Re: Concatenating transcript results
>>>
>>> We could also say the transcript should not include leading or trailing
>>> spaces, so the web app should always use a whitespace if it needs to
>>> concatenate. This would work better for apps that check the transcript
>>> with known words (e.g. command and control) instead of having to
>>> append/prepend whitespaces to their string literals. Also depending on the
>>> language of the recognized text whitespace may not be appropriate (e.g. CJK
>>> don't use white spaces).
>>>
>>> Cheers
>>> Satish
>>>
>>> On Thu, Aug 30, 2012 at 6:11 PM, Glen Shires <gshires@google.com> wrote:
>>>
>>> If there's no disagreement by the end of the week I'll add it to the
>>> spec...
>>>
>>> On Wed, Aug 29, 2012 at 9:36 AM, Glen Shires <gshires@google.com> wrote:
>>>
>>> I propose adding the following sentence to the definition
>>> of SpeechRecognitionAlternative.transcript to make it clear that a
>>> JavaScript author can simply concatenate SpeechRecognitionResults without
>>> the author having to worry about where/when to add whitespace.
>>>
>>> "For continuous recognition, whitespace MUST be included in the
>>> transcript, including leading or trailing whitespace, as necessary such
>>> that concatenation of consecutive SpeechRecognitionResults produces a
>>> proper transcript of the session."
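
For illustration only, here is a minimal sketch of the "peanut butter" keyword search discussed in this thread, under both conventions. The helper names (joinWithRecognizerSpacing, joinWithSingleSpace, containsPhrase) are hypothetical and not part of any API, and the results are plain objects with a transcript string standing in for the top alternative of each final result. The single-space join assumes a space-delimited language, which is exactly the limitation raised above for CJK text.

```javascript
// Hypothetical stand-ins for the top alternative of each final result.
// "stripped": recognizer returns no leading/trailing whitespace.
// "spaced": recognizer includes the whitespace needed for concatenation.
const stripped = [
  { transcript: "I'd like a peanut" },
  { transcript: "butter sandwich." },
];
const spaced = [
  { transcript: "I'd like a peanut" },
  { transcript: " butter sandwich." },
];

// Convention proposed for the spec: the author concatenates as-is.
function joinWithRecognizerSpacing(results) {
  return results.map((r) => r.transcript).join("");
}

// Alternative convention: the author supplies the separator.
// Assumption: a single space is only correct for space-delimited languages.
function joinWithSingleSpace(results) {
  return results
    .map((r) => r.transcript.trim())
    .filter((t) => t.length > 0)
    .join(" ");
}

// Whitespace-insensitive, case-insensitive phrase search.
function containsPhrase(text, phrase) {
  const normalize = (s) => s.toLowerCase().replace(/\s+/g, " ").trim();
  return normalize(text).includes(normalize(phrase));
}

const fromSpaced = joinWithRecognizerSpacing(spaced);
const fromStripped = joinWithSingleSpace(stripped);
const naive = joinWithRecognizerSpacing(stripped);

console.log(containsPhrase(fromSpaced, "peanut butter"));
console.log(containsPhrase(fromStripped, "peanut butter"));
console.log(containsPhrase(naive, "peanut butter"));
```

Concatenating stripped results without a separator fuses "peanut" and "butter" into one token, so the phrase search fails in that case. Which side should carry the responsibility for spacing is the question the thread is debating; the sketch does not settle it.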
Received on Friday, 31 August 2012 00:52:21 UTC