Re: Speech Synthesis - Length parameter from Glen Shires on 2016-11-14 (public-speech-api@w3.org from November 2016)

From: Glen Shires <gshires@google.com>
Date: Mon, 14 Nov 2016 12:40:36 -0800
To: "Jerry Smith (WPT)" <jdsmith@microsoft.com>
Cc: Eitan Isaacson <eisaacson@mozilla.com>, Dominic Mazzoni <dmazzoni@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAEE5bchUi0Pg_Lmh1K2jBg3XiAgVTOqRGk5HzXnQdmfh25vwTQ@mail.gmail.com>
Here's a re-wording of Jerry's proposed errata in an attempt to
add a bit more clarity.  I welcome your feedback:



Section 5.2 IDL: Add to SpeechSynthesisEvent:

        readonly attribute unsigned long charLength;

Section 5.2.5 SpeechSynthesisEvent Attributes: Add:

charLength attribute
This attribute indicates the length, in characters, of the text (word
or sentence) that will be spoken for this event, measured from this
event's charIndex.  The user agent must return this value if the
speech synthesis engine supports it or if the user agent can otherwise
determine it; otherwise, the user agent must return undefined.
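
To illustrate how a page might consume charIndex together with the
proposed charLength, here is a minimal TypeScript sketch. It is not part
of the proposed errata. The BoundaryInfo and Span types and the
spokenSpan and highlight helpers are hypothetical names invented for this
example, and the whitespace fallback used when charLength is undefined
is an assumption, not something the proposal requires.

```typescript
// Hypothetical shape: only the fields of a boundary event this sketch needs.
// charLength is optional because the proposal lets the user agent return
// undefined when the length cannot be determined.
interface BoundaryInfo {
  charIndex: number;
  charLength?: number;
}

// Half-open range [start, end) into the utterance text.
interface Span {
  start: number;
  end: number;
}

// Compute the range of `text` that a boundary event refers to.
function spokenSpan(text: string, ev: BoundaryInfo): Span {
  // Clamp the start so a stray index cannot run past the text.
  const start = Math.min(Math.max(ev.charIndex, 0), text.length);
  if (ev.charLength !== undefined) {
    return { start, end: Math.min(start + ev.charLength, text.length) };
  }
  // Fallback (an assumption of this sketch): extend to the next whitespace.
  const offset = text.slice(start).search(/\s/);
  const end = offset === -1 ? text.length : start + offset;
  return { start, end };
}

// Wrap the spoken range in brackets, standing in for real highlighting.
function highlight(text: string, ev: BoundaryInfo): string {
  const { start, end } = spokenSpan(text, ev);
  return text.slice(0, start) + "[" + text.slice(start, end) + "]" + text.slice(end);
}
```

In a browser, the two values would presumably be read from the event
passed to an utterance's boundary handler, and the returned range used
to style the matching text instead of adding brackets.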

On Fri, Nov 11, 2016 at 1:41 PM, Jerry Smith (WPT)
<jdsmith@microsoft.com> wrote:
> That’s good to hear!
>
>
>
> I’m thinking this is the change:
>
>
>
> Add charLength to SpeechSynthesisEvent Attributes:
>
> 5.2.5 SpeechSynthesisEvent Attributes
>
> charLength attribute
>
> This attribute indicates the length of the text word or sentence, in
> characters, starting from the current charIndex in the audio playback.  The
> user agent must return this value if the speech synthesis engine supports it
> or the user agent can otherwise determine it, otherwise the user agent must
> return undefined.
>
>
>
> Jerry
>
>
>
> From: Eitan Isaacson [mailto:eisaacson@mozilla.com]
> Sent: Friday, November 11, 2016 1:05 PM
> To: Glen Shires <gshires@google.com>
> Cc: Dominic Mazzoni <dmazzoni@google.com>; Jerry Smith (WPT)
> <jdsmith@microsoft.com>; public-speech-api@w3.org
> Subject: Re: Speech Synthesis - Length parameter
>
>
>
> Jerry, you beat me to it. I am willing to implement this in Firefox.
>
>
>
> On Fri, Nov 11, 2016 at 12:16 PM, Glen Shires <gshires@google.com> wrote:
>
> Yes, this is the proper place to discuss potential changes / errata to [1].
>
> Thank you for the proposal. We welcome others to comment on it on this
> mailing list.
>
> The next step would be for someone to propose specific wording for an
> errata item in the format of [2].
>
> Then after allowing several weeks for all to review / comment, if
> there's agreement, we can add it to the errata [2] and to the draft
> with errata [1]
>
> [1] https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html
> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi-errata.html
>
>
>
> On Fri, Nov 11, 2016 at 8:23 AM, Dominic Mazzoni <dmazzoni@google.com>
> wrote:
>> I support 'length' in SpeechSynthesisEvent and I'd be willing to implement
>> it in Chrome if there are no objections.
>>
>> On a larger note, it'd be great if we could revive discussions on this
>> list and update the spec based on the errata.
>>
>> - Dominic
>>
>>
>> On Thu, Nov 10, 2016 at 3:59 PM Jerry Smith (WPT) <jdsmith@microsoft.com>
>> wrote:
>>>
>>> We’ve implemented speech synthesis in Edge on the Windows 10 Anniversary
>>> Update, and have been revising it lately to support the word boundary
>>> features.  We have an internal partner that wants to use these.  They’ve
>>> also requested we support word “length”, which isn’t included in the
>>> community group Web Speech API Specification.  Knowing the length in
>>> addition to boundary makes it very simple to highlight text while it is
>>> being spoken.  We already support this in WinRT APIs, and would like to
>>> do the same on Edge.
>>>
>>> Our goal would be to receive equivalents to the following WinRT API
>>> events for both paragraphs and words:
>>>
>>>
>>>
>>> TimeSpan              StartTime       Position in the audio stream
>>>                                       Required by IMediaCue
>>>
>>> HSTRING               Text            Text of the bookmark.  For sentence
>>>                                       and word boundary this can provide
>>>                                       the text snippet from the original
>>>                                       text.
>>>                                       Note: We do not have a strong
>>>                                       requirement to support text for word
>>>                                       and sentence boundary markers.
>>>
>>> Nullable<UINT32>      Offset          Offset in the input text associated
>>>                                       with the current position in the
>>>                                       audio playback.
>>>                                       This is not populated for SSML
>>>                                       bookmarks.
>>>
>>> Nullable<UINT32>      Length          The length of the text starting from
>>>                                       the Offset associated with the
>>>                                       position in the audio playback.
>>>                                       This is not populated for SSML
>>>                                       bookmarks.
>>>
>>>
>>>
>>> The existing speech API spec has been around for a while.  Is there a way
>>> to evaluate and process spec additions/edits?
>>>
>>>
>>>
>>> Glen:  I’d appreciate hearing your take on this suggestion.  The Speech
>>> API community report dates to 2012.  Is there much interest in revising
>>> it in other ways?
>>>
>>>
>>>
>>> Jerry Smith
>>>
>>> Microsoft – Web Platform Team
>
>
Received on Monday, 14 November 2016 20:42:54 UTC