RE: Speech Synthesis - Length parameter from Jerry Smith (WPT) on 2016-11-14 (public-speech-api@w3.org from November 2016)

From: Jerry Smith (WPT) <jdsmith@microsoft.com>
Date: Mon, 14 Nov 2016 20:59:34 +0000
To: Glen Shires <gshires@google.com>
CC: Eitan Isaacson <eisaacson@mozilla.com>, Dominic Mazzoni <dmazzoni@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <SN2PR03MB0470A925596DE5FADD5A73FA4BC0@SN2PR03MB047.namprd03.prod.outlook.com>
LGTM

-----Original Message-----
From: Glen Shires [mailto:gshires@google.com] 
Sent: Monday, November 14, 2016 12:41 PM
To: Jerry Smith (WPT) <jdsmith@microsoft.com>
Cc: Eitan Isaacson <eisaacson@mozilla.com>; Dominic Mazzoni <dmazzoni@google.com>; public-speech-api@w3.org
Subject: Re: Speech Synthesis - Length parameter

Here's a re-wording of Jerry's proposed errata in an attempt to add a bit more clarity.  I welcome your feedback:



Section 5.2 IDL: Add to SpeechSynthesisEvent:

        readonly attribute unsigned long charLength;

Section 5.2.5 SpeechSynthesisEvent Attributes: Add:

charLength attribute
This attribute indicates the length of the text (word or sentence) that will be spoken corresponding to this event. This attribute is the length, in characters, starting from this event's charIndex.  The user agent must return this value if the speech synthesis engine supports it or the user agent can otherwise determine it; otherwise, the user agent must return undefined.
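
For illustration only (this is not part of the proposed errata text): a minimal sketch of how a page might combine charIndex with the proposed charLength to pick the span of text to highlight. The BoundaryInfo interface and the highlightRange helper are hypothetical names, and the whitespace-scanning fallback used when charLength is undefined is an assumption, not something the proposal specifies.

```typescript
// Hypothetical shape: only the two event fields that matter for highlighting.
// charLength is the attribute proposed above; it may be undefined when the
// engine cannot report it.
interface BoundaryInfo {
  charIndex: number;
  charLength?: number;
}

// Returns the [start, end) range of `text` to highlight for one boundary event.
function highlightRange(
  text: string,
  ev: BoundaryInfo
): { start: number; end: number } {
  // Clamp the start so a stray index never runs outside the text.
  const start = Math.min(Math.max(ev.charIndex, 0), text.length);
  if (ev.charLength !== undefined) {
    return { start, end: Math.min(start + ev.charLength, text.length) };
  }
  // Fallback (an assumption, not in the proposal): scan to the next whitespace.
  const offset = text.slice(start).search(/\s/);
  return { start, end: offset === -1 ? text.length : start + offset };
}

const sample = "Hello brave new world";
const withLength = highlightRange(sample, { charIndex: 6, charLength: 5 });
const withoutLength = highlightRange(sample, { charIndex: 12 });
console.log(sample.slice(withLength.start, withLength.end));
console.log(sample.slice(withoutLength.start, withoutLength.end));
```

In a browser, a helper like this would presumably be called from an utterance's boundary event handler, passing the event's charIndex and (where available) charLength.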

On Fri, Nov 11, 2016 at 1:41 PM, Jerry Smith (WPT) <jdsmith@microsoft.com> wrote:
> That’s good to hear!
>
>
>
> I’m thinking this is the change:
>
>
>
> Add charLength to SpeechSynthesisEvent Attributes:
>
> 5.2.5 SpeechSynthesisEvent Attributes
>
> charLength attribute
>
> This attribute indicates the length of the text word or sentence, in 
> characters, starting from the current charIndex in the audio playback.  
> The user agent must return this value if the speech synthesis engine 
> supports it or the user agent can otherwise determine it; otherwise, 
> the user agent must return undefined.
>
>
>
> Jerry
>
>
>
> From: Eitan Isaacson [mailto:eisaacson@mozilla.com]
> Sent: Friday, November 11, 2016 1:05 PM
> To: Glen Shires <gshires@google.com>
> Cc: Dominic Mazzoni <dmazzoni@google.com>; Jerry Smith (WPT) 
> <jdsmith@microsoft.com>; public-speech-api@w3.org
> Subject: Re: Speech Synthesis - Length parameter
>
>
>
> Jerry, you beat me to it. I am willing to implement this in Firefox.
>
>
>
> On Fri, Nov 11, 2016 at 12:16 PM, Glen Shires <gshires@google.com> wrote:
>
> Yes, this is the proper place to discuss potential changes / errata to [1].
>
> Thank you for the proposal. We welcome others to comment on it on this 
> mailing list.
>
> The next step would be for someone to propose specific wording for an 
> errata item in the format of [2].
>
> Then after allowing several weeks for all to review / comment, if 
> there's agreement, we can add it to the errata [2] and to the draft 
> with errata [1]
>
> [1] https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html
>
> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi-errata.html
>
>
>
> On Fri, Nov 11, 2016 at 8:23 AM, Dominic Mazzoni <dmazzoni@google.com>
> wrote:
>> I support 'length' in SpeechSynthesisEvent and I'd be willing to 
>> implement it in Chrome if there are no objections.
>>
>> On a larger note, it'd be great if we could revive discussions on 
>> this list and update the spec based on the errata.
>>
>> - Dominic
>>
>>
>> On Thu, Nov 10, 2016 at 3:59 PM Jerry Smith (WPT) 
>> <jdsmith@microsoft.com>
>> wrote:
>>>
>>> We’ve implemented speech synthesis in Edge on the Windows 10 
>>> Anniversary Update, and have been revising it lately to support the 
>>> word boundary features.  We have an internal partner that wants to 
>>> use these.  They’ve also requested we support word “length”, which 
>>> isn’t included in the community group Web Speech API Specification.  
>>> Knowing the length in addition to boundary makes it very simple to 
>>> highlight text while it is being spoken.  We already support this in 
>>> WinRT APIs, and would like to do the same on Edge.
>>>
>>> Our goal would be to receive equivalents to the following WinRT API 
>>> events for both paragraphs and words:
>>>
>>>
>>>
>>> Type              Name       Description
>>>
>>> TimeSpan          StartTime  Position in the audio stream.
>>>                              Required by IMediaCue.
>>>
>>> HSTRING           Text       Text of the bookmark.  For sentence and
>>>                              word boundary this can provide the text
>>>                              snippet from the original text.
>>>                              Note: We do not have a strong requirement
>>>                              to support text for word and sentence
>>>                              boundary markers.
>>>
>>> Nullable<UINT32>  Offset     Offset in the input text associated with
>>>                              the current position in the audio
>>>                              playback.
>>>                              This is not populated for SSML bookmarks.
>>>
>>> Nullable<UINT32>  Length     The length of the text starting from the
>>>                              Offset associated with the position in
>>>                              the audio playback.
>>>                              This is not populated for SSML bookmarks.
>>>
>>>
>>>
>>> The existing speech API spec has been around for a while.  Is there 
>>> a way to evaluate and process spec additions/edits?
>>>
>>>
>>>
>>> Glen:  I’d appreciate hearing your take on this suggestion.  The 
>>> Speech API community report dates to 2012.  Is there much interest 
>>> in revising it in other ways?
>>>
>>>
>>>
>>> Jerry Smith
>>>
>>> Microsoft – Web Platform Team
>
>
Received on Monday, 14 November 2016 21:00:12 UTC