Re: [w3ctag/design-reviews] Incubation: Web Speech API: On-Device Recognition Quality (Issue #1189) from Evan Liu on 2026-04-09 (public-webapps-github@w3.org from April 2026)

From: Evan Liu <notifications@github.com>
Date: Thu, 09 Apr 2026 09:26:12 -0700
To: w3ctag/design-reviews <design-reviews@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3ctag/design-reviews/issues/1189/4215795354@github.com>

evanbliu left a comment (w3ctag/design-reviews#1189)

Does anyone have any other comments on this issue? I've included responses to the security & privacy questionnaire below:

2.1. What information does this feature expose, and for what purposes?
Exposure: The feature exposes the availability of specific on-device speech recognition capabilities (categorized as 'command', 'dictation', or 'conversation') for a given language.
Purpose: Web developers can specify the semantic capability required for local, on-device speech recognition (when processLocally: true is set). This lets the underlying engine optimize performance, accuracy, and power consumption for the specific task the user is performing.
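
As a rough illustration of the shape of such a request, here is a minimal TypeScript sketch. The property name `quality`, the type `RecognitionQuality`, and the helper `buildLocalRequest` are assumptions made for this example and are not taken from the spec text; only the three values and `processLocally: true` come from the answers above.

```typescript
// Hypothetical names: `quality`, `RecognitionQuality`, and `buildLocalRequest`
// are illustrative only and are not defined by the Web Speech API spec text quoted here.
type RecognitionQuality = "command" | "dictation" | "conversation";

interface LocalRecognitionRequest {
  langs: string[];
  processLocally: true; // the quality hint only applies to on-device recognition
  quality: RecognitionQuality;
}

const QUALITIES: readonly string[] = ["command", "dictation", "conversation"];

// Type guard: narrows an arbitrary string to one of the three predefined values.
function isRecognitionQuality(value: string): value is RecognitionQuality {
  return QUALITIES.indexOf(value) !== -1;
}

// Builds a request object, rejecting anything outside the predefined enum.
function buildLocalRequest(lang: string, quality: string): LocalRecognitionRequest {
  if (!isRecognitionQuality(quality)) {
    throw new TypeError(`Unknown recognition quality: ${quality}`);
  }
  return { langs: [lang], processLocally: true, quality };
}
```

In a browser, an object like this would presumably be handed to the availability check or copied onto a recognition instance; the exact attachment point is defined by the proposal and is not reproduced here.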

2.2. Do features in your specification expose the minimum amount of information necessary to implement the intended functionality?
Yes. The proposal restricts the exposure to a simple, predefined enum of three distinct quality levels. It does not expose granular details about the user's specific hardware, the exact machine learning models installed, or the underlying operating system's native speech APIs.

2.6. Do the features in your specification expose information about the underlying platform to origins?
This API does not introduce on-device speech recognition itself (which is already part of the existing spec). The fingerprinting concerns associated with model availability are addressed via mitigations detailed in [WebAudio/web-speech-api#165](https://github.com/WebAudio/web-speech-api/pull/165). These countermeasures are modeled after the [Writing Assistance APIs](https://webmachinelearning.github.io/writing-assistance-apis/), which typically mitigate fingerprinting either by downloading models on demand (rather than revealing pre-installed state) or by standardizing the availability of core models to reduce entropy.

2.7. Does this specification allow an origin to send data to the underlying platform?
Yes. The proposal allows an origin to pass a specific quality constraint through the browser to the underlying platform's local speech recognition engine to configure how the audio stream is processed.

2.8. Do features in this specification enable access to device sensors?
Yes (Inherited). While this specific proposal only adds an options property, the underlying Web Speech API intrinsically requires access to the device's microphone. This proposal relies entirely on the existing permissions model, user prompts, and security indicators currently established for microphone access in the browser. It does not introduce new sensor access mechanisms.

2.13. How does this specification distinguish between behavior in first-party and third-party contexts?
The proposal does not explicitly introduce new behaviors for third-party contexts. However, like the broader Web Speech API, microphone access (and therefore the ability to use this feature) should be governed by Permissions Policy. Third-party iframes would require explicit delegation (e.g., allow="microphone") from the first-party context to utilize speech recognition at any quality level.
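
To make the delegation check concrete, here is a simplified TypeScript sketch that inspects an iframe's `allow` attribute string. The helper `delegatesFeature` is hypothetical and is not a full Permissions Policy implementation: it only checks whether the feature is named with a non-`'none'` allowlist and does not perform origin matching.

```typescript
// Hypothetical helper: a simplified reading of an iframe `allow` attribute.
// It does not match origins against the allowlist; it only reports whether
// the feature is mentioned and not explicitly set to 'none'.
function delegatesFeature(allowAttr: string, feature: string): boolean {
  for (const directive of allowAttr.split(";")) {
    const tokens = directive
      .trim()
      .split(/\s+/)
      .filter((t) => t.length > 0);
    if (tokens.length === 0 || tokens[0] !== feature) {
      continue; // empty directive or a different feature
    }
    // A bare feature name falls back to the default allowlist for `allow`,
    // while an explicit 'none' disables the feature for the frame.
    const allowlist = tokens.slice(1);
    return !(allowlist.length === 1 && allowlist[0] === "'none'");
  }
  // Feature not mentioned at all: treated here as not delegated.
  return false;
}
```

For example, `delegatesFeature("camera; microphone", "microphone")` reports delegation, while an attribute that lists only `camera` does not.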

2.14. How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode?
The API should function similarly to standard browsing, provided the user grants microphone permissions. However, to prevent cross-session tracking, browsers may need to apply stricter model-download heuristics in Private Browsing. For example, if a specific 'dictation' model for a rare language is downloaded during an Incognito session, the browser must ensure that the availability of this newly cached model is not exposed to subsequent standard browsing sessions, and vice versa, to prevent linking the two profiles.
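
One way to picture this isolation is a record of model availability that is partitioned by session type. The TypeScript sketch below is an illustration under stated assumptions, not a description of any browser's implementation: `SessionKind` and `ModelAvailabilityStore` are hypothetical names, and a real browser might share model files on disk while partitioning only what it reports to pages.

```typescript
// Hypothetical names throughout; this models only what is *reported* as
// available, not where model files are physically stored.
type SessionKind = "standard" | "private";

class ModelAvailabilityStore {
  private readonly reported = new Map<SessionKind, Set<string>>();

  private static key(lang: string, quality: string): string {
    return `${lang}/${quality}`;
  }

  // Record that a model was downloaded during the given kind of session.
  recordDownload(session: SessionKind, lang: string, quality: string): void {
    let models = this.reported.get(session);
    if (models === undefined) {
      models = new Set<string>();
      this.reported.set(session, models);
    }
    models.add(ModelAvailabilityStore.key(lang, quality));
  }

  // Availability is answered only from the caller's own partition, so a
  // download in one kind of session is invisible to the other.
  isAvailable(session: SessionKind, lang: string, quality: string): boolean {
    const models = this.reported.get(session);
    return models !== undefined && models.has(ModelAvailabilityStore.key(lang, quality));
  }

  // Forget everything reported for a session kind, e.g. when the last
  // private window closes.
  endSession(session: SessionKind): void {
    this.reported.delete(session);
  }
}
```

With this partitioning, a 'dictation' model recorded under "private" is never reported as available to "standard" sessions, and vice versa.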

--
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/1189#issuecomment-4215795354

Received on Thursday, 9 April 2026 16:26:16 UTC