Review of MediaStreamTrack Content Hints

Hi all,

I was asked to review the MediaStreamTrack Content Hints draft specification,
at https://www.w3.org/TR/mst-content-hint/.

This is an API specification that was created with a focus on the
end-users' experience: "Adding a media-content hint provides a way for a
web application to help track consumers make more informed decision[s]...."
The specification did not seem to be aware that these consumers may be
relying on AT.

Below is my initial response.

*The content hint attributes defined in the specification will benefit
consumers who rely on assistive technology (AT) and personalization.* Content
authors may author content hints with AT in mind. In addition, User Agents
make an accessibility tree available to assistive technology via API, and
User Agents are personalized to the needs of the user. These
personalizations will continue to be extended in future, for example to
support COGA, Personalization and other WG requirements. We expect such
personalizations to proliferate, and they will find content hints very
relevant.

In sum, content authors can author contentHint
 with the experience of AT users in mind, or UAs acting on behalf of
users.  The specification's introduction would be a good place to
clarify this as a further benefit of content hints.

*The specification makes no mention of hints regarding support files* (captions,
audio descriptions) that often accompany media content, either linked to it
externally in HTML (using the <track> element) or furnished 'in-band',
e.g., contained within the .MP4 wrapper (HasCaptions: T/F,
HasAudioDescription: T/F). If either is true, then it needs to be
exposed in the UI, essentially as an 'active' button in the Controls. Did the
WG consider whether hints could also usefully convey whether the media
content has such supporting files?
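
To make the two flags concrete, here is a minimal TypeScript sketch of how
they might be derived from a media element's tracks. The TrackLike type and
the supportFileFlags function are hypothetical names of mine, not from the
specification; the kind values ("captions" and "descriptions" for text
tracks, "descriptions" and "main-desc" for audio tracks) are the ones HTML
defines.

```typescript
// Hypothetical minimal track shape. In a browser these lists would come
// from HTMLMediaElement.textTracks and HTMLMediaElement.audioTracks.
interface TrackLike {
  kind: string;
}

// Hypothetical result type mirroring HasCaptions / HasAudioDescription.
interface SupportFileFlags {
  hasCaptions: boolean;
  hasAudioDescription: boolean;
}

function supportFileFlags(
  textTracks: TrackLike[],
  audioTracks: TrackLike[],
): SupportFileFlags {
  // Only the "captions" kind is counted here; whether "subtitles" should
  // also count is a policy decision left open in this sketch.
  const hasCaptions = textTracks.some((t) => t.kind === "captions");

  // Descriptions may arrive as a text track or as an audio track.
  const hasAudioDescription =
    textTracks.some((t) => t.kind === "descriptions") ||
    audioTracks.some(
      (t) => t.kind === "descriptions" || t.kind === "main-desc",
    );

  return { hasCaptions, hasAudioDescription };
}
```

A UI could use the two booleans to decide whether the caption and
audio-description buttons in the Controls are shown as active.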

Now we turn to some detailed feedback:

*Regarding Section 4. The specification does not address audio and video
formats that are often encountered with content that has been made
accessible.* For clarity, I propose additional hints accordingly:

*For Audio, an additional hint to indicate the presence of
audio-description* (or some similar label as you find appropriate).
Audio-description resembles "speech" audio, but it is not intended for
speech recognition by a machine, and it will likely not be appropriate to
apply noise suppression or boost intelligibility of the incoming signal.

In the language of the specification (4.1): A track with content hint
"audio-description" should be treated as if it contains audio data, without
background noise, describing in words the activity in the video.
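
As a rough illustration of how an application might adopt such a hint
safely, here is a minimal TypeScript sketch. It assumes, as I understand
the specification, that setting contentHint to a value the user agent does
not support for the track's kind is ignored, so reading the attribute back
acts as feature detection. FakeAudioTrack and applyHintWithFallback are
hypothetical names, and "audio-description" is the proposed hint, not an
existing one.

```typescript
// Hypothetical stand-in for an audio MediaStreamTrack, so the sketch can
// run outside a browser. It only models the contentHint attribute.
class FakeAudioTrack {
  readonly kind = "audio";
  private hint = "";
  private supported: string[];

  constructor(supported: string[]) {
    this.supported = supported;
  }

  get contentHint(): string {
    return this.hint;
  }

  set contentHint(value: string) {
    // Assumed behavior: unsupported values are silently ignored.
    if (value === "" || this.supported.indexOf(value) !== -1) {
      this.hint = value;
    }
  }
}

// Try the preferred hint; if it did not stick, use the fallback instead.
function applyHintWithFallback(
  track: { contentHint: string },
  preferred: string,
  fallback: string,
): string {
  track.contentHint = preferred;
  if (track.contentHint !== preferred) {
    track.contentHint = fallback;
  }
  return track.contentHint;
}
```

With a real track, a call such as applyHintWithFallback(track,
"audio-description", "speech") would leave today's user agents on "speech"
and pick up the new hint wherever it is implemented.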


*For Video, an additional hint to indicate the presence of transcription
embedded in the video*, e.g., motion-with-transcription (or some similar
label as you find appropriate). motion-with-transcription would refer to a
motion video with embedded transcription data: either a picture-in-picture
showing a sign language interpreter, or text captions embedded in the video.

In the language of the specification (4.2): A content hint of
motion-with-transcription should be treated such that one region of the
video frame has details that are extra important, and in that region
significant sharp edges and areas of consistent color can occur frequently
(the area with sign language interpretation, or the area with onscreen
captioned text). Encoding of this region should optimize for detail in the
resulting individual frames rather than smooth playback. Artefacts from
quantization or downscaling should be avoided.


Regarding Section 5. In 5.2 "Degradation preference when encoding," the
specification addresses the choices encoders make, but no mention is made
of regions. For example, in AVC, "it is also possible to create truly
lossless-coded regions within lossy-coded pictures." Picture regions may be
very significant for accessibility. Consider a video with sign language
interpretation embedded (e.g., in the upper right corner), or a video with
captions embedded (e.g., in the bottom of the picture area). These regions
would benefit from different encoding decisions than the rest of the frame. *We
would find it useful and supportive of accessible content to make this
information available as an RTCDegradationPreference.*
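
To illustrate what region-level information could look like, here is a
minimal TypeScript sketch. Everything in it is hypothetical: the
RegionPreference shape and the preferenceAt function are my own invention
for discussion and are not part of any specification. Only the three
preference strings are existing RTCDegradationPreference values.

```typescript
// Existing RTCDegradationPreference values.
type DegradationPreference =
  | "maintain-framerate"
  | "maintain-resolution"
  | "balanced";

// Hypothetical: a rectangle in normalized frame coordinates (0..1, origin
// at the top-left) carrying its own degradation preference.
interface RegionPreference {
  x: number;
  y: number;
  width: number;
  height: number;
  preference: DegradationPreference;
}

// Return the preference that applies at a point in the frame: the first
// matching region wins, otherwise the frame-wide preference is used.
function preferenceAt(
  px: number,
  py: number,
  regions: RegionPreference[],
  framePreference: DegradationPreference,
): DegradationPreference {
  for (const r of regions) {
    const insideX = px >= r.x && px < r.x + r.width;
    const insideY = py >= r.y && py < r.y + r.height;
    if (insideX && insideY) {
      return r.preference;
    }
  }
  return framePreference;
}

// Example: a sign language interpreter inset in the upper right corner
// keeps detail, while the rest of the frame favors smooth motion.
const signLanguageInset: RegionPreference = {
  x: 0.75,
  y: 0,
  width: 0.25,
  height: 0.25,
  preference: "maintain-resolution",
};
```

An encoder that supports region-based coding could consult such a
structure when deciding where to spend bits under constrained bandwidth.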

Lastly, considering the User Agent and the DOM: Should this specification
include an entreaty to the browser manufacturers to make the hint available
in the DOM and API so as to pass on this content-hint to the accessibility
tree? If so, I would recommend a section something like, "*5.5 Behavior of
the User Agent: The user agent MUST make the contentHint available in the
DOM and the accessibility tree, to ensure the contentHint is available to
assistive technologies."*

P.S. The editors should fix these two typos:

   - Abstract: change "make more informed decision" to either "make a more
   informed decision" or "make more informed decisions"
   - Section 2: change "they appear" to "it appears"


*Thanks to John F for his help and support. Some of the wording regarding
support files is his.*

Looking forward to today's meeting,

Lionel






Lionel Wolberger
COO, UserWay Inc.
lionel@userway.org
UserWay.org <http://userway.org/>

Received on Wednesday, 2 June 2021 09:04:04 UTC