Re: EME impact on accessibility from Alastair Campbell on 2017-04-05 (w3c-wai-gl@w3.org from April to June 2017)

From: Alastair Campbell <acampbell@nomensa.com>
Date: Wed, 5 Apr 2017 22:35:42 +0000
To: Judy Brewer <jbrewer@w3.org>, John Foliot <john.foliot@deque.com>
CC: WCAG <w3c-wai-gl@w3.org>, Jonathan Avila <jon.avila@ssbbartgroup.com>
Message-ID: <HE1PR0901MB14689C56DF5FB735433326B3B90A0@HE1PR0901MB1468.eurprd09.prod.outlook.>
Hi Judy,

Thanks for that, I look forward to reading the summary, I'm glad I held off on the AC vote.

John, 
Also a thank you for the detailed explanations, it seems my hunches were mostly correct but I've bookmarked that as I'm sure I'll refer to it again! 

You asked if I was aware of instances where people had been blocked from auto-generating captions. Not really, but that is a fairly new technology, and it is the start-up prevention aspect that would bother me.

e.g. A hypothetical plucky startup 'AutoCaptionX' offers a service to add captions to any video the user wants, via the browser. From what you've said, if it worked in-browser that might work with EME.

However, it is probably the legal side that would kill the service: if the provider's Ts & Cs prevent it then they can be sued via the DMCA.

And, veering into the politics a bit, that is frustrating. It is really a political and legal problem, so the EME spec (and therefore W3C) is being used as leverage, it is not the problem itself. 

I hope others read and consider the summary when thinking about EME's impact on accessibility.

Cheers,

-Alastair

________________________________________
From: Judy Brewer <jbrewer@w3.org>
Sent: 05 April 2017 17:30
To: John Foliot; Alastair Campbell
Cc: WCAG; Jonathan Avila
Subject: Re: EME impact on accessibility

Hi Alastair and All,

FYI, there has been review on these and other potential concerns about EME and accessibility in several threads over the months and years.

Team-side, we've been pulling these together into a summary that I hope will be available either later today or tomorrow.

A quick summary for now is that from a perspective of possible blocks to research on accessibility adaptations, for every case that we've looked at, research on these kinds of adaptations is best done in open video streams in any case, and such research is not blocked in any way by the existence, or not, of some form of encryption. And when it comes to application of such adaptations, in every case we've looked at so far, adaptations can be handled the same way that one ordinarily handles passing information back and forth by agreements. By the way, for more traditional types of accessibility information (captions, descriptions, transcripts) in practice those are generally available in the clear in any case, regardless of any encryption, and that's also exactly what the spec calls for; and for the infrequent exceptions (for instance, open captions that are burned into a primary video object), those are decrypted along with the video.

I'm updating a summary document of this kind of info and will bounce a copy of that to this list (among others) as soon as available, in hopes that people don't unnecessarily burn more ink on this. I'd continue to be interested in any questions or concerns once that updated summary is available -- and appreciate your keeping us updated on any concerns.

Thanks,

- Judy

On 4/5/2017 12:10 PM, John Foliot wrote:
Hi Alastair,

That's all huh? Apologies for the following rather lengthy (and occasionally opinionated) response.

TL;DR:

Having had a ring-side seat and as an active participant in much of this debate, here's my biased perspective: despite ongoing claims that EME impacts accessibility, we've not seen any actual proof of that, despite, as you note, the fact that EME has been in browsers for a couple of years now - surely if there were issues concerning accessibility and EME we'd hear of them by now, right? Especially if we've been actively listening for that particular "ping from the cosmos" (as I have been).

And what I have heard is {{crickets}}

**********

Breaking down your questions with what *I* know (and I freely admit I likely don't know it all, but I've been very active here, so...)

>           Captions.

If captions are available they must be un-encrypted, so there shouldn’t be an issue there.

Pretty much. We have two specific scenarios, 1) support materials are provided "out-of-band", and 2) support materials are provided "in-band".

In-band:

What this means (in case you or anyone else reading this is unaware) is that file formats such as MP4 and MKV are actually "wrapper" formats, used to contain media content and related materials. Think of them as similar to CAB files, or even ZIP files, where the container travels across the web as a complete and single entity, and is then expanded or "opened" at the user-end.

So an MP4 file can contain an H.264 encoded video, AAC encoded audio, and, if desired during post-production, additional files (such as TTML/WebVTT files, and/or other manifest files, support files, etc.) can also be included in the "wrapper".

As part of the effort around HTML5's <video> element, there was also an API developed that allows browsers (user agents) to open and extract "track" files from the container wrapper (https://www.w3.org/TR/html5/embedded-content-0.html#audiotracklist-and-videotracklist-objects) although AFAICT it only has support in Apple products today (not surprisingly, as I recall the API was developed by Eric Carlson, an Apple engineer).
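As a rough illustration of the track API mentioned above (a sketch, not taken from the thread): the function below walks a track list and picks out the entries that carry support content. The names `TrackLike`, `TrackListLike` and `listSupportTracks` are hypothetical; the assumption is that in a browser which exposes in-band tracks, `video.textTracks` could be passed in directly.

```typescript
// Minimal structural types so the sketch also runs outside a browser.
// Assumption: in a browser, `video.textTracks` (a TextTrackList) fits TrackListLike.
interface TrackLike {
  kind: string; // e.g. "captions", "subtitles", "descriptions", "metadata"
  label: string;
  language: string;
}
type TrackListLike = ArrayLike<TrackLike>;

// Hypothetical helper: collect the tracks that carry accessibility support content.
function listSupportTracks(tracks: TrackListLike): TrackLike[] {
  const supportKinds = ["captions", "subtitles", "descriptions"];
  const found: TrackLike[] = [];
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    if (supportKinds.indexOf(track.kind) !== -1) {
      found.push(track);
    }
  }
  return found;
}

// Stand-in data; a real page would read these from the media element instead.
const sampleTrackList: TrackLike[] = [
  { kind: "captions", label: "English captions", language: "en" },
  { kind: "metadata", label: "", language: "" },
  { kind: "descriptions", label: "English descriptions", language: "en" },
];

for (const track of listSupportTracks(sampleTrackList)) {
  console.log(track.kind + ": " + track.label);
}
```

Whether a given browser actually surfaces in-band tracks this way would need to be checked per user agent.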

In this scenario, the EME spec states:

Unencrypted In-band Support Content

In-band support content, such as captions, described audio, and transcripts, should not be encrypted.

NOTE

Decryption of such tracks - especially such that they can be provided back the user agent - is not generally supported by implementations. Thus, encrypting such tracks would prevent them from being widely available for use with accessibility features in user agent implementations.

After asking about this, my understanding is this: the "MP4" file can be encrypted and decrypted via the EME API, however all content inside the wrapper must be unencrypted. Since content "entering" a browser environment (user agent) will first decrypt the content "at the door", the opened or expanded wrapper container will actually provide content to the end user - unencrypted. In other words, once you have access to the content inside the wrapper file, none of that content is further blocked from user interaction. Failing to have legal access to the content inside of the MP4 means *all* content is blocked to the end user - disabled or otherwise. On the other hand, if you have a legal right to access the content inside the MP4 wrapper, you have access to *ALL* of it, including the support materials.

Out-of-band:

Now the other means of providing support materials is "out-of-band", which I think most folks conceptually understand, as this is the use-case for introducing the <track> child element of <video>.
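To make the out-of-band case concrete, here is a minimal sketch that builds <video> markup with one <track> child per support file. The helper names (`SupportFile`, `escapeAttr`, `renderVideoMarkup`) and the file names are placeholders for illustration, not anything from a real deployment.

```typescript
// Hypothetical description of one out-of-band support file.
interface SupportFile {
  kind: "captions" | "subtitles" | "descriptions";
  src: string; // URL of a WebVTT file, fetched separately from the video
  srclang: string;
  label: string;
}

// Escape characters that are unsafe inside a double-quoted attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Build <video> markup with one <track> child per support file.
function renderVideoMarkup(videoSrc: string, files: SupportFile[]): string {
  const trackLines = files.map(
    (f) =>
      '  <track kind="' + f.kind + '" src="' + escapeAttr(f.src) +
      '" srclang="' + escapeAttr(f.srclang) +
      '" label="' + escapeAttr(f.label) + '">'
  );
  const opening = '<video controls src="' + escapeAttr(videoSrc) + '">';
  return [opening].concat(trackLines, ["</video>"]).join("\n");
}

// File names below are placeholders, not real resources.
const exampleMarkup = renderVideoMarkup("movie.mp4", [
  { kind: "captions", src: "captions.en.vtt", srclang: "en", label: "English captions" },
  { kind: "descriptions", src: "descriptions.en.vtt", srclang: "en", label: "English descriptions" },
]);
console.log(exampleMarkup);
```

The point of the sketch is only that each support file has its own URL and travels separately from the video resource.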

In this scenario, support files are referenced via the <track> element, and those files are delivered to the user agent independently, in a fashion similar to how .jpg or other graphic files aren't "embedded" (OLE) in web pages, but rather are referenced by the code, and the referenced file travels over the net as a discrete file. In this scenario, the EME spec states:

Implementations that choose to support encrypted support content must provide the decrypted data to the user agent to be processed in the same way as equivalent unencrypted timed text tracks<https://www.w3.org/TR/html51/semantics-embedded-content.html#timed-text-tracks>.

...and so, by design, EME allows (demands?) for unencrypted text tracks in the user-agent (browser). In other words, all of the decryption happens before the content even renders in the browser, and once rendered in the browser, the end-user can interact with that content "unimpeded" (with the exception that the streamed content can only be "viewed" and not saved.) Remember, EME is for *streaming video* only, and cannot be re-purposed for other uses today (AFAIK):

"This proposal extends HTMLMediaElement<https://www.w3.org/TR/html51/semantics-embedded-content.html#htmlmediaelement-htmlmediaelement> [HTML51<https://www.w3.org/TR/encrypted-media/#bib-HTML51>] providing APIs to control playback of encrypted content." (https://www.w3.org/TR/encrypted-media/)
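
For readers who have not seen the API, a hedged sketch of the configuration a page hands to EME when it asks for a key system. The interface and function names (`CapabilityLike`, `KeySystemConfigLike`, `buildClearKeyConfig`) are hypothetical stand-ins for the spec's configuration dictionary, and the codec strings are example values only. As far as I can tell, the dictionary negotiates audio and video capabilities and has no entry for text tracks.

```typescript
// Structural copy of the fields used here from the EME configuration dictionary.
// Assumption: the real dictionary has further optional members not shown.
interface CapabilityLike {
  contentType: string;
}
interface KeySystemConfigLike {
  initDataTypes: string[];
  audioCapabilities: CapabilityLike[];
  videoCapabilities: CapabilityLike[];
}

// Hypothetical helper: one candidate configuration for encrypted MP4 content.
function buildClearKeyConfig(): KeySystemConfigLike[] {
  return [
    {
      initDataTypes: ["cenc"],
      audioCapabilities: [{ contentType: 'audio/mp4; codecs="mp4a.40.2"' }],
      videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
    },
  ];
}

// In a browser this would be handed to the EME entry point, roughly:
//   const access = await navigator.requestMediaKeySystemAccess(
//     "org.w3.clearkey", buildClearKeyConfig());
//   const mediaKeys = await access.createMediaKeys();
//   await video.setMediaKeys(mediaKeys);
console.log(JSON.stringify(buildClearKeyConfig(), null, 2));
```

Nothing in this sketch touches <track> files; those are fetched and rendered by the browser as ordinary timed text.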

************

Continuing with your other questions/scenarios:


Audio description.
I assume audio-description would simply be a separate audio stream or separate video, I don’t see an issue there.


Essentially covered by the same logic and principles as captions/sub-titles, but only a different file format. Video description can be provided via a separate audio stream included inside the wrapper format or referenced using the <audio> element and "slaved" to the video; conversely 'text' descriptions (a relatively new possibility) which are then 'processed' by TTS engines can also be included inside the wrapper format (in-band) or referenced via the child <track> element of <video> (out-of-band). (This also holds true for Transcripts BTW...)

Audio descriptions can be provided, either as a separate track embedded in the video stream, or a separate audio track in an audio<https://www.w3.org/TR/html5/embedded-content-0.html#the-audio-element> element slaved<https://www.w3.org/TR/html5/embedded-content-0.html#slaved-media-elements> to the same controller as the video<https://www.w3.org/TR/html5/embedded-content-0.html#the-video-element> element(s), or in text form using a WebVTT file<https://www.w3.org/TR/html5/infrastructure.html#webvtt-file> referenced using the track<https://www.w3.org/TR/html5/embedded-content-0.html#the-track-element> element and synthesized into speech by the user agent
(https://www.w3.org/TR/html5/embedded-content-0.html#the-video-element)



Enlargement of content.
I’m not sure how this is affected. The video is encrypted, but I believe that its size can be adjusted within a page. Captions and the timings that drive them are not encrypted so should not be affected by EME.

Enlargement (vaguely defined here) would be a function of the browser, and enacted/provided post decryption by the browser.

EME is the "greeter" at the front door of the browser - once you clear EME (i.e. you are 'authenticated/cleared' to view the protected premium content), EME then "gets out of the way" and allows browsers to do what they do. (A very quick check with one browser and one source - Netflix in Chrome on Windows - confirms to me that I cannot "zoom" or enlarge the content on my screen - but then again, I can't do that on my TV either...). Bottom line: this appears to be a constraint of the browser, and not introduced nor impeded by EME (but I suspect more testing would be required there to categorically prove or disprove the assertion).


Auto captioning of the audio stream.
So encrypting the video & audio would (theoretically at least) prevent a 3rd party from running auto-captioning software on the audio.

However, the companies with the capability to do that (Youtube, Microsoft, Amazon etc) are very closely correlated with the companies applying the DRM. Would this be an issue in practice? Presumably the responsibility for providing captions is on the provider who has the non-encrypted copy, therefore they are not prevented from auto-captioning?

This is an interesting use-case. In principle, I suppose that "agents" (be they humans or APIs) not authorized to consume content will not be able to perform functions like this. However, as you note, this only means that the content creator is otherwise obligated to provide the captions to remain "lawful" w.r.t. providing accessibility support of video content.

EME was conceived primarily to protect "premium content" (i.e. commercially produced entertainment content), and while in theory it could be applied to *all* video content, there is a cost/benefit ratio involved that acts as a bit of a filter (access to the CDM - Content Decryption Module - is a licensed activity, and has a cost associated to it borne by the content owner). Additionally, while speech-to-text continues to improve at a near-daily pace, as accessibility professionals we know that the accuracy of this technology today is less than ideal.

Finally, by logic (but untested), it would seem that once you have satisfied the "right to consume" requirement that the DRM imposes (and is processed via the EME API), that all content is then "unencumbered" by the encryption when rendered in the browser, so in theory at least the audio could then be "listened to" and converted to text. Alastair, are you aware of any actual instances where this has proven to be an issue?



Facial recognition.
I’m not entirely sure what the purpose of this would be, identifying people/actors/characters as they come and go? I can tell Amazon already has that information as meta-data for their videos as the interface can show you who is in the scene. I suspect they add that with a more manual process though, as it doesn’t match whether the face is on screen or not, just whether they are in the scene.
Theoretically this would prevent 3rd party access to facial recognition, but is it something that would be the responsibility of the provider anyway? Not sure.

Again... to me, this just seems to be grasping at straws.

I suspect you'd have to construct a very complex use-case to show how this specifically and explicitly was an "accessibility issue". I'd be happy to hear that tap-dance however, but I cannot envision one myself. Does anyone here know of a software tool or accessibility requirement that is dependent on facial recognition? (Frankly, my opinion is that anti-EME proponents will throw anything and everything against the wall because they just fundamentally disagree with the premise which spawned EME in the first place, which is: Premium content owners are permitted by law to restrict and control access to digital files they have invested in and created, as part of a for-profit enterprise. I'm very much a Free as in speech, but not as in beer kind of guy).


Color filtering.
On iOS (at least) colour filtering can be done at the hardware level, and if you have colour issues then presumably you’d want it on all the time, not just for videos?

(Also referenced as "Daltonization" by Cory Doctorow and others).

At first, this seemed to be a potential "Did we miss this?" question at the APA WG (and among the participants of the Media Accessibility Task Force who created the MAUR). As a well known "anti-Apple" kind of guy, I cannot speak to the iOS mechanism, but I did do some testing last summer around this concern. As you note, PwD who require specialized color palettes to meet their visual impairments, will likely require this for *all* content consumed, and not just Premium video content.

I knew that ZoomText allowed for user-specified color palettes in the browser, and so I again went to Netflix, launched a video, and then "applied" a customized color palette via ZoomText. Sure enough, it "worked" (where worked = the visual interface was modified by the software to provide the 'required' or specified color modifications). These changes were applied to both the "chrome" (user controls) as well as the content rendered in the view-port of the video player - even when I went "full screen". (I am unsure of *how* ZoomText achieves this, but it appeared that an overlay filter of sorts was invoked, as when I attempted to do a screen capture, the capture "lost" the colorization - I had to take a photo of the screen with the colorization as "proof")

And so, based upon the following user-story/requirement ("As a person with visual impairments, I need to be able to modify the color palette of content in my browser window to those that meet my needs"), I was able to demonstrate that I was able to meet that requirement. Whether or not this is the same with the "hardware" solution provided by iOS I am unsure, but at this time I would chalk that up to an issue with the user-agent, and not because of EME per-se (because I was able to successfully address the user-story/requirement using software on my rig).

Do we need more testing and investigation here? Likely, and there is an effort inside of the W3C to continue to do this type of testing, and gathering of data. (If you are interested in being involved in that effort, ping me and all help gratefully accepted.)

**********

<rant>
I am fed-up, up-to-here, with anti-EME proponents playing the scary "accessibility" card for political gain, without spending the time or effort supporting their claims.

They are relying not on logic or evidence, but rather on non-accessibility-experts' 'fear' that they may run afoul of the law with regard to digital content. W3C protocol 'forbids' me from casting aspersions on specific fellow W3C colleagues, but it is my personal opinion that many of the more vocal EME opponents really don't care that much about PwD's needs on the web, but rather simply see that this is but an easy and simple means of casting doubt and confusion around EME, because they don't like the politics of it. We then see others echo "accessibility concerns" without specifics in their responses as a reason to not advance the EME API Spec at the W3C.

That angers me to no end!   It trivializes and politicizes the real issues and problems PwD experience on the web today - without once providing evidence that EME has a negative impact on those people. It "sounds" bad, ergo it must be bad.

Bull feathers!!!
</rant>

JF

On Wed, Apr 5, 2017 at 8:22 AM, Alastair Campbell <acampbell@nomensa.com<mailto:acampbell@nomensa.com>> wrote:
Hi everyone,

I’m trying to get some information to make a choice without getting into a bun-fight on a contentious topic. I’d like to get to the facts of the situation without talking about the good/bad of EME in general, so please bear that in mind.

Background:
The W3C has “Encrypted Media Extensions” [1] at Proposed Recommendation stage, the spec that defines the API from the browser to a DRM module. Several W3C members are objecting to it on the grounds of the impact it has on security and accessibility.

Questions:
What I’d like to focus on is the theoretical and practical implications for accessibility. For example, from my reading:


-          Captions.
If captions are available they must be un-encrypted, so there shouldn’t be an issue there.


-          Audio description.
I assume audio-description would simply be a separate audio stream or separate video, I don’t see an issue there.

Other items raised by people to do with accessibility are as follows, with my own comments under the item:


-          Enlargement of content.
I’m not sure how this is affected. The video is encrypted, but I believe that its size can be adjusted within a page. Captions and the timings that drive them are not encrypted so should not be affected by EME.


-          Auto captioning of the audio stream.
So encrypting the video & audio would (theoretically at least) prevent a 3rd party from running auto-captioning software on the audio.

However, the companies with the capability to do that (Youtube, Microsoft, Amazon etc) are very closely correlated with the companies applying the DRM. Would this be an issue in practice? Presumably the responsibility for providing captions is on the provider who has the non-encrypted copy, therefore they are not prevented from auto-captioning?


-          Facial recognition.
I’m not entirely sure what the purpose of this would be, identifying people/actors/characters as they come and go? I can tell Amazon already has that information as meta-data for their videos as the interface can show you who is in the scene. I suspect they add that with a more manual process though, as it doesn’t match whether the face is on screen or not, just whether they are in the scene.
Theoretically this would prevent 3rd party access to facial recognition, but is it something that would be the responsibility of the provider anyway? Not sure.


-          Color filtering.
On iOS (at least) colour filtering can be done at the hardware level, and if you have colour issues then presumably you’d want it on all the time, not just for videos?

Given that EME has been implemented in browsers for several years, the question is whether the W3C blesses the spec, and I’d like some solid information on the accessibility aspects before commenting.

Kind regards,

-Alastair

[1] https://www.w3.org/TR/2017/PR-encrypted-media-20170316/





--
John Foliot
Principal Accessibility Strategist
Deque Systems Inc.
john.foliot@deque.com<mailto:john.foliot@deque.com>

Advancing the mission of digital accessibility and inclusion
Received on Wednesday, 5 April 2017 22:36:20 UTC