FW: Comments on w3ctag/eme/ from Paul Cotton on 2014-02-25 (public-html-media@w3.org from February 2014)

From: Paul Cotton <Paul.Cotton@microsoft.com>
Date: Tue, 25 Feb 2014 15:33:40 +0000
To: "public-html-media@w3.org" <public-html-media@w3.org>
CC: "Jeni Tennison (jeni@jenitennison.com)" <jeni@jenitennison.com>
Message-ID: <2fabebcd11d8405b867ebc1d7fa4b35c@BL2PR03MB418.namprd03.prod.outlook.com>
The W3C TAG is discussing EME via a DRAFT position paper on "Proprietary Extensions to the Web":
https://github.com/w3ctag/eme/blob/master/README.md


" Over the web's 25 years there have been several technologies and architectures which have had the effect of restricting access for some people to portions of the web. This document explores how these work and the effect they had on the web, with the ultimate goal of aiming to inform the debate about the inclusion of Encrypted Media Extensions (EME) in HTML."

See also the thread starting with Henri Sivonen's comments (below) and at:
http://lists.w3.org/Archives/Public/www-tag/2014Feb/0057.html and
Jeni Tennison's reply at:
http://lists.w3.org/Archives/Public/www-tag/2014Feb/0074.html 

I suggest that Media TF members make any substantive comments on the DRAFT position paper on the TAG list rather than in a Reply to this message.

/paulc

Paul Cotton, Microsoft Canada
17 Eleanor Drive, Ottawa, Ontario K2E 6A3
Tel: (425) 705-9596 Fax: (425) 936-7329

-----Original Message-----
From: Henri Sivonen [mailto:hsivonen@hsivonen.fi] 
Sent: Wednesday, February 19, 2014 8:50 AM
To: www-tag
Subject: Comments on w3ctag/eme/

Comments on the "initial opinion" version of https://github.com/w3ctag/eme/blob/master/README.md ; quotes from that
document:

(The intro looks OK.)

> Content that is accessed through a plugin is also rarely accessible to 
> other technologies in use on the web. It cannot be linked to from 
> other pages (reducing the number of links that could be made, and thus 
> reducing the value of the web).

This depends on whether the entire site navigation is done within the plug-in. It's probably more common for a plug-in to appear on each linkable page as on YouTube than for the entire site navigation to be done within the plug-in. On the other hand, it's quite possible to put the entire site navigation inside a plug-in-free Open Web Platform program.

Therefore, I don't think non-linkability is a major plug-in-specific disadvantage.

> One basic principle for web technologies that enables the web to grow 
> and be valuable and prevents it from fragmenting is platform 
> independence

Indeed.

> The Encrypted Media Extensions defined in the Encrypted Media 
> Extensions Working Draft aim to provide a common, platform independent 
> interface to enable sharing of encrypted content on the web.

I think this characterization is shameful.

As I have said earlier on this list, it is technically correct that EME involves "encrypted" content, but talking about encryption evokes the wrong connotations about who the adversary is. (Typically, encryption is used against a third party on the network. In the case of EME, the user is the adversary against whom encryption is used.) Describing the "aim" of EME without saying "DRM" up front is grossly misleading.

Furthermore, saying that the "aim" of EME is to "enable sharing of encrypted content" is Newspeakish, when the stated goal of DRM is to
*disable* sharing of content.

> So when considering encrypted media on the web, we have to ensure that 
> the "black box" of encryption technologies is platform independent, as 
> well as the interface that surrounds it.

Calling DRM implementations "encryption technologies" amounts to weasel words.

> The EME specification must ensure platform independence, for example 
> by specifying the use of open standard encryption algorithms.

This statement is naïve in a way similar to diagnosing the interoperability problems with .doc files as stemming from insufficient use of XML, when .docx retains the same higher-level complexity.

While EME leaves the encryption part open-ended, both the MP4 CENC and WebM encryption methods non-normatively (because the W3C doesn't in general reference video formats normatively) referenced by EME are based on 128-bit AES-CTR, which is publicly documented and standardized.

Open standard encryption *algorithms* don't result in platform independence, when the encryption *keys* are withheld from you based on policies that discriminate by platform.

The design of EME is primarily informed by PlayReady. Other DRMs are free to adapt to PlayReady's shape to become EME-compatible, but it's instructive to look at PlayReady, since EME is made for PlayReady-shaped DRMs. PlayReady is technically platform-independent, but the policy level paints a different picture. Take a look at
http://go.microsoft.com/fwlink/?LinkID=122671 :
Section 2.1 says that the key parts of the license don't apply to "PC Software", which is defined in Section 1.5 to include, for example, software running on Windows, OS X, desktop Linux and Solaris. Now, there's a different SDK for Windows with different terms, but tough luck for non-Windows PC Software (except Microsoft's own Silverlight for Mac).

And that's ignoring royalties and such.

I don't know what crypto algorithms PlayReady uses apart from AES-CTR, but the public documentation strongly suggests that a PKI is involved.
It would be surprising if such a PKI wasn't based on open-standard crypto algorithms such as RSA. But even if you implemented all the right algorithms, your implementation wouldn't actually have utility unless you got your keys signed such that they chain to the root of trust of the PKI. That is, policy is enforced by refusing to sign your keys—not by algorithms being non-standard.
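
To make the chain-of-trust point concrete, here is a minimal sketch. It is a toy model only: "signing" is recorded as an issuer name with no real cryptography, and every name in it (drm-root, vendor-intermediate, licensed-device, independent-client, licenseServerIssuesKey) is hypothetical and not taken from PlayReady or any other DRM. It only illustrates that a client can be refused purely because nobody signed its key.

```typescript
// Toy model only: a "certificate" records who signed a key, with no real cryptography.
interface Certificate {
  subject: string;
  issuer: string | null; // null means nobody agreed to sign this key
}

// Walk issuer links until a trusted root is reached or the chain breaks.
function chainsToRoot(
  subject: string,
  certs: Map<string, Certificate>,
  trustedRoots: Set<string>
): boolean {
  const seen = new Set<string>(); // guards against issuer loops
  let current: string | null = subject;
  while (current !== null && !seen.has(current)) {
    if (trustedRoots.has(current)) {
      return true;
    }
    seen.add(current);
    const cert = certs.get(current);
    current = cert ? cert.issuer : null;
  }
  return false;
}

// Hypothetical PKI contents.
const certs = new Map<string, Certificate>([
  ["licensed-device", { subject: "licensed-device", issuer: "vendor-intermediate" }],
  ["vendor-intermediate", { subject: "vendor-intermediate", issuer: "drm-root" }],
  // Assumed to implement every algorithm correctly, but its key was never signed.
  ["independent-client", { subject: "independent-client", issuer: null }],
]);
const trustedRoots = new Set<string>(["drm-root"]);

// Hypothetical license-server policy: hand out content keys only to chained clients.
function licenseServerIssuesKey(client: string): boolean {
  return chainsToRoot(client, certs, trustedRoots);
}
```

In this model the refusal for independent-client comes entirely from the missing signature, which is a policy decision, and not from any algorithm being non-standard.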

> DRM systems usually prohibit any manipulations with content including 
> displaying third-party subtitles,

As far as I can tell, nothing in EME itself or in shipped implementations prevents compositing HTML/CSS content over a video that uses EME-involved DRM. In fact, if you use Netflix with IE11 on Windows 8.1, you'll find that the service *relies* on being able to composite HTML/CSS content over EME-involved video frames.

> or (in case of e-books) reading a book using system voice-over engine, 
> thus making a content less accessible by disabled people.

True, but how is this relevant to EME as long as browsers don't provide client-side speech recognition or computer vision-based description for DRMless videos?

> In the case of EME, according to the layering principle, it should be 
> possible to specify a set of primitive APIs that would enable a DRM 
> system to be built in Javascript. Many of the pieces that would be 
> needed for this are already in progress, such as:
>
> * Streams API
> * Crypto API (with access to hardware crypto and tokens)
> * WebAudio API
> * Canvas (with tainting) for video output

The discussion of layering misses the point on at least three counts:
 1) EME integrates into the HTML5 video stack better than the APIs mentioned.
 2) The more layers a CDM subsumes, the better from the studio perspective.
 3) The royalty economics don't favor JS-based solutions from the site perspective.

Let's look at each one in turn:

1) The Open Web Platform does not actually provide API surface for a JavaScript program to receive elementary stream data (unencrypted or encrypted) from the video stack and to return video frames as ArrayBuffers or WebGL textures to the video stack. Yet, an EME CDM gets to plug into the media stack on this sort of level. The point of EME is to reuse the HTML5 media stack to the maximum extent DRM Robustness & Compliance rules allow. On the other hand, if you want to implement codecs that output to Canvas or the Web Audio API, you have to also re-implement the rest of the media stack in JS. In this sense, EME is better at layering than what the draft opinion puts forward.

2) If EME was truly about "encryption technologies" and not about DRM, to layer things properly assuming CENC files, you'd just stick the "[de]cryption technology" between the MP4 demultiplexer and the code that deals with unencrypted codec data in the existing media stack and you'd be done. However, in such a properly layered design the "[de]cryption technology" would only have a chance to hide the decryption key from the rest of the code. The decrypted elementary stream data would be exposed to the User Agent, and in the DRM threat model, the user is the adversary and the copyright holders tend to be interested in hiding more than just the key.
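
The "properly layered" baseline described above can be sketched roughly as follows. This is an illustration under loud assumptions: XOR with a repeating key stands in for AES-CTR so that the sketch needs no crypto library, the one-byte "container header" is made up, and all function names (xorWithKey, makeDecryptionStage, demux, decode) are hypothetical. The only thing it is meant to show is where the key lives and where the decrypted data becomes visible.

```typescript
// Toy stand-in for AES-CTR: XOR with a repeating key. Not secure, illustration only.
function xorWithKey(data: Uint8Array, key: Uint8Array): Uint8Array {
  return data.map((byte, i) => byte ^ (key[i % key.length] ?? 0));
}

// The decryption layer is the only stage that ever holds the key.
function makeDecryptionStage(key: Uint8Array): (encrypted: Uint8Array) => Uint8Array {
  return (encrypted) => xorWithKey(encrypted, key);
}

// Placeholder stages owned by the User Agent's existing media stack.
function demux(containerBytes: Uint8Array): Uint8Array {
  return containerBytes.subarray(1); // drop a made-up 1-byte header
}
function decode(elementaryStream: Uint8Array): string {
  return `decoded ${elementaryStream.length} bytes`;
}

// Build a fake encrypted file: header byte followed by the encrypted payload.
const contentKey = new Uint8Array([0x5a, 0xc3]);
const cleartext = new Uint8Array([1, 2, 3, 4]);
const encryptedPayload = xorWithKey(cleartext, contentKey);
const fakeFile = new Uint8Array(1 + encryptedPayload.length);
fakeFile.set(encryptedPayload, 1);

// demultiplexer -> decryption layer -> decoder
const decryptStage = makeDecryptionStage(contentKey);
// The key stays inside decryptStage, but this value is plainly readable by the User Agent.
const visibleToUserAgent = decryptStage(demux(fakeFile));
const picture = decode(visibleToUserAgent);
```

In this arrangement the key is the only secret; the decrypted elementary stream passes through ordinary User Agent code on its way to the decoder.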

Please take a look at the bulleted list at https://hsivonen.fi/eme/ .
Having the CDM merely decrypt is just a theoretical baseline. In practice, a marketable CDM needs to also do decoding inside the black box, which already conflates two layers. But that exposes the pixels to the User (i.e. Adversary) Agent. To hide those and thereby get access to a broader range of content, you need to conflate the layers even more. Therefore, clean layering and DRM are at odds.

3) Now, what if the Open Web Platform had an API to enable Worker-backed codecs and Worker-backed CDMs? A JavaScript program would run in a Worker and the browser would send a piece of unencrypted (in the mere codec case) or encrypted (in the CDM case) elementary stream data to the Worker and the Worker would put YUV frame data in an ArrayBuffer at some offset and tell the browser which ArrayBuffer and what offset. This way, the ArrayBuffer could be an asm.js heap. Or, if WebGL in Workers happens, the Worker could put the frame data in a WebGL texture.
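
As a rough sketch of the message flow imagined above: no such API exists in the Open Web Platform, so everything here (FrameLocation, ToyWorkerCodec, readFrame, the tiny frame size) is hypothetical, and the "decoding" is a placeholder that just fills the frame with one byte. A plain class stands in for the Worker so that the example runs on its own.

```typescript
// Hypothetical reply from the Worker: which ArrayBuffer, and where the frame sits in it.
interface FrameLocation {
  heap: ArrayBuffer;
  offset: number;     // byte offset of the frame inside the heap
  byteLength: number; // size of one frame
}

const FRAME_WIDTH = 4;
const FRAME_HEIGHT = 2;
// YUV 4:2:0: one full-resolution Y plane plus quarter-resolution U and V planes.
const FRAME_BYTES =
  FRAME_WIDTH * FRAME_HEIGHT + 2 * (FRAME_WIDTH / 2) * (FRAME_HEIGHT / 2);

class ToyWorkerCodec {
  // Stands in for the asm.js heap of the site-supplied program.
  private readonly heap = new ArrayBuffer(64);
  private nextOffset = 0;

  // Called once per piece of elementary stream data sent by the browser.
  decode(chunk: Uint8Array): FrameLocation {
    if (this.nextOffset + FRAME_BYTES > this.heap.byteLength) {
      this.nextOffset = 0; // reuse the heap from the start when it is full
    }
    const offset = this.nextOffset;
    const frame = new Uint8Array(this.heap, offset, FRAME_BYTES);
    // Placeholder "decoding": fill the frame with the chunk's first byte.
    frame.fill(chunk[0] ?? 0);
    this.nextOffset += FRAME_BYTES;
    return { heap: this.heap, offset, byteLength: FRAME_BYTES };
  }
}

// Browser side of the sketch: read the frame back from the reported location.
function readFrame(location: FrameLocation): Uint8Array {
  return new Uint8Array(location.heap, location.offset, location.byteLength);
}

const codec = new ToyWorkerCodec();
const firstFrame = codec.decode(new Uint8Array([7]));
const secondFrame = codec.decode(new Uint8Array([9]));
```

Here the browser never copies frame data out of a message; it only learns the buffer and the offset, which is what would let the buffer double as an asm.js heap.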

Awesome, right? Now browsers wouldn't have to provide DRM. A site could provide DRM as an asm.js program and, in principle, make it as obscure as any native code disassembly.

Well, three problems:

 a) You'd only get to conflate the decryption and decoding layers. You couldn't compete with the kind of device-resident DRM that hides the pixels from the browser.

 b) You couldn't leverage hardware H.264 decoders, so the solution would be less competitive on battery life (at least) than device-resident CDMs.

 c) The site would be liable for third-party royalties that whoever distributes the CDM to end users would be subject to, starting with H.264 decoder royalties in a world where sites have a substantial investment in using H.264 and device-resident CDMs use H.264. In the case of device-resident CDMs, royalty liabilities arising from CDM distribution are not the site's problem.

The mention of the possibility of an asm.js program being able to be obscure leads to my last point:

> such as a secure JS worker that would only have access to a narrowly 
> limited set of APIs, and would run in a special context that could not 
> be inspected by the user

This is a terrible idea compared to EME. Please take this out of the opinion.

If the browser is responsible for making the program running in the Worker non-inspectable, the "robustness" (i.e. the capacity of the solution to resist attempts by the end-user to read or write the data inside the DRM box) of the solution hinges on the browser and the browser itself falls into the DRM realm. This is worse than EME, which maintains a separation between the User Agent (trusted by the user) and the CDM (trusted by studios).

Since you seem to be worried about the browser becoming subject to anti-circumvention law, such as the DMCA, you shouldn't want the browser to fall into the DRM realm. That is, you shouldn't want the robustness of the solution to hinge on the browser.

If you want a Worker-based solution, you should suggest that the Worker be explicitly an inspectable white box and leave it to the site-supplied JS program running in the provided environment to be sufficiently obscure to defy inspection.

--
Henri Sivonen
hsivonen@hsivonen.fi
https://hsivonen.fi/
Received on Tuesday, 25 February 2014 15:34:10 UTC