Comments on w3ctag/eme/ from Henri Sivonen on 2014-02-19 (www-tag@w3.org from February 2014)

From: Henri Sivonen <hsivonen@hsivonen.fi>
Date: Wed, 19 Feb 2014 15:50:02 +0200
To: www-tag <www-tag@w3.org>
Message-ID: <CANXqsR+qX4F7oAOET_PEtspko6mK45qW3PE3rUi8XyFvdw4Feg@mail.gmail.com>

Comments on the "initial opinion" version of
https://github.com/w3ctag/eme/blob/master/README.md ; quotes from that
document:

(The intro looks OK.)

> Content that is accessed through a plugin is also rarely accessible to other
> technologies in use on the web. It cannot be linked to from other pages
> (reducing the number of links that could be made, and thus reducing the
> value of the web).

This depends on whether the entire site navigation is done within the
plug-in. It's probably more common for a plug-in to appear on each
linkable page, as on YouTube, than for the entire site navigation to be
done within the plug-in. On the other hand, it's quite possible to put
the entire site navigation inside a plug-in-free Open Web Platform
program.

Therefore, I don't think non-linkability is a major plug-in-specific
disadvantage.

> One basic principle for web technologies that enables the web to grow
> and be valuable and prevents it from fragmenting is platform independence

Indeed.

> The Encrypted Media Extensions defined in the Encrypted Media Extensions
> Working Draft aim to provide a common, platform independent interface to
> enable sharing of encrypted content on the web.

I think this characterization is shameful.

As I have said earlier on this list, it is technically correct that
EME involves "encrypted" content, but talking about encryption evokes
the wrong connotations about who the adversary is. (Typically,
encryption is used against a third party on the network. In the case
of EME, the user is the adversary against whom encryption is used.)
Describing the "aim" of EME without saying "DRM" up front is grossly
misleading.

Furthermore, saying that the "aim" of EME is to "enable sharing of
encrypted content" is Newspeakish, when the stated goal of DRM is to
*disable* sharing of content.

> So when considering encrypted media on the web, we have to ensure that
> the "black box" of encryption technologies is platform independent, as well
> as the interface that surrounds it.

Calling DRM implementations "encryption technologies" amounts to weasel words.

> The EME specification must ensure platform independence, for example
> by specifying the use of open standard encryption algorithms.

This statement is naïve in a way similar to diagnosing the
interoperability problems of .doc files as stemming from insufficient
use of XML, when .docx retains the same higher-level complexity.

While EME leaves the encryption part open-ended, both the MP4 CENC and
WebM encryption methods non-normatively (because the W3C doesn't in
general reference video formats normatively) referenced by EME are
based on 128-bit AES-CTR, which is publicly documented and
standardized.

Open standard encryption *algorithms* don't result in platform
independence, when the encryption *keys* are withheld from you based
on policies that discriminate by platform.
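
As a rough illustration of that point, here is a minimal Python
sketch. Everything in it is hypothetical: the platform names, the
policy table and the key are made up, and a SHA-256-based counter-mode
keystream stands in for AES-CTR only because the Python standard
library has no AES. The decryption routine is fully public; what
decides whether playback works is whether the key is released.

```python
import hashlib
from typing import Optional


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Counter-mode XOR. SHA-256 is a stand-in for the AES block cipher."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[start:start + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


# Hypothetical license policy: only allow-listed platforms get the key.
ALLOWED_PLATFORMS = {"platform-a"}
CONTENT_KEY = bytes(range(16))  # made-up 128-bit key
NONCE = b"\x00" * 8


def request_key(platform: str) -> Optional[bytes]:
    """Toy license server: withholds the key based on platform alone."""
    return CONTENT_KEY if platform in ALLOWED_PLATFORMS else None


def play(platform: str, ciphertext: bytes) -> Optional[bytes]:
    """Every platform runs the same public algorithm; only the key differs."""
    key = request_key(platform)
    if key is None:
        return None  # the algorithm is available, the key is not
    return keystream_xor(key, NONCE, ciphertext)


CIPHERTEXT = keystream_xor(CONTENT_KEY, NONCE, b"elementary stream data")
```

In this toy model, play("platform-a", CIPHERTEXT) recovers the
plaintext, while play("platform-b", CIPHERTEXT) returns None even
though both callers have identical code.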

The design of EME is primarily informed by PlayReady. Other DRMs are
free to adapt to PlayReady's shape to become EME-compatible, but it's
instructive to look at PlayReady, since EME is made for
PlayReady-shaped DRMs. PlayReady is technically platform-independent,
but the policy level paints a different picture. Take a look at
http://go.microsoft.com/fwlink/?LinkID=122671 :
Section 2.1 says that the key parts of the license don't apply to "PC
Software", which is defined in Section 1.5 to include, for example,
software running on Windows, OS X, desktop Linux and Solaris. Now,
there's a different SDK for Windows with different terms, but tough
luck for non-Windows PC Software (except Microsoft's own Silverlight
for Mac).

And that's ignoring royalties and such.

I don't know what crypto algorithms PlayReady uses apart from AES-CTR,
but the public documentation strongly suggests that a PKI is involved.
It would be surprising if such a PKI wasn't based on open-standard
crypto algorithms such as RSA. But even if you implemented all the
right algorithms, your implementation wouldn't actually have utility
unless you got your keys signed such that they chain to the root of
trust of the PKI. That is, policy is enforced by refusing to sign your
keys—not by algorithms being non-standard.
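
The following Python sketch illustrates the general shape of that
argument. It is not a description of PlayReady, whose internals I
don't know. It assumes a single root key and uses textbook RSA with
tiny, insecure toy numbers; all names are hypothetical. The
verification algorithm is public and standard, yet a client whose key
the vendor declined to sign can never pass it.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# Toy textbook-RSA root key (p=61, q=53). Far too small for real use.
ROOT_N, ROOT_E = 3233, 17  # public: anyone can verify signatures
ROOT_D = 2753              # private: assumed to be held only by the vendor


def digest(public_key: bytes) -> int:
    """Hash the client key and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big") % ROOT_N


@dataclass
class ClientCert:
    public_key: bytes
    signature: Optional[int]  # None means the vendor declined to sign


def vendor_sign(public_key: bytes) -> int:
    """Only the holder of ROOT_D can produce this value."""
    return pow(digest(public_key), ROOT_D, ROOT_N)


def chains_to_root(cert: ClientCert) -> bool:
    """Public check a license server could run before releasing keys."""
    if cert.signature is None:
        return False
    return pow(cert.signature, ROOT_E, ROOT_N) == digest(cert.public_key)


licensed = ClientCert(b"licensed-client", vendor_sign(b"licensed-client"))
independent = ClientCert(b"independent-client", None)
```

Here chains_to_root(licensed) is true and
chains_to_root(independent) is false, although both clients could
implement exactly the same standard algorithms.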

> DRM systems usually prohibit any manipulations with content including
> displaying third-party subtitles,

As far as I can tell, nothing in EME itself or in shipped
implementations prevents compositing HTML/CSS content over a video
that uses EME-involved DRM. In fact, if you use Netflix with IE11 on
Windows 8.1, you'll find that the service *relies* on being able to
composite HTML/CSS content over EME-involved video frames.

> or (in case of e-books) reading a book using system voice-over engine,
> thus making a content less accessible by disabled people.

True, but how is this relevant to EME as long as browsers don't
provide client-side speech recognition or computer vision-based
description for DRMless videos?

> In the case of EME, according to the layering principle, it should be
> possible to specify a set of primitive APIs that would enable a DRM
> system to be built in Javascript. Many of the pieces that would be
> needed for this are already in progress, such as:
>
> * Streams API
> * Crypto API (with access to hardware crypto and tokens)
> * WebAudio API
> * Canvas (with tainting) for video output

The discussion of layering misses the point on at least three counts:
 1) EME integrates into the HTML5 video stack better than the APIs mentioned.
 2) The more layers a CDM subsumes, the better from the studio perspective.
 3) The royalty economics don't favor JS-based solutions from the site
perspective.

Let's look at each one in turn:

1) The Open Web Platform does not actually provide API surface for a
JavaScript program to receive elementary stream data (unencrypted or
encrypted) from the video stack and to return video frames as
ArrayBuffers or WebGL textures to the video stack. Yet, an EME CDM
gets to plug into the media stack on this sort of level. The point of
EME is to reuse the HTML5 media stack to the maximum extent DRM
Robustness & Compliance rules allow. On the other hand, if you want to
implement codecs that output to Canvas or the Web Audio API, you have
to also re-implement the rest of the media stack in JS. In this sense,
EME is better at layering than what the draft opinion puts forward.

2) If EME was truly about "encryption technologies" and not about DRM,
to layer things properly assuming CENC files, you'd just stick the
"[de]cryption technology" between the MP4 demultiplexer and the code
that deals with unencrypted codec data in the existing media stack and
you'd be done. However, in such a properly layered design the
"[de]cryption technology" would only have a chance to hide the
decryption key from the rest of the code. The decrypted elementary
stream data would be exposed to the User Agent, and in the DRM threat
model, the user is the adversary and the copyright holders tend to be
interested in hiding more than just the key.

Please take a look at the bulleted list at https://hsivonen.fi/eme/ .
Having the CDM merely decrypt is just a theoretical baseline. In
practice, a marketable CDM needs to also do decoding inside the black
box, which already conflates two layers. But that exposes the pixels
to the User (i.e. Adversary) Agent. To hide those and thereby get
access to a broader range of content, you need to conflate the layers
even more. Therefore, clean layering and DRM are at odds.

3) Now, what if the Open Web Platform had an API to enable
Worker-backed codecs and Worker-backed CDMs? A JavaScript program
would run in a worker and the browser would send a piece of
unencrypted (in the mere codec case) or encrypted (in the CDM case)
elementary stream data to the Worker and the Worker would put YUV
frame data in an ArrayBuffer at some offset and tell the browser which
ArrayBuffer and what offset. This way, the ArrayBuffer could be an
asm.js heap. Or, if WebGL in Workers happens, the Worker could put the
frame data in a WebGL texture.
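
To make the hand-off concrete, here is a small Python model of it. No
such API exists, so every name is hypothetical: a bytearray plays the
role of the ArrayBuffer (or asm.js heap), worker_decode stands in for
the site-supplied codec or CDM, and the "decoder" just paints a flat
I420 frame instead of decoding anything.

```python
from dataclasses import dataclass


@dataclass
class FrameRef:
    """What the Worker would tell the browser: which bytes hold the frame."""
    offset: int
    length: int
    width: int
    height: int


def i420_size(width: int, height: int) -> int:
    """Full-resolution Y plane plus two quarter-resolution chroma planes."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def worker_decode(heap: bytearray, offset: int, sample: bytes,
                  width: int, height: int) -> FrameRef:
    """Stand-in decoder: writes a flat YUV frame into the heap at offset."""
    length = i420_size(width, height)
    if offset + length > len(heap):
        raise ValueError("frame does not fit in heap")
    luma = sample[0] if sample else 0  # fake "decoding": first byte is luma
    y_size = width * height
    heap[offset:offset + y_size] = bytes([luma]) * y_size
    heap[offset + y_size:offset + length] = bytes([128]) * (length - y_size)
    return FrameRef(offset, length, width, height)


def browser_read(heap: bytearray, ref: FrameRef) -> memoryview:
    """Browser side: view the frame in place, without copying it."""
    return memoryview(heap)[ref.offset:ref.offset + ref.length]


heap = bytearray(1024)  # plays the role of the asm.js heap
ref = worker_decode(heap, 64, b"\x10", width=4, height=2)
frame = browser_read(heap, ref)
```

Note that in this model the browser can read every decoded byte, which
is exactly the limitation discussed in point a) below.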

Awesome, right? Now browsers wouldn't have to provide DRM. A site
could provide DRM as an asm.js program and, in principle, make it as
obscure as any native code disassembly.

Well, three problems:

 a) You'd only get to conflate the decryption and decoding layers. You
couldn't compete with the kind of device-resident DRM that hides the
pixels from the browser.

 b) You couldn't leverage hardware H.264 decoders, so the solution
would be less competitive on battery life (at least) than
device-resident CDMs.

 c) The site would be liable for third-party royalties that whoever
distributes the CDM to end users would be subject to, starting with
H.264 decoder royalties in a world where sites have a substantial
investment in using H.264 and device-resident CDMs use H.264. In the
case of device-resident CDMs, royalty liabilities arising from CDM
distribution are not the site's problem.

The mention of the possibility of an asm.js program being able to be
obscure leads to my last point:

> such as a secure JS worker that would only have access to a narrowly
> limited set of APIs, and would run in a special context that could not
> be inspected by the user

This is a terrible idea compared to EME. Please take this out of the opinion.

If the browser is responsible for making the program running in the
Worker non-inspectable, the "robustness" (i.e. the capacity of the
solution to resist attempts by the end-user to read or write the data
inside the DRM box) of the solution hinges on the browser and the
browser itself falls into the DRM realm. This is worse than EME, which
maintains a separation between the User Agent (trusted by the user)
and the CDM (trusted by studios).

Since you seem to be worried about the browser becoming subject to
anti-circumvention law, such as the DMCA, you shouldn't want the
browser to fall into the DRM realm. That is, you shouldn't want the
robustness of the solution to hinge on the browser.

If you want a Worker-based solution, you should suggest that the Worker
be explicitly an inspectable white box and leave it to the
site-supplied JS program running in the provided environment to be
sufficiently obscure to defy inspection.

-- 
Henri Sivonen
hsivonen@hsivonen.fi
https://hsivonen.fi/
Received on Wednesday, 19 February 2014 13:50:30 UTC