Re: Individualization from Henri Sivonen on 2014-10-24 (public-html-media@w3.org from October 2014)

From: Henri Sivonen <hsivonen@hsivonen.fi>
Date: Fri, 24 Oct 2014 11:58:59 +0300
To: David Dorwin <ddorwin@google.com>
Cc: Joe Steele <steele@adobe.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <CANXqsRKSiHGgQXqGzk2k9hffvdarJggE5u3cVwJyDuqOUMKMOA@mail.gmail.com>
On Wed, Oct 22, 2014 at 5:13 AM, David Dorwin <ddorwin@google.com> wrote:
> [Subject changed from "Clarifying key types and persistence"]
>
> On Mon, Oct 6, 2014 at 2:28 AM, Henri Sivonen <hsivonen@hsivonen.fi> wrote:
>>
>> On Wed, Apr 2, 2014 at 11:01 PM, David Dorwin <ddorwin@google.com> wrote:
>> > Thanks for summarizing the types. Comments inline.
>> >
>> > On Tue, Apr 1, 2014 at 2:34 PM, Joe Steele <steele@adobe.com> wrote:
>> ...
>> >>   App and device keys:  Keys that are bound to a particular device
>> >> and/or
>> >> application.
>> >
>> >
>> > Are you referring to individualization or provisioned keys? This hasn't
>> > been
>> > discussed, but I think such keys should be the responsibility of the
>> > user
>> > agent and outside the EME APIs (assuming it is per device and there are
>> > still appropriate safeguards and notifications).
>>
>> I think it's a valid design decision for a CDM to treat
>> individualization to happen out of the scope of EME. However, I think
>> it's wrong for the EME spec to require individualization to happen out
>> of the scope of EME or even tacitly assume that that's always the
>> case. I think the spec should also cater (in the sense of catering to
>> the persistence implications) to the case where individualization
>> happens via the EME API.
>
>
> I assume you are referring to per-origin individualization, which Joe has
> previously mentioned.

As far as spec changes are concerned, I am referring to download-based
individualization in general. However, the constraints Firefox places
on the CDM are supposed to make origin-independent download-based
individualization impossible, so my immediate interest is with the
origin-dependent case.

>> Here are two completely reasonable examples of possibilities of
>> individualizing via EME:
>>
>> The common part:
>> The Key System has a "individualization needed" message. When the CDM
>> needs to be individualized, it emits this message and expects to
>> receive an individualization blackbox (IBX) as a response EME message.
>>
>> Two options for the next step:
>
>
> At least Option 1 must be supported because applications should not be
> required to handle message types.
>>
>>
>> Option 1: The above-mentioned message gets relayed to the Key Server
>> like any other EME message without the JS app treating it differently
>> from other EME messages. The Key Server recognizes the requests as an
>> IBX request and proxies the request to an individualization server and
>> forwards the response IBX to the JS program, which pushes the IBX to
>> the CDM as an EME message.
>
>
> Do you really mean IBX? The Microsoft documentation [1] says that IBX is
> software.

I may have misunderstood the precise meaning of "IBX" and may have
used the term too broadly. I meant whatever blob is given to the CDM
in response to an individualization request. I didn't intend to imply
anything about the contents of that blob for any particular DRM
system.

> It seems like a really bad idea to accept executable code to
> become part of the user agent from a web application.

As long as the response is signed by the CDM vendor, making the bits
travel through the application is as safe, in terms of the risk of the
application tampering with the bits, as downloading directly from the
CDM vendor would be. If the response is also encrypted, there isn't
even a risk of the application learning anything interesting from the
bits. The application does learn that the CDM hasn't been
individualized yet, of course, but even in the case of out-of-band
individualization, if the individualization is lazy/just-in-time, the
application can deduce the lack of pre-existing individualization from
the latency. (Unscientifically, it seems to me that if you take IE11
on a fresh install of Windows 8.1 to the Netflix IE Test Drive, the
delay from clicking "play" to the playback starting is substantially
longer than in the case where that install of IE11 on Windows 8.1 has
been used to play PlayReady content previously.) That is to say, if
you are concerned about the application learning whether the CDM has
been individualized already, you have to do origin-independent eager
ahead-of-time individualization.

> Does your case involve executable code or something simpler like an ID or
> certificate?

I'll let Joe answer this.

> [1] http://msdn.microsoft.com/en-us/library/cc838192(VS.95).aspx
>
>> Option 2: The above-mentioned message is routed to an
>> individualization server by the JS app either by looking inside the
>> message ArrayBuffer (worse fit for the design of EME), or if
>>
>> https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#dom-mediakeymessagetype
>> is extended to flag the message type as "individualizationrequest", by
>> the event type (better fit for the design of EME). The JS app pushes
>> the response IBX to the CDM via EME. The individualization server, of
>> course, needs to permit the request using CORS and needs to support
>> https to avoid mixed-content blocking if the EME-using site is using
>> https.
>
>
> The per-origin identifiers that presumably come with per-origin
> individualization are good for privacy. However, it's unclear whether
> deferring such individualization to a centralized server maintains those
> qualities.

This depends on what data the individualization request contains.
Firefox provides the CDM with some bits that are unique to the
computer, the origin using EME, the origin in the URL and a
randomly-generated salt. What the CDM does is up to the CDM, but even
if the CDM sent these bits verbatim to a centralized server, the
centralized server wouldn't learn anything from these bits.

AFAICT, a centralized individualization server can always tally
information that's baked into the CDM. Potentially, if the CDM builds
for different platforms have different in-baked information, the
centralized server could count individualizations by platform, but the
platform is already information that browsers expose left and right in
the UA string (albeit not in a form that's hard for the user to
forge).

If the server of the application proxies the individualization
requests to the central server, the central server may not even get to
learn the IP address of the client running the CDM. (Though, of course, the
browser has no proof that the application's server won't pass this
information along.)

If the JS program of the application XHRs the individualization
request directly to the central server (and the central server
authorizes this via CORS), the centralized server learns that the IP
address of the client wanted to individualize for the origin of the
application. Also, if the central server requests credentials and the
user has for other reasons browsed the site of the CDM vendor to
obtain a cookie, the central server can match individualizations with
that cookie. That may not be cool, but at least the credentials can't
be requested covertly from people who care to inspect what CORS
headers are sent, so a CDM vendor doing this would get caught and
get blogged about.
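
To make the two routing paths concrete, here is a minimal TypeScript
sketch of how a JS app might decide where an EME message goes. It is an
illustration under stated assumptions, not something the spec or any
CDM prescribes: the "individualizationrequest" value is the proposed
message type from the Option 2 text quoted above, both URLs are
made-up placeholders, the app is assumed to be served from
app.example.com, and RequestPlan is a hypothetical type that merely
mirrors the `credentials` option of fetch().

```typescript
// Proposed (not standardized) message type from the Option 2 discussion.
const INDIVIDUALIZATION_REQUEST = "individualizationrequest";

// Hypothetical endpoints, for illustration only.
// The app itself is assumed to be served from https://app.example.com.
const KEY_SERVER_URL = "https://app.example.com/license";
const INDIVIDUALIZATION_URL = "https://cdm-vendor.example.net/individualize";

// Hypothetical description of the request the app is about to make.
interface RequestPlan {
  url: string;
  // Mirrors the fetch() `credentials` option. "omit" means no cookies
  // are attached, so the CDM vendor cannot match the request to a
  // cookie the user picked up while browsing the vendor's site.
  credentials: "omit" | "same-origin";
}

// Option 1: the app treats every message alike. The Key Server is the
// one that recognizes individualization requests and proxies them.
function planOption1(_messageType: string): RequestPlan {
  return { url: KEY_SERVER_URL, credentials: "same-origin" };
}

// Option 2: the app routes on the message type and talks to the
// individualization server directly (which must allow this via CORS).
function planOption2(messageType: string): RequestPlan {
  if (messageType === INDIVIDUALIZATION_REQUEST) {
    return { url: INDIVIDUALIZATION_URL, credentials: "omit" };
  }
  return { url: KEY_SERVER_URL, credentials: "same-origin" };
}
```

With planOption1 the central server only ever sees the Key Server as
its peer. With planOption2 it sees the client's IP address and the
requesting origin, but choosing "omit" on the app side keeps cookies
out of the exchange.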

> (I am assuming the reason for the proxying is that the
> "individualization server" is not run by the application provider.)

The assumption I have is that the individualization server would be
run by the CDM vendor.

> Is use of a centralized server necessary? Maybe there is another way to
> achieve per-origin IDs without contacting a central server for each origin.

Logically, that's possible in the abstract. I even said how in
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332#c79 . However, I
think EME should allow for DRM designs that rely on download-based
individualization, since requiring a redesign of systems that use
download-based individualization would be quite a barrier to entry. In
particular, the family of solutions I described in
https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332#c79 assumes that
the bits available for the process are not only unique enough to be
suitable for node locking but also have proper entropy to be suitable
for serving as a seed for key generation. With download-based
individualization, the unique bits extracted locally only need to be
good enough for node locking while entropy suitable for key generation
can be left up to the individualization server.

-- 
Henri Sivonen
hsivonen@hsivonen.fi
https://hsivonen.fi/
Received on Friday, 24 October 2014 08:59:27 UTC