Re: [EME] reuse of session from Mark Watson on 2014-06-16 (public-html-media@w3.org from June 2014)

From: Mark Watson <watsonm@netflix.com>
Date: Mon, 16 Jun 2014 07:57:20 -0700
To: "Maruyama, Shinya" <Shinya.Maruyama@jp.sony.com>
Cc: David Dorwin <ddorwin@google.com>, "public-html-media@w3.org" <public-html-media@w3.org>
Message-ID: <76507795558288038@unknownmsgid>
I'm afraid I have not been following this whole thread, but why do you have
steps 4, 6, 10 and 12 at all below?

...Mark

Sent from my iPhone

On Jun 15, 2014, at 7:33 PM, "Maruyama, Shinya" <Shinya.Maruyama@jp.sony.com>
wrote:

  Please see replies inline.



*From:* David Dorwin [mailto:ddorwin@google.com]
*Sent:* Saturday, June 14, 2014 3:31 AM
*To:* Maruyama, Shinya
*Cc:* public-html-media@w3.org
*Subject:* Re: [EME] reuse of session







On Thu, Jun 12, 2014 at 8:34 PM, Maruyama, Shinya <
Shinya.Maruyama@jp.sony.com> wrote:

    Does the KID array really help? The user agent would still need to
(asynchronously) ask the CDM if it has each key ID. Even if it did, this
only addresses a subset of content in one particular format.





I’m not sure why you think the user agent needs to ask the CDM.

The current model does not track which set of KIDs the “active session
Initialization Data” delivers to the CDM, nor does it ensure that the
preceding session has completed successfully. It just relies on the
assumption that the same initData will result in the same license, and it
does not care about the result of a pending license request (createSession
is resolved with null even though the preceding session may fail to
acquire the license).

KID-granular comparison is basically the same. The initData should result
in a license delivering all the KIDs contained in the initData (though it
may deliver extra KIDs). It just ensures that the KIDs listed in the UA
will be or have been made available to the CDM. This is the same assumption
that the current model relies on. If a preceding license delivers extra
KIDs, an unnecessary session may be created. However, that is no worse than
the current model.
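A minimal sketch of the two comparison strategies discussed here, assuming
KIDs are represented as strings and that the user agent keeps a set of KIDs
it has already requested. The function names are hypothetical and are not
part of the EME API.

```typescript
// Hypothetical helpers for illustration only; not defined by EME.

// Raw comparison: two initData blobs match only if they are byte-identical.
function sameInitData(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

// KID-granular comparison: a request is redundant if every KID it names
// has already been requested by an earlier session.
function coveredByKnownKids(requested: string[], known: Set<string>): boolean {
  return requested.every((kid) => known.has(kid));
}
```

With a known set of KIDv1 and KIDa1, a later request naming only KIDv1 is
treated as covered under the KID-granular check, whereas a raw byte
comparison of the two initData blobs would fail.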

 I don't see how it is *better* and thus why we should add special behavior
for one format (CENC) or a dependency on CENC second edition. (Actually,
WebM would work fine because the initData *is* a KID.)



If, for example, audio and video streams are encrypted with different keys,
they will have different PSSH boxes with different KID values. *If* the
first session results in a license for *both* keys, the application and
user agent will not know this. Only the CDM knows that it already has keys
for both KIDs. Thus, the user agent can't do anything with knowledge of the
KIDs in the initData. It would still create multiple sessions because the
KIDs are different just as the entire initData is different.
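The scenario above can be modeled roughly as follows. This is a sketch
under the assumption that the user agent sees only the KIDs listed in each
initData and never learns which keys a license actually delivered to the
CDM. The function name and the string KIDs are hypothetical.

```typescript
// Hypothetical model: the user agent de-duplicates on KIDs seen in initData.
// It has no view of the keys the CDM actually holds.
function countSessions(initDataKids: string[][]): number {
  const requested = new Set<string>();
  let sessions = 0;
  for (const kids of initDataKids) {
    // Skip the request only if every KID was named by an earlier initData.
    if (kids.every((kid) => requested.has(kid))) continue;
    kids.forEach((kid) => requested.add(kid));
    sessions += 1;
  }
  return sessions;
}

// Video and audio PSSH boxes carry different KIDs, so two sessions are
// created even if the first license already delivered both keys.
const separateTracks = countSessions([["KIDv"], ["KIDa"]]);
```

Here `separateTracks` is 2: knowing the KIDs in the initData does not let
the user agent avoid the second session.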



The case above is a bad practice in which we cannot de-dup licenses.



What specifically is a bad practice? That all seems pretty standard.



I just compared it to the best practice below.









The one case where this might not be true is if there was fake initData
(i.e. from a manifest) that contained both KIDs. Maybe this is what you are
referring to below. However, this can probably be addressed by using a real
PSSH box from one of the streams. If the license server is capable of
returning a license for all KIDs based on a PSSH box containing just one of
them, there is no reason to include all the KIDs in the manifest (if you
are concerned about duplicate sessions in the key rotation case).



The best practice I mentioned below is the case where KID comparison in
user agents helps the most.

Actually, DASH-IF and Common Encryption 2nd edition are addressing the
delivery of all the KIDs in the manifest.





The biggest advantage of introducing KID-granular comparison is that it
helps to realize a best practice. For example, if the manifest file delivers
a pssh containing all KIDs for the presentation, the application can
first call createSession with that pssh. Then, subsequent media segments
do not cause unnecessary sessions even though a pssh is contained in the
moov or moof box.

Raw initData comparison cannot achieve this because the pssh differs among
the manifest, moov and moof.

 That's interesting - why are they different?



Is it because, as you said above, the PSSH box is from the manifest? What
does the PSSH box in the moov contain? Why does the moov need a PSSH box?



Yes, the first pssh extracted from the manifest contains all the KIDs for
the presentation because the manifest covers all of the streams.

Subsequent pssh boxes may come from either moov or moof to support random
access or trick play. Typically, as a media segment contains a single track,
those pssh boxes are not the same unless a particular constraint is
specified or applied, such as HbbTV requiring the Initialization Segment to
be common among all representations.



Are you still talking about a key rotation scenario? When you say "the
manifest contains all the KID for the presentation", are you referring to
all rotation periods or just for all streams in the current period? In the
former case, you can ignore the needkey events. In the latter, you are
still going to have problems for subsequent periods (see below).



It’s not limited to the key rotation scenario. This sort of best practice
would be generally useful for single-track media segments encrypted with
different keys.

As to "the manifest contains all the KID for the presentation", I was
referring to the case where a manifest contains KIDs for the audio and video
streams (irrespective of VoD or live streaming).





The PSSH box in the moof only contains key ID(s) for that specific track
(e.g. video), right? If so, you'll have the same problem of different KIDs
in the needkey events in a future rotation period.



In the case of DASH live streaming, the MPD update mechanism can be used to
create a new session using the updated MPD before key rotation happens.





I must be missing something. In addition to answering the questions above,
it might help to provide an explicit example - what KID(s) are in the
manifest, each PSSH box, each license, etc.



1)      Fetch MPD1

2)      createSession(KIDv1, KIDa1 in MPD1) -> Resolved with session1

3)      Fetch video1 segment

4)      createSession(KIDv1 in moof) -> Resolved with null

5)      Fetch audio1 segment

6)      createSession(KIDa1 in moof) -> Resolved with null

7)      Fetch MPD2

8)      createSession(KIDv2, KIDa2 in MPD2) -> Resolved with session2

9)      Fetch video2 segment

10)  createSession(KIDv2 in moof) -> Resolved with null

11)  Fetch audio2 segment

12)  createSession(KIDa2 in moof) -> Resolved with null

…
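The createSession calls in the sequence above can be traced with a small
model. This is only a sketch: it assumes KID-granular de-duplication in the
user agent, string KIDs, and session labels numbered in creation order. The
class and method names are hypothetical and do not mirror the real EME
createSession signature.

```typescript
// Hypothetical model of KID-granular de-duplication; not the EME API.
class KidTracker {
  private seen = new Set<string>();
  private count = 0;

  // Returns a session label, or null when every KID was already requested.
  createSession(kids: string[]): string | null {
    if (kids.every((kid) => this.seen.has(kid))) return null;
    kids.forEach((kid) => this.seen.add(kid));
    this.count += 1;
    return `session${this.count}`;
  }
}

const tracker = new KidTracker();
const trace = [
  tracker.createSession(["KIDv1", "KIDa1"]), // step 2: KIDs from MPD1
  tracker.createSession(["KIDv1"]),          // step 4: video moof
  tracker.createSession(["KIDa1"]),          // step 6: audio moof
  tracker.createSession(["KIDv2", "KIDa2"]), // step 8: KIDs from MPD2
  tracker.createSession(["KIDv2"]),          // step 10: video moof
  tracker.createSession(["KIDa2"]),          // step 12: audio moof
];
```

In this model only the two MPD-driven calls (steps 2 and 8) produce a
session; the moof-driven calls resolve with null because their KIDs were
already named by the preceding MPD.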
Received on Monday, 16 June 2014 14:57:56 UTC