Re: [EME] reuse of session from Mark Watson on 2014-06-17 (public-html-media@w3.org from June 2014)

From: Mark Watson <watsonm@netflix.com>
Date: Mon, 16 Jun 2014 17:23:01 -0700
To: "Maruyama, Shinya" <Shinya.Maruyama@jp.sony.com>
Cc: David Dorwin <ddorwin@google.com>, "public-html-media@w3.org" <public-html-media@w3.org>
Message-ID: <CAEnTvdA9YpLz89LkNiQ1ESCUcEiDrdiP29nQ57PkrDU-wPBSYQ@mail.gmail.com>
On Mon, Jun 16, 2014 at 5:17 PM, Maruyama, Shinya <
Shinya.Maruyama@jp.sony.com> wrote:

>    *From:* Mark Watson [mailto:watsonm@netflix.com]
> *Sent:* Tuesday, June 17, 2014 8:59 AM
>
> *To:* Maruyama, Shinya
> *Cc:* David Dorwin; public-html-media@w3.org
> *Subject:* Re: [EME] reuse of session
>
> On Mon, Jun 16, 2014 at 4:46 PM, Maruyama, Shinya <
> Shinya.Maruyama@jp.sony.com> wrote:
>
> I think you are right if EME specifies this best practice normatively;
> i.e. specifies the requirements for manifests, media segments and DRM so
> that an application conforming to the requirements can safely ignore the
> needkey (“ignore needkey” looks like an ad-hoc solution, though).
>
>
>
> I think, however, that would be overkill for the spec. If we just specify it
> informatively, a generic application should still work just as specified by
> EME, that is, the application should call createSession whenever it receives
> the needkey event. This is the nature of the current model. (Although in
> practice an application may know enough to ignore the events safely, that
> just relies on the developer’s optimization.)
>
>
>
> I don't understand why we would need to specify that. The application is
> provided by a service provider who is also the one providing the manifest.
> They know when they write the application that it will not rely on needkey.
>
>
>
> Personally, I’m fine with it. My concern is just how to
> specify it. At least, the best practice should not be ambiguous, so that
> people who are not engaged in this discussion can also be aware of it.
>
> I am also not insisting on adding KID-granular comparison. I’m happy with
> whatever solution is specified clearly (if there is a good reason to
> choose it).
>
>
>
>
>
> Perhaps someone writing a general-purpose library that will be used in a
> number of different contexts needs to think about what modes they will
> support or how they will discover what model a given service provider uses.
> That's a question for the library developer to decide on the modes they
> support and give the library user the choice. I wouldn't have thought
> indications from EME would be reliable enough to drive that, unless we have
> an enumeration now of the various usage models.
>
>
>
> I thought one of our goals was to have an interoperable application that
> covers a wide range of media presentations and their usage.
>
> (I do not think it’s easy thoug…)
>

No, I don't recall that being a goal. The web platform provides site
developers with tools that they can use to develop sites / applications.
Sometimes those are low-level tools and so people create higher-level
libraries, but there is no more a need to have a single "universal media
player" application than there is a need to have a "universal web site"
application.


>
>
>
>
> This does seem to me an example where we need to focus back on concrete
> usage models. Where we have a detailed model in front of us, we can decide
> whether that model is to be supported in this first version and if so
> exactly how, but without more requirements-level specification of the model
> it's very hard to tell what support is needed.
>
>
>
> Agreed. Better to discuss after all the models are lined up in front of us.
>
>
>
> By the way, I have another question. Is ‘ignore needkey’ necessary for
> WebM?
>
> If “no”, it might be better to seek consistent and container-independent
> behavior.
>
>
>
> Thanks,
>
> Shinya
>
>
>
>
>
> ...Mark
>
> *From:* Mark Watson [mailto:watsonm@netflix.com]
> *Sent:* Tuesday, June 17, 2014 8:08 AM
> *To:* Maruyama, Shinya
>
>
> *Cc:* David Dorwin; public-html-media@w3.org
> *Subject:* Re: [EME] reuse of session
>
>
>
> In this case, isn't the application aware that it is getting the necessary
> initData from the MPD and so can safely ignore the needkey events ?
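
One way an application might do this is to remember which KIDs it has already requested from the MPD and consult that record when a needkey event arrives. The sketch below is only an illustration: it assumes the application can obtain the KIDs carried by an event, and `KeyRequestTracker`, `noteRequested` and `missing` are hypothetical names, not part of EME.

```typescript
// Hypothetical application-side bookkeeping; none of these names come from EME.
class KeyRequestTracker {
  private requested: string[] = [];

  // Record the KIDs sent in a proactive createSession call (e.g. taken from the MPD).
  noteRequested(kids: string[]): void {
    for (const kid of kids) {
      if (this.requested.indexOf(kid) === -1) {
        this.requested.push(kid);
      }
    }
  }

  // Return the KIDs from a needkey event that no earlier request covered.
  // An empty result means the event can be ignored.
  missing(kids: string[]): string[] {
    return kids.filter((kid) => this.requested.indexOf(kid) === -1);
  }
}

const tracker = new KeyRequestTracker();
tracker.noteRequested(["KIDv1", "KIDa1"]); // session created from MPD1

console.log(tracker.missing(["KIDv1"])); // nothing missing: ignore this needkey
console.log(tracker.missing(["KIDv2"])); // KIDv2 is new: a session is needed
```

The tracker only records what was requested, not whether a license actually arrived, so a real application would also have to handle a failed request.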
>
>
>
> ...Mark
>
>
>
> On Mon, Jun 16, 2014 at 4:02 PM, Maruyama, Shinya <
> Shinya.Maruyama@jp.sony.com> wrote:
>
> I should have added the needkey event steps.
>
> Those steps are triggered because the application receives the needkey.
>
> For example, steps 1 to 8 are the case of VOD content playback and steps 1 to
> 16 are the case of live streaming requiring key rotation.
>
>
>
> Currently steps 5, 8, 12, 13 and 16 cause extra sessions to be created and
> then result in acquiring duplicate licenses because raw initData comparison
> cannot detect that a subset of the KIDs is being supplied.
>
>
>
> 1) Fetch MPD1
>
> 2) createSession(KIDv1, KIDa1 in MPD1) -> Resolved with session1, and KIDv1
> and KIDa1 are stored in the UA  // this session is created proactively without
> receiving needkey
>
> 3) Fetch video1(KIDv1) segment
>
> 4) needkey(KIDv1 in moof) is fired // In the case of MSE, typically KIDv1
> is delivered by initialization segment containing pssh in moov
>
> 5) createSession(KIDv1) -> Resolved with null // because KIDv1 is already
> included in active session list
>
> 6) Fetch audio1 segment
>
> 7) needkey(KIDa1 in moof) is fired // In the case of MSE, typically KIDa1
> is delivered by initialization segment containing pssh in moov
>
> 8) createSession(KIDa1 in moof) -> Resolved with null
>
> -------------- if key rotation happens --------------
>
> 9) Fetch MPD2
>
> 10) createSession(KIDv2, KIDa2 in MPD2) -> Resolved with session2 // this
> session is created proactively without receiving needkey
>
> 11) Fetch video2 segment
>
> 12) needkey(KIDv2 in moof) is fired // In the case of MSE, typically KIDv2
> is delivered by initialization segment containing pssh in moov
>
> 13) createSession(KIDv2 in moof) -> Resolved with null
>
> 14) Fetch audio2 segment
>
> 15) needkey(KIDa2 in moof) is fired // In the case of MSE, typically KIDa2
> is delivered by initialization segment containing pssh in moov
>
> 16) createSession(KIDa2 in moof) -> Resolved with null
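
A minimal sketch of the two comparison strategies applied to the createSession calls in the steps above. It is a simulation only and calls no real EME API: each initData is modeled as a hypothetical `{ bytes, kids }` pair, where `bytes` stands in for the raw blob and `kids` for the key IDs it names.

```typescript
// Hypothetical model of an initData blob: its raw bytes plus the KIDs it names.
type InitData = { bytes: string; kids: string[] };

// Replay a series of createSession requests and count how many sessions result.
// "raw": a request is a duplicate only if an identical blob was seen before.
// "kid": a request is a duplicate if every KID it names is already covered.
function countSessions(requests: InitData[], mode: "raw" | "kid"): number {
  const active: InitData[] = [];
  for (const req of requests) {
    let duplicate: boolean;
    if (mode === "raw") {
      duplicate = active.some((s) => s.bytes === req.bytes);
    } else {
      const known: string[] = [];
      for (const s of active) {
        for (const kid of s.kids) known.push(kid);
      }
      duplicate = req.kids.every((kid) => known.indexOf(kid) !== -1);
    }
    if (!duplicate) active.push(req); // otherwise "resolved with null"
  }
  return active.length;
}

const requests: InitData[] = [
  { bytes: "pssh(MPD1)", kids: ["KIDv1", "KIDa1"] }, // step 2
  { bytes: "pssh(video1 moof)", kids: ["KIDv1"] },   // step 5
  { bytes: "pssh(audio1 moof)", kids: ["KIDa1"] },   // step 8
  { bytes: "pssh(MPD2)", kids: ["KIDv2", "KIDa2"] }, // step 10
  { bytes: "pssh(video2 moof)", kids: ["KIDv2"] },   // step 13
  { bytes: "pssh(audio2 moof)", kids: ["KIDa2"] },   // step 16
];

console.log(countSessions(requests, "raw")); // 6: every blob differs
console.log(countSessions(requests, "kid")); // 2: only the two MPD requests
```

As in the discussion, the KID-based variant assumes that a license request for a set of KIDs will in fact deliver all of them; it does not check the outcome of the earlier session.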
>
>
>
>
>
> *From:* Mark Watson [mailto:watsonm@netflix.com]
> *Sent:* Monday, June 16, 2014 11:57 PM
> *To:* Maruyama, Shinya
> *Cc:* David Dorwin; public-html-media@w3.org
>
>
> *Subject:* Re: [EME] reuse of session
>
>
>
> I'm afraid I have not been following this whole thread, but why do you
> have steps 4, 6, 10 and 12 at all below ?
>
>
>
> ...Mark
>
> Sent from my iPhone
>
>
> On Jun 15, 2014, at 7:33 PM, "Maruyama, Shinya" <
> Shinya.Maruyama@jp.sony.com> wrote:
>
>  Please see replies inline.
>
>
>
> *From:* David Dorwin [mailto:ddorwin@google.com]
> *Sent:* Saturday, June 14, 2014 3:31 AM
> *To:* Maruyama, Shinya
> *Cc:* public-html-media@w3.org
> *Subject:* Re: [EME] reuse of session
>
>
>
>
>
>
>
> On Thu, Jun 12, 2014 at 8:34 PM, Maruyama, Shinya <
> Shinya.Maruyama@jp.sony.com> wrote:
>
>     Does the KID array really help? The user agent would still need to
> (asynchronously) ask the CDM if it has each key ID. Even if it did, this
> only addresses a subset of content in one particular format.
>
>
>
>
>
> I’m not sure why you think the user agent needs to ask the CDM.
>
> The current model does not track which set of KIDs the “active session
> Initialization Data“ delivers to the CDM, nor does it ensure that the preceding
> session has completed successfully. It just relies on the same initData
> resulting in the same license and does not care about the result of a pending
> license (createSession is resolved with null even though the preceding session
> may fail to acquire the license).
>
> KID-granular comparison is basically the same. The initData should result
> in a license delivering all the KIDs contained in the initData (though it may
> deliver extra KIDs). It just ensures that the KIDs listed in the UA
> will be or have been made available to the CDM. This is the same assumption
> that the current model relies on. If a preceding license delivers extra
> KIDs, an unnecessary session may be created. However, that is no worse than
> the current model, either.
>
>  I don't see how it is *better* and thus why we should add special
> behavior for one format (CENC) or a dependency on CENC second edition.
> (Actually, WebM would work fine because the initData *is* a KID.)
>
>
>
> If, for example, audio and video streams are encrypted with different
> keys, they will have different PSSH boxes with different KID values. *If* the
> first session results in a license for *both* keys, the application and
> user agent will not know this. Only the CDM knows that it already has keys
> for both KIDs. Thus, the user agent can't do anything with knowledge of the
> KIDs in the initData. It would still create multiple sessions because the
> KIDs are different just as the entire initData is different.
>
>
>
> The case above is a bad practice in which we cannot de-dup licenses.
>
>
>
> What specifically is a bad practice? That all seems pretty standard.
>
>
>
> I just compared it to the best practice below.
>
> The one case where this might not be true is if there was fake initData
> (i.e. from a manifest) that contained both KIDs. Maybe this is what you are
> referring to below. However, this can probably be addressed by using a real
> PSSH box from one of the streams. If the license server is capable of
> returning a license for all KIDs based on a PSSH box containing just one of
> them, there is no reason to include all the KIDs in the manifest (if you
> are concerned about duplicate sessions in the key rotation case).
>
>
>
> The best practice I mentioned below is the case where KID comparison in
> user agents helps a lot.
>
> Actually, DASH-IF and common encryption 2nd edition are addressing the
> delivery of all the KIDs in the manifest.
>
>
>
>
>
> The biggest advantage of introducing KID-granular comparison is that it
> helps to realize a best practice. For example, if the manifest file delivers
> a pssh containing all KIDs for the presentation, the application can
> first call createSession with that pssh. Then, subsequent media segments
> do not cause unnecessary sessions even though a pssh is contained in the moov
> or moof box.
>
> Raw initData comparison cannot make it because pssh is different among
> manifest, moov and moof.
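
KID-granular comparison presupposes that the KIDs can be read out of the pssh itself. A rough sketch, assuming the input is a single version 1 `pssh` box as laid out in the second edition of Common Encryption (version 0 boxes carry no KID list and yield nothing here). `psshKids` is a hypothetical helper, and the box size field is not validated.

```typescript
// Extract the KIDs (as hex strings) from one version-1 'pssh' box.
// Layout: size(4) type(4) version(1) flags(3) SystemID(16) KID_count(4) KID[16]...
function psshKids(box: Uint8Array): string[] {
  if (box.length < 32) return [];
  const type = String.fromCharCode(box[4], box[5], box[6], box[7]);
  const version = box[8];
  if (type !== "pssh" || version === 0) return [];

  // Big-endian 32-bit KID_count at offset 28.
  const count = box[28] * 0x1000000 + (box[29] << 16) + (box[30] << 8) + box[31];

  const kids: string[] = [];
  for (let i = 0; i < count; i++) {
    const start = 32 + 16 * i;
    if (start + 16 > box.length) break; // truncated box: stop early
    let hex = "";
    for (let j = 0; j < 16; j++) {
      const b = box[start + j];
      hex += (b < 16 ? "0" : "") + b.toString(16);
    }
    kids.push(hex);
  }
  return kids;
}

// A synthetic box holding one KID whose first byte is 0xab (SystemID left as zeros).
const demo = new Uint8Array(52);
demo.set([0x70, 0x73, 0x73, 0x68], 4); // 'pssh'
demo[8] = 1;     // version 1
demo[31] = 1;    // KID_count = 1
demo[32] = 0xab; // first byte of the KID
console.log(psshKids(demo));
```

With KIDs extracted this way from the manifest, moov and moof boxes, the three differing blobs can be compared by the keys they name rather than byte for byte.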
>
>  That's interesting - why are they different?
>
>
>
> Is it because, as you said above, the PSSH box is from the manifest?
> What does the PSSH box in the moov contain? Why does the moov need a PSSH
> box?
>
>
>
> Yes, the first pssh extracted from the manifest contains all the KIDs for
> the presentation because the manifest covers all of the
> streams.
>
> Subsequent pssh boxes may come from either moov or moof to support random
> access or trick play. Typically, as a media segment contains a single track,
> those pssh boxes are not the same unless a particular constraint is
> specified/operated, as in HbbTV, which restricts the Initialization Segment to
> being common among all representations.
>
>
>
> Are you still talking about a key rotation scenario? When you say "the
> manifest contains all the KID for the presentation", are you referring to
> all rotation periods or just for all streams in the current period? In the
> former case, you can ignore the needkey events. In the latter, you are
> still going to have problems for subsequent periods (see below).
>
>
>
> It’s not limited to the key rotation scenario. This sort of best practice
> would be generally useful for single-track media segments encrypted with
> different keys.
>
> As to "the manifest contains all the KID for the presentation", I was
> referring to the case where a manifest contains KIDs for the audio and video
> streams (irrespective of VoD or live streaming).
>
>
>
>
>
> The PSSH box in the moof only contains key ID(s) for that specific track
> (e.g. video), right? If so, you'll have the same problem of different KIDs
> in the needkey events in a future rotation period.
>
>
>
> In the case of DASH live streaming, the MPD update mechanism can be used to
> create a new session using the updated MPD before key rotation happens.
>
>
>
>
>
> I must be missing something. In addition to answering the questions above,
> it might help to provide an explicit example - what KID(s) are in the
> manifest, each PSSH box, each license, etc.
>
>
>
> 1)      Fetch MPD1
>
> 2)      createSession(KIDv1, KIDa1 in MPD1) -> Resolved with session1
>
> 3)      Fetch video1 segment
>
> 4)      createSession(KIDv1 in moof) -> Resolved with null
>
> 5)      Fetch audio1 segment
>
> 6)      createSession(KIDa1 in moof) -> Resolved with null
>
> 7)      Fetch MPD2
>
> 8)      createSession(KIDv2, KIDa2 in MPD2) -> Resolved with session2
>
> 9)      Fetch video2 segment
>
> 10)  createSession(KIDv2 in moof) -> Resolved with null
>
> 11)  Fetch audio2 segment
>
> 12)  createSession(KIDa2 in moof) -> Resolved with null
>
> …
>
Received on Tuesday, 17 June 2014 00:23:30 UTC