[Bug 17673] Define Initialization Data for implementations that choose to support the ISO Base Media File Format

https://www.w3.org/Bugs/Public/show_bug.cgi?id=17673

--- Comment #14 from David Dorwin <ddorwin@google.com> ---
In no particular order, below are some of the issues with the current proposal
that we need to accept and/or address.

---------

Is there a subset of CENC possibilities that would make more sense and
significantly reduce complexity at all levels while still satisfying most/all
actual use cases?

If we did limit the possibilities (related to initData), applications could
still synthesize initData to do what they need to do.

---------

It appears that trying to support any scheme type (not just CENC) has led us to
include the entire |sinf| box along with |pssh| boxes. This may be adding
complexity and overhead that are otherwise unnecessary.

I don't understand the details, but it sounds like using |sinf| might also be
incompatible with SampleGroups. This means that supporting other scheme types
limits our ability to support features of specific scheme types.

Can we just say: treat "cenc" files this way, treat "foob" files that way,
etc., and not rely on an overlap/common root? Since the container indicates the
scheme type, the UA should be able to provide the correct behavior for a given
stream. I guess the problem would be with createSession() where the scheme type
is not specified in |initData| or |type|. What other ISO BMFF scheme types
exist, and is it likely that a UA or key system would support more than one?
(Can multiple scheme types be supported in the same file? I'm not sure that
would be compatible with any of the current proposals.)
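
To make the per-scheme idea concrete, here is a rough sketch of dispatching on
the scheme type read from a |schm| box. It assumes the usual |schm| layout
(32-bit size, type, version/flags, then the 4-byte scheme type). The function
names and the handler table are hypothetical, not anything from the proposal.

```python
def scheme_type(schm_box):
    """Return the 4-byte scheme_type from a complete schm box, or None."""
    # Layout assumed: size(4) 'schm'(4) version/flags(4) scheme_type(4) version(4)
    if len(schm_box) < 20 or schm_box[4:8] != b"schm":
        return None
    return bytes(schm_box[12:16])


def derive_init_data(schm_box, sinf_box, handlers):
    """Pick a per-scheme handler; return None when the scheme is unsupported.

    `handlers` maps a scheme type such as b"cenc" to a callable that takes the
    raw sinf bytes and returns initData (hypothetical interface).
    """
    handler = handlers.get(scheme_type(schm_box))
    return handler(sinf_box) if handler else None
```

This does nothing for createSession(), where no container is available to
supply the scheme type.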

If we returned to just reporting |pssh| boxes, we could always fire a needkey
event with the |pssh| and rely on the de-duplication of sessions currently
under discussion to reduce network traffic. Depending on the stream layout,
that might still result in a lot of events and sessions. (Maybe this is
expected to be unlikely in practice.)

Alternatively, we could focus on the |sinf| box(es) and not send |pssh| boxes.
This would be much closer to the behavior for WebM, which is to only send
generic key IDs in needkey events. If services were able to only use the
information in the |sinf| box, content providers could support any key system
without needing to add a new PSSH to all existing files. This could also be
done in the |sinf| + |pssh| case, but including |pssh| boxes adds a lot of
(potentially) unnecessary complexity. This solution is still incompatible with
SampleGroups, though.

Going even further, we could pare down needkey/initData to just emit key IDs
(the first time they are encountered). This would be super simple, consistent
with WebM, and support any key system regardless of whether the stream has a
PSSH.
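
As a sketch of what "emit key IDs the first time they are encountered" could
look like, assuming the demuxer hands each sample's key ID to some per-stream
object (the class and method names are made up for illustration):

```python
class KeyIdNotifier:
    """Tracks key IDs seen in a stream and reports each one only once."""

    def __init__(self):
        self._seen = set()

    def on_key_id(self, key_id):
        """Return key_id as initData on first sight, or None if already seen."""
        if key_id in self._seen:
            return None
        self._seen.add(key_id)
        return key_id
```

A non-None return value would correspond to firing one needkey event whose
initData is just that key ID.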

---------

The |tenc| box is sent (as part of the |sinf| box) and described, but most of
that information is irrelevant to the license server.

The only part of the |tenc| that might be useful is the default |KID|, which
could be used for protection systems (aka key systems) that are not explicitly
supported in the media stream. Perhaps most importantly, this includes Clear
Key. (If we don't come up with a generic solution, we'll need to define a
system ID and PSSH format for Clear Key and (in some scenarios) it would need
to be added to media streams.)

Even so, this is only one of the key IDs that might be needed for the stream.
Thus, a server would need to know all the KIDs associated with this default
KID. This is already necessary for PSSH formats that only include one key ID,
which I think might be common.
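
For reference, pulling the default |KID| out of a |sinf| box is cheap. The
sketch below assumes the |tenc| box sits at sinf/schi/tenc, that all boxes use
32-bit sizes, and that the |tenc| payload is version/flags (4 bytes), 3 bytes
of flags/reserved, a 1-byte IV size, then the 16-byte KID. The helper names are
hypothetical.

```python
import struct


def child_boxes(buf, start, end):
    """Yield (type, payload_start, box_end) for boxes in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break  # malformed or 64-bit size; not handled in this sketch
        yield box_type, pos + 8, pos + size
        pos += size


def find_child(buf, start, end, wanted):
    """Return (payload_start, box_end) of the first child of type `wanted`."""
    for box_type, payload, box_end in child_boxes(buf, start, end):
        if box_type == wanted:
            return payload, box_end
    return None


def default_kid_from_sinf(sinf):
    """Return the 16-byte default KID from sinf/schi/tenc, or None."""
    schi = find_child(sinf, 8, len(sinf), b"schi")  # skip sinf's own header
    if schi is None:
        return None
    tenc = find_child(sinf, schi[0], schi[1], b"tenc")
    if tenc is None or tenc[1] - tenc[0] < 24:
        return None
    # Skip version/flags (4) + flags/reserved (3) + IV size (1).
    return bytes(sinf[tenc[0] + 8:tenc[0] + 24])
```

This only yields the one default KID, so the server-side mapping to the other
KIDs described above is still needed.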

---------

The current proposal says, "for each 'moov' encountered, it _shall_ include, in
its entirety, every 'sinf' box that is an (indirect) descendant of a 'moov',
followed by every 'pssh' box that is a direct child of that 'moov'." This
sounds like a lot of parsing in the UA (or communication of parsing that is
already occurring to the EME implementation) and potentially a lot of |pssh|
boxes. (Can people with experience with a diverse set of CENC files comment on
the likelihood of this?)
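
To give a feel for the parsing involved, here is a rough sketch of the quoted
rule. It assumes |sinf| boxes are only reached through
moov/trak/mdia/minf/stbl/stsd and an |encv| or |enca| sample entry, and it
hard-codes the fixed-field sizes of those entries. The function names and the
CHILD_OFFSET table are mine, not part of the proposal.

```python
import struct

# Bytes to skip after the box header before child boxes begin. Only these
# container types are followed (an assumption); anything else is a leaf.
CHILD_OFFSET = {
    b"moov": 0, b"trak": 0, b"mdia": 0, b"minf": 0, b"stbl": 0,
    b"stsd": 8,   # version/flags (4) + entry_count (4)
    b"encv": 78,  # fixed VisualSampleEntry fields
    b"enca": 28,  # fixed AudioSampleEntry fields
}


def iter_boxes(buf, start, end):
    """Yield (type, box_start, payload_start, box_end) within buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # 64-bit largesize follows the type
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:  # box runs to the end of the enclosing range
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated; stop rather than guess
        yield box_type, pos, pos + header, pos + size
        pos += size


def collect_sinf(buf, start, end, out):
    """Append every sinf box found under the known containers to `out`."""
    for box_type, box_start, payload, box_end in iter_boxes(buf, start, end):
        if box_type == b"sinf":
            out.append(bytes(buf[box_start:box_end]))
        elif box_type in CHILD_OFFSET:
            collect_sinf(buf, payload + CHILD_OFFSET[box_type], box_end, out)


def init_data_for_each_moov(buf):
    """Return one blob per moov: all descendant sinf boxes, then child pssh."""
    results = []
    for box_type, _, payload, box_end in iter_boxes(buf, 0, len(buf)):
        if box_type != b"moov":
            continue
        sinfs = []
        collect_sinf(buf, payload, box_end, sinfs)
        psshs = [bytes(buf[s:e])
                 for t, s, _, e in iter_boxes(buf, payload, box_end)
                 if t == b"pssh"]
        results.append(b"".join(sinfs + psshs))
    return results
```

Even this simplified version has to descend seven levels and know the sample
entry layouts, which is the parsing burden in question.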

Also, the |pssh| boxes can appear separate from the |sinf| box(es) throughout
the file (in |moof| boxes?). Does this mean we might have to search the entire
file? Is this really what we want?

Note that the current proposal excludes by omission |pssh| boxes in |moof|
boxes. Such |pssh| boxes are only useful with SampleGroups, which, as mentioned
above, aren't supported anyway. If we simplify to either only emitting
key IDs or only emitting all |pssh| boxes throughout the file, then |moof|
becomes relevant again.
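
If |moof| does become relevant, finding every |pssh| means walking all
top-level boxes. A minimal sketch, assuming |pssh| boxes appear only as direct
children of |moov| or |moof|, that boxes use 32-bit sizes, and that the |pssh|
payload starts with version/flags followed by the 16-byte SystemID (function
names are hypothetical):

```python
import struct


def iter_boxes(buf, start, end):
    """Yield (type, payload_start, box_end) for boxes in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break  # malformed or 64-bit size; not handled in this sketch
        yield box_type, pos + 8, pos + size
        pos += size


def find_pssh(buf):
    """Return (container, system_id_hex) for each pssh in a moov or moof."""
    found = []
    for box_type, payload, box_end in iter_boxes(buf, 0, len(buf)):
        if box_type not in (b"moov", b"moof"):
            continue
        for child, child_payload, child_end in iter_boxes(buf, payload, box_end):
            # Need version/flags (4) plus SystemID (16) to be present.
            if child == b"pssh" and child_end - child_payload >= 20:
                system_id = bytes(buf[child_payload + 4:child_payload + 20])
                found.append((box_type.decode("ascii"), system_id.hex()))
    return found
```

For a fragmented file this touches every |moof|, so the number of results grows
with the number of fragments that carry a |pssh|.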

---------

Specific comments on the text in the current proposal (not current spec draft):
 * The "when no CDM is selected" text is not necessary (in the object-oriented
version of the API). CDM selection is irrelevant to needkey events.
 * The "block initData" text can be removed since that no longer exists in the
algorithms.
 * Steve's text split the definition of the format from the definition of how
it is derived, but (as discussed with him) that no longer seems necessary, and
removing this separation would make things simpler.
 * I wonder if there is anything generic we can say that would cover
Initialization Data for all BMFF protection schemes. Regardless, I think we
should put the CENC portions in section(s) that are explicitly CENC-only or
otherwise clearly identify CENC-specific text.

Received on Thursday, 25 July 2013 18:47:08 UTC