- From: guy paskar <guypaskar@gmail.com>
- Date: Wed, 20 Feb 2013 15:30:43 +0200
- To: Aaron Colwell <acolwell@google.com>
- Cc: Cyril Concolato <cyril.concolato@telecom-paristech.fr>, "<public-html-media@w3.org>" <public-html-media@w3.org>
- Message-ID: <CALCEMzEGycw-yNteLXbzkEpvh2c4pazSPcAerMsUSR=TLVJsfg@mail.gmail.com>
I think this is a very interesting topic, and here are a couple of comments from my point of view:

1) I understand that fmp4 is needed for MSE as it is today. The main problem I see with it is that current browsers can't play fmp4 natively (they have to download the whole file, as opposed to a regular mp4), so anyone who uses MSE for streaming video still has to keep a regular mp4 around for browsers that do not support MSE (and can't pseudo-stream fmp4). Is there a way to overcome this? Will this be addressed? I think this is a serious issue.

2) In relation to point 1: is it possible to make a regular mp4 fragmented on the fly with some kind of parser, i.e. to still keep a regular mp4 on the server and, when needed, convert it (piece by piece) to fmp4 on the client? Because fragmenting a regular mp4 is an "easy" task, I thought it might be possible. I know the YouTube guys intended to do something similar in their demo. Any comments on that?

Guy

On Fri, Feb 15, 2013 at 7:59 PM, Aaron Colwell <acolwell@google.com> wrote:

> Comments inline...
>
> On Fri, Feb 15, 2013 at 6:20 AM, Cyril Concolato <cyril.concolato@telecom-paristech.fr> wrote:
>
>> Hi Aaron,
>>
>> On 14/02/2013 23:35, Aaron Colwell wrote:
>>
>> Hi Giuseppe,
>>
>> There are no current plans to support non-fragmented MP4 files. One thing to remember is that MSE accepts byte streams and not files per se. For MP4 we use the fragmented format because it allows segments of the timeline to be appended easily and in any order.
>>
>> Appending segments in any order may not be so easy. The MP4 spec says in the Movie Fragment Header Box definition:
>> "The movie fragment header contains a sequence number, as a safety check. The sequence number usually starts at 1 and must increase for each movie fragment in the file, in the order in which they occur. This allows readers to verify integrity of the sequence; it is an error to construct a file where the fragments are out of sequence."
>>
>> So if you implement MSE on top of a conformant MP4 reader, feeding segment data as if they were consecutive in a 'virtual' file, this won't work. Segments with a sequence number smaller than that of the first segment provided may be rejected. To make sure the append happens correctly with an unmodified MP4 parser, the MSE implementation will have to parse each segment, check the sequence number and, if needed, reinitialize the parser before feeding the out-of-order segment.
>
> [acolwell] MSE ignores the sequence number. Again, it is important not to think in terms of files, but in terms of a bytestream. MSE accepts a bytestream that looks very close to a fragmented ISOBMFF file, but it allows things that aren't compliant with the ISOBMFF spec. If one decides to use a conformant MP4 reader for an MSE implementation, they will have to relax parts of the validation checks to avoid rejecting bytestream constructs that are allowed by MSE. MSE accepts the ISOBMFF fragmented form because it is a relatively simple way to encapsulate segments of a larger presentation. The intent was never to support the fragmented file format, but rather something close enough to that form to make it easy to use segments of MP4 content to construct a presentation.
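To make the append model described above concrete, here is a minimal sketch of the bytestream-feeding side: an initialization segment (ftyp+moov) is appended first, then fragmented media segments (moof+mdat) in whatever order the application chooses. Whether those fragments come from a server-side packager or from an on-the-fly client-side remuxer (as in point 2 above) makes no difference to this part of the pipeline. The URLs and codec string are placeholders, not anything from this thread.

```ts
// Minimal MSE append sketch (assumed asset layout: one init segment plus
// independent moof+mdat media segments). All file names are illustrative.
const video = document.querySelector('video') as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');

  // Append one buffer and wait until the SourceBuffer has processed it.
  const append = (buf: ArrayBuffer) =>
    new Promise<void>((resolve) => {
      sb.addEventListener('updateend', () => resolve(), { once: true });
      sb.appendBuffer(buf);
    });

  // Init segment first, then media segments. MSE places media segments on
  // the timeline by their internal timestamps, not by mfhd sequence number.
  await append(await (await fetch('init.mp4')).arrayBuffer());
  for (const url of ['seg1.m4s', 'seg2.m4s', 'seg3.m4s']) {
    await append(await (await fetch(url)).arrayBuffer());
  }
  mediaSource.endOfStream();
});
```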
>> Supporting non-fragmented files would require the UA to hold the whole file in memory which could be very problematic on memory constrained devices.
>>
>> Is there any requirement in MSE to keep the data in memory and not on disk?
>
> [acolwell] That really is beside the point. Sure, disk could be used, but in the mobile or TV case that isn't likely to be an option. Having disk just delays the problem a little longer. Eventually the UA may need to evict part of the timeline covered by the file, and you'd have to reappend the whole file again to get that region back.
>
>> If the UA decides to garbage collect part of the presentation timeline to free up space for new appends, it is not clear how the web application could reappend the garbage-collected regions without appending the whole file again.
>>
>> You could say the same thing about fragmented files if the fragment is very long. Sure, you will more likely find large non-fragmented files than fragmented files with large fragments, but the problem is the same. Small non-fragmented files (such as small clips, ads) should not be excluded.
>
> [acolwell] I agree, but it is much more likely for people to create large & long non-fragmented MP4 files than it is for people to create fragmented files with large fragments. Forcing the conversion to fragmented form forces this issue to be considered. Another issue is that "small clips" is a very subjective thing, and once we say non-fragmented MP4 is supported, people will expect it to work no matter how long the files are.
>
>> The fragmented form allows the application to easily select the desired segment and reappend it.
>>
>> If possible! If your fragments are large, the application won't be able to do it (at least not easily).
>
> [acolwell] True, but that's further incentive to keep the segments a reasonable size. If the UA keeps evicting parts of the segments, that is an indication that the fragment size you are using is too big. At least with fragmented files you have this parameter to tune. With non-fragmented files you are simply out of luck and have to convert to fragmented form to resolve it anyway.
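The re-append workflow being debated here looks roughly like the following sketch: the page inspects the SourceBuffer's buffered ranges, and if the UA has evicted the part of the timeline it needs, it re-fetches only the fragment covering that gap. The segmentForTime() mapping is a hypothetical helper; a real player would derive it from a manifest or a segment index.

```ts
// Check whether a playback position is still buffered in the SourceBuffer.
function isBuffered(sb: SourceBuffer, t: number): boolean {
  for (let i = 0; i < sb.buffered.length; i++) {
    if (t >= sb.buffered.start(i) && t < sb.buffered.end(i)) return true;
  }
  return false;
}

// If the UA evicted the region around t, re-append only the fragment that
// covers it (possible because the content is in fragmented form).
async function ensureBuffered(
  sb: SourceBuffer,
  t: number,
  segmentForTime: (t: number) => string, // hypothetical time-to-URL mapping
) {
  if (isBuffered(sb, t)) return; // nothing was evicted here
  const buf = await (await fetch(segmentForTime(t))).arrayBuffer();
  await new Promise<void>((resolve) => {
    sb.addEventListener('updateend', () => resolve(), { once: true });
    sb.appendBuffer(buf); // re-append just the missing fragment
  });
}
```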
>> Applications can control the level of duplicate appending by adjusting the fragment size appropriately.
>>
>> I think you mean 'Content providers can control ...'? Web applications should be able to use content from different sources, with no control over the content generation, so possibly using non-fragmented files.
>
> [acolwell] If they intend to use the content with MSE then they need to be able to exert some sort of control. It may be that they have to convince their partners to provide their assets in fragmented form. MSE already puts constraints on what types of content can be spliced together, so the application needs to have at least some idea about what it is passing to MSE to ensure it doesn't violate any of the constraints, such as no codec changes, consistent track counts, etc.
>
>> Non-fragmented files are so permissive about how they can store samples that there is no simple way to collect segments of the timeline without essentially exposing a random access file API.
>>
>> Which exact difference in the non-fragmented (vs. the fragmented) storage is problematic? For which situation? I don't understand what you mean by 'collect segments of the timeline'. Which entity would need to do that? The web application? The MSE implementation? The MP4 parser? It is certainly easy for the MP4 parser.
>
> [acolwell] Non-fragmented files have a lot of options when it comes to how the file is formatted. Is the moov at the beginning or the end? Are the samples stored in the mdat in order, or are they randomly distributed? The list goes on. This adds a lot of complexity and, in the worst case, requires the whole file to be available to resolve. In fragmented files this complexity is relatively bounded, and the content author actually has to be proactive about making sure the content is in the right format. Sure, people can create crazy fragmented files as well, but that is not nearly as common.
>
>> In general, I think MSE can be viewed as an API to construct a playlist and have seamless playback of the elements of the playlist in an HTML5 video element. There are complex playlist configurations with overlapping elements. I think the simple use case of seamlessly playing 2 MP4 files sequentially should be supported.
>
> [acolwell] MSE is not an API to construct playlists, and I don't think it is good to think about it this way. If that had been my goal then I would have designed things very differently. MSE is an API to construct a presentation from a set of media segments. Media segments do not necessarily imply fully formed files in existing formats. Certain forms of existing file formats are interpreted as media segments by MSE, but fully formed files are not required to add media segments to the presentation. For example, in the WebM bytestream all you need to create is a Cluster element to add media to the presentation. You don't need to create a fully formed WebM file. For ISO, you only need a moof box followed by an mdat box to add media. I explicitly wanted to break the constraint of requiring fully formed files because I believe it would allow people to mash up content in ways that would be difficult within the constraints of existing file formats.
>
> [acolwell] Seamlessly playing 2 MP4 files sequentially is supported if you first convert them to fragmented form. In my opinion it is better to have the content author place the content in a specific form than to require all UAs to deal with non-fragmented MP4 files.
>
> Aaron
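The "two files back to back" case in that last reply can be sketched as follows, assuming both assets have already been converted to fragmented form and have compatible codecs and track layout: append the first asset's init segment and media segments, move timestampOffset past the end of the first asset, then append the second asset the same way. File names and the 20-second offset are illustrative only.

```ts
// Append one fragmented asset: its init segment (ftyp+moov), then its
// moof+mdat media segments. Asset layout and URLs are assumptions.
async function appendAsset(sb: SourceBuffer, initUrl: string, segUrls: string[]) {
  const append = async (url: string) => {
    const buf = await (await fetch(url)).arrayBuffer();
    await new Promise<void>((resolve) => {
      sb.addEventListener('updateend', () => resolve(), { once: true });
      sb.appendBuffer(buf);
    });
  };
  await append(initUrl);                         // a new init segment is allowed
  for (const url of segUrls) await append(url);  // then the moof+mdat pairs
}

// Splice two fragmented assets into one seamless presentation.
async function playSequentially(sb: SourceBuffer) {
  await appendAsset(sb, 'a/init.mp4', ['a/seg1.m4s', 'a/seg2.m4s']);
  // Shift the second asset so it starts where the first one ends.
  sb.timestampOffset = 20; // duration of asset A in seconds (illustrative)
  await appendAsset(sb, 'b/init.mp4', ['b/seg1.m4s', 'b/seg2.m4s']);
}
```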
Received on Wednesday, 20 February 2013 13:31:36 UTC