- From: Aaron Colwell <acolwell@google.com>
- Date: Wed, 20 Feb 2013 09:41:33 -0800
- To: guy paskar <guypaskar@gmail.com>
- Cc: Cyril Concolato <cyril.concolato@telecom-paristech.fr>, "<public-html-media@w3.org>" <public-html-media@w3.org>
- Message-ID: <CAA0c1bA3FF=7ZZ7_m-QOskC+xM+5a9eiPDMj-LHkM8wxiOV17Q@mail.gmail.com>
Comments inline...

On Wed, Feb 20, 2013 at 5:30 AM, guy paskar <guypaskar@gmail.com> wrote:

> I think this is a very interesting topic and here are a couple of comments
> from my point of view:
>
> 1) I understand that fmp4 is needed for MSE as it is today. The main
> problem I see with it is that current browsers can't natively play fmp4
> (they have to download the whole file, as opposed to regular mp4), and as
> a result anyone who uses MSE for streaming videos still has to keep a
> regular mp4 to support browsers that do not support MSE (and can't
> pseudo-stream fmp4). Is there a way to overcome this? Will this be
> addressed? I think this is a serious issue.

[acolwell] I don't view this as any different than having to support
multiple files because browsers support different codecs & resolutions. If
browsers want to ease this particular pain for content providers then they
can implement support for fragmented mp4 in the standard HTML5 video path.
I have a feeling this will happen naturally as fragmented mp4 files become
more common on the Internet because of MPEG-DASH and MSE.

> 2) In relation to point 1, is it possible to make a regular mp4 fragmented
> on the fly with some kind of parser? I.e., to still keep a regular mp4 on
> the server and, when needed, convert it (in parts) to fmp4 on the client.
> Because fragmenting a regular mp4 is an "easy" task, I thought it might be
> possible. I know that the YouTube guys intended to do something similar in
> their demo. Any comments on that?

I don't see any reason you couldn't do this in JavaScript on the client. I
think it depends on the application whether this path is the preferred
option or not.

Aaron

> Guy
>
> On Fri, Feb 15, 2013 at 7:59 PM, Aaron Colwell <acolwell@google.com> wrote:
>
>> Comments inline...
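[Editor's note: the client-side re-fragmenting idea discussed above starts with walking the regular mp4's box structure. A minimal sketch of an ISOBMFF box walker in client-side JavaScript; the helper names `readBoxHeader` and `listTopLevelBoxes` are illustrative, not from any particular library, and 64-bit box sizes are omitted for brevity:]

```javascript
// Read one ISOBMFF box header at `offset`: a 32-bit big-endian size
// followed by a 4-character type code.
function readBoxHeader(bytes, offset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset);
  const size = view.getUint32(0); // big-endian by default
  const type = String.fromCharCode(
    bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]);
  return { size, type };
}

// List the top-level boxes (e.g. ftyp, moov, mdat) of a buffer. A client-side
// re-fragmenter would use this to locate the moov and mdat before rewriting
// samples into moof/mdat pairs.
function listTopLevelBoxes(bytes) {
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.length) {
    const { size, type } = readBoxHeader(bytes, offset);
    if (size < 8) break; // size === 1 (64-bit size) not handled in this sketch
    boxes.push({ offset, size, type });
    offset += size;
  }
  return boxes;
}
```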
>> On Fri, Feb 15, 2013 at 6:20 AM, Cyril Concolato <
>> cyril.concolato@telecom-paristech.fr> wrote:
>>
>>> Hi Aaron,
>>>
>>> On 14/02/2013 23:35, Aaron Colwell wrote:
>>>
>>> Hi Giuseppe,
>>>
>>> There are no current plans to support non-fragmented MP4 files. One
>>> thing to remember is that MSE accepts byte streams and not files per se.
>>> For MP4 we use the fragmented format because it allows segments of the
>>> timeline to be appended easily and in any order.
>>>
>>> Appending segments in any order may not be so easy. The MP4 spec says in
>>> the Movie Fragment Header Box definition:
>>> "The movie fragment header contains a sequence number, as a safety
>>> check. The sequence number usually starts at 1 and must increase for each
>>> movie fragment in the file, in the order in which they occur. This allows
>>> readers to verify integrity of the sequence; it is an error to construct a
>>> file where the fragments are out of sequence."
>>>
>>> So if you implement MSE on top of a conformant MP4 reader, feeding
>>> segment data as if it were consecutive in a 'virtual' file, this won't
>>> work. Segments with a sequence number smaller than that of the first
>>> segment provided may be rejected. To make sure the append happens
>>> correctly with an unmodified MP4 parser, the MSE implementation will have
>>> to parse each segment, check the sequence number, and if needed
>>> reinitialize the parser before feeding the out-of-order segment.
>>
>> [acolwell] MSE ignores the sequence number. Again, it is important not to
>> think in terms of files, but in terms of a bytestream. MSE accepts a
>> bytestream that looks very close to a fragmented ISOBMFF file, but it
>> allows things that aren't compliant with the ISOBMFF spec. If one decides
>> to use a conformant MP4 reader for an MSE implementation then they will
>> have to relax parts of the validation checks to avoid rejecting bytestream
>> constructs that are allowed by MSE.
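[Editor's note: the sequence number that a strict reader enforces, and that an MSE implementation would parse but ignore, lives in the mfhd box inside each moof. A sketch of reading it, assuming 32-bit box sizes and the standard mfhd layout (1-byte version, 3-byte flags, 32-bit sequence_number); the helper name is illustrative:]

```javascript
// Scan the children of a moof box for the mfhd and return its
// sequence_number, or null if no mfhd is found. A conformant reader
// would reject out-of-order values; an MSE-style parser reads past them.
function readMoofSequenceNumber(bytes, moofOffset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset);
  const moofSize = view.getUint32(moofOffset);
  const end = moofOffset + moofSize;
  let offset = moofOffset + 8; // skip the 8-byte moof header
  while (offset + 8 <= end) {
    const size = view.getUint32(offset);
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]);
    if (type === 'mfhd') {
      // mfhd payload: 1-byte version, 3-byte flags, 32-bit sequence_number.
      return view.getUint32(offset + 12);
    }
    if (size < 8) break;
    offset += size;
  }
  return null;
}
```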
>> MSE accepts the ISOBMFF fragmented form because it is a relatively simple
>> way to encapsulate segments of a larger presentation. The intent was never
>> to support the fragmented file format, but rather something close enough
>> to that form to make it easy to use segments of MP4 content to construct
>> a presentation.
>>
>>> Supporting non-fragmented files would require the UA to hold the
>>> whole file in memory, which could be very problematic on
>>> memory-constrained devices.
>>>
>>> Is there any requirement in MSE to keep the data in memory and not on
>>> disk?
>>
>> [acolwell] That really is beside the point. Sure, disk could be used, but
>> in the mobile or TV case that isn't likely to be an option. Having disk
>> just delays the problem a little longer. Eventually the UA may need to
>> evict part of the timeline covered by the file and you'd have to reappend
>> the whole file to get that region back.
>>
>>> If the UA decides to garbage collect part of the presentation timeline
>>> to free up space for new appends, it is not clear how the web application
>>> could reappend the garbage-collected regions without appending the whole
>>> file again.
>>>
>>> You could say the same about non-fragmented files if the fragment is
>>> very long. Sure, you will more likely find large non-fragmented files
>>> than fragmented files with large fragments, but the problem is the same.
>>> Small non-fragmented files (such as small clips, ads) should not be
>>> excluded.
>>
>> [acolwell] I agree, but it is much more likely for people to create large
>> & long non-fragmented MP4 files than it is for people to create fragmented
>> files with large fragments. Forcing the conversion to fragmented form
>> forces this issue to be considered. Another issue is that "small clips" is
>> a very subjective thing, and once we say non-fragmented MP4 is supported,
>> people will expect it to work no matter how long the files are.
>>> The fragmented form allows the application to easily select the
>>> desired segment and reappend it.
>>>
>>> If possible! If your fragments are large, the application won't be able
>>> to do it (at least not easily).
>>
>> [acolwell] True, but that's further incentive to keep the segments a
>> reasonable size. If the UA keeps evicting parts of the segments, that is
>> an indication that the fragment size you are using is too big. At least
>> with fragmented files you have this parameter to tune. With non-fragmented
>> files you are simply out of luck and have to convert to fragmented form to
>> resolve it anyway.
>>
>>> Applications can control the level of duplicate appending by adjusting
>>> the fragment size appropriately.
>>>
>>> I think you mean 'Content providers can control ...'? Web applications
>>> should be able to use content from different sources, with no control
>>> over the content generation, so possibly using non-fragmented files.
>>
>> [acolwell] If they intend to use the content with MSE then they need
>> to be able to exert some sort of control. It may be that they have to
>> convince their partners to provide their assets in fragmented form. MSE
>> already puts constraints on what types of content can be spliced together,
>> so the application needs to have at least some idea about what it is
>> passing to MSE to ensure it doesn't violate any of the constraints, such
>> as no codec changes, consistent track counts, etc.
>>
>>> Non-fragmented files are so permissive about how they can store
>>> samples, there is no simple way to collect segments of the timeline w/o
>>> essentially exposing a random access file API.
>>>
>>> Which exact difference in the non-fragmented (vs. the fragmented) storage
>>> is problematic? For which situation? I don't understand what you mean by
>>> 'collect segments of the timeline'. Which entity would need to do that?
>>> The web application? The MSE implementation?
>>> The MP4 parser? It is certainly easy for the MP4 parser.
>>
>> [acolwell] Non-fragmented files have a lot of options when it comes to
>> how the file is formatted. Is the moov at the beginning or end? Are the
>> samples stored in the mdat in order, or are they randomly distributed? The
>> list goes on. This adds a lot of complexity, and in the worst case
>> requires the whole file to be available to resolve. In fragmented files
>> this complexity is relatively bounded, and the content author actually has
>> to be proactive about making sure the content is in the right format.
>> Sure, people can create crazy fragmented files as well, but that is not
>> nearly as common.
>>
>>> In general, I think MSE can be viewed as an API to construct a playlist
>>> and have seamless playback of the elements of the playlist in an HTML5
>>> video element. There are complex playlist configurations with overlapping
>>> elements. I think the simple use case of seamlessly playing 2 MP4 files
>>> sequentially should be supported.
>>
>> [acolwell] MSE is not an API to construct playlists, and I don't think it
>> is good to think about it this way. If that were my goal then I would have
>> designed things very differently. MSE is an API to construct a
>> presentation from a set of media segments. Media segments do not
>> necessarily imply fully formed files in existing formats. Certain forms of
>> existing file formats are interpreted as media segments by MSE, but fully
>> formed files are not required to add media segments to the presentation.
>> For example, in the WebM bytestream all you need to create is a Cluster
>> element to add media to the presentation. You don't need to create a fully
>> formed WebM file. For ISO, you only need a moof box followed by an mdat
>> box to add media.
>> I explicitly wanted to break the constraint of requiring fully formed
>> files because I believe it would allow people to mash up content in ways
>> that would be difficult within the constraints of existing file formats.
>>
>> [acolwell] Seamlessly playing 2 MP4 files sequentially is supported if
>> you first convert them to fragmented form. In my opinion it is better to
>> have the content author place the content in a specific form than to
>> require all UAs to deal with non-fragmented MP4 files.
>>
>> Aaron
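[Editor's note: the "moof box followed by an mdat box" shape of a minimal ISO media segment described above can be checked with a small validator. This is an illustrative sketch under the same 32-bit-size assumption as before, not MSE's actual bytestream parser:]

```javascript
// Return true if `bytes` is exactly a moof box followed by an mdat box,
// the minimal ISO media segment shape discussed in the thread. A fully
// formed file (ftyp/moov/mdat) would fail this check even though it is
// valid ISOBMFF, illustrating the bytestream-vs-file distinction.
function isBareMediaSegment(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset);
  const types = [];
  let offset = 0;
  while (offset + 8 <= bytes.length) {
    const size = view.getUint32(offset);
    if (size < 8) return false; // malformed (64-bit sizes not handled here)
    types.push(String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]));
    offset += size;
  }
  return offset === bytes.length &&
         types.length === 2 && types[0] === 'moof' && types[1] === 'mdat';
}
```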
Received on Wednesday, 20 February 2013 17:42:01 UTC