- From: Davy Van Deursen <davy.vandeursen@ugent.be>
- Date: Sat, 1 Nov 2008 22:52:01 +0100
- To: "'Media Fragment'" <public-media-fragment@w3.org>
Hi Silvia, all,

>-----Original Message-----
>From: public-media-fragment-request@w3.org [mailto:public-media-
>fragment-request@w3.org] On Behalf Of Silvia Pfeiffer
>Sent: Saturday, November 01, 2008 12:00 PM
>To: Media Fragment
>Subject: Re: Squid experts
>
>Hi Davy,
>
>That's a very clear statement on the possibilities of a abstract model
>of the structure-to-binary relationships of compressed media
>resources. I think you may be right and it's not easily possible, if
>not impossible, to do with all media types - even though
>multi-byte-ranges may help for some of them. Whether that's a killer
>for this approach, or whether we could still suggest this approach as
>an optimisation in certain cases, I don't know.

If we want to support caching of media fragments without modifying the existing Web caches and proxies, then this will only work under the following circumstances (assuming multi-byte-ranges are possible):

1) the media fragments can be extracted in the compressed domain;
2) no syntax element modifications in the bitstream are needed to perform the extraction.

Note that one workaround for point 2 is that the server sends the headers with the modified syntax elements to the client (as you did with the Ogg format). However, I don't think this workaround will work in general for every format. For example, extracting a spatial fragment from a Motion JPEG2000 stream implies that syntax element modifications are necessary in each JPEG2000 frame. The workaround only works when the syntax element modifications are confined to headers applying to the whole bitstream.

>Which formats did you find so far it was possible to gain a structure
>about? I can certainly say for Ogg that time fragments, tracks and
>named fragments when using CMML are all possible. For spatial
>fragments, I am not so sure - I'd think rather not...

Tracks
******

Whether tracks are supported or not depends on the container format.
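To make the multi-byte-range idea above concrete: a fragment whose data is scattered over a multiplexed container could be fetched with a single HTTP Range header listing several byte ranges. A minimal sketch (the helper function and the byte offsets are invented for illustration):

```python
# Hypothetical sketch: describe a fragment as a list of byte ranges and
# turn it into an HTTP Range header, so that an unmodified Web cache or
# proxy could serve it as a plain (multi-)byte-range response.
def build_range_header(ranges):
    """Build an HTTP Range header value from (start, end) byte pairs (inclusive)."""
    return "bytes=" + ",".join(f"{start}-{end}" for start, end in ranges)

# e.g. a fragment whose data lives in three separate byte ranges
header = build_range_header([(0, 1023), (5000, 8191), (12000, 15999)])
print(header)  # bytes=0-1023,5000-8191,12000-15999
```

A server honouring such a request would answer with a multipart/byteranges response, which is exactly the case where existing caches could help.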
Since a container format only defines a syntax and does not introduce any compression, it is always possible to describe the structures of a container format. Hence, if a container format allows the encapsulation of multiple tracks, then it is possible to describe the tracks in terms of bytes. Examples of such container formats are Ogg, MP4, ... Note that the tracks may be multiplexed, implying that the description of one track consists of a list of byte ranges. Also note that the extraction of tracks (and fragments in general) from container formats often necessitates syntax element modifications in the headers.

Time fragments
**************

Whether time fragments are supported depends in the first place on the coding format and, more specifically, on how the encoding parameters were set. For video coding formats, time fragments can be extracted if the video stream provides random access points (i.e., points that do not depend on previously encoded video data, typically corresponding to intra-coded frames) on a regular basis. I think the same holds for audio coding formats (I only have experience with AAC and MP3): the audio stream needs to be accessed at a point where the decoder can start decoding without needing previously coded data.

Spatial fragments
*****************

This one is probably the hardest to deal with and depends, just like time fragments, on the coding format. For image coding formats, JPEG2000 and HD Photo (to some extent) provide support for independently encoding spatial regions. With other image coding formats such as JPEG, GIF, and PNG, it is not possible to describe spatial regions in terms of bytes. For video coding formats, we can consider the motion variants of the image coding formats JPEG2000 and HD Photo. Further, H.264/AVC and its scalable extension SVC are able to encode spatial regions independently by making use of the coding tool Flexible Macroblock Ordering (FMO).
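Returning to time fragments for a moment: resolving a requested start time to the last preceding random access point can be sketched as follows. The index of (timestamp, byte offset) pairs is invented for illustration; a real server would derive it from the bitstream.

```python
import bisect

# Hypothetical index of random access points (e.g. intra-coded frames):
# (timestamp in seconds, byte offset in the stream)
random_access_points = [(0.0, 0), (2.0, 40000), (4.0, 81500), (6.0, 120300)]

def resolve_start(t):
    """Return the byte offset of the last random access point at or before time t."""
    times = [ts for ts, _ in random_access_points]
    i = bisect.bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("requested time precedes the first random access point")
    return random_access_points[i][1]

print(resolve_start(5.1))  # 81500 -- decoding must start at the 4.0 s access point
```

This also shows why the response may cover slightly more than the requested interval: decoding can only begin at an access point, not at an arbitrary time.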
MPEG-4 Visual allows objects to be coded independently of each other (i.e., object-based video coding). However, for video formats, I think there are very few media resources in the wild that were encoded with provisions for spatial fragment extraction in the compressed domain. For example, only a few H.264/AVC decoders support FMO, and I have never seen an MPEG-4 Visual bitstream that was encoded in objects.

Named fragments
***************

To the best of my knowledge, no coding format provides support for named fragments. I think we should look at container formats for this feature. As you said, Ogg combined with CMML does the trick. In fact, if a container format allows the insertion of metadata describing the named fragments, then the container format supports named fragments. For example, you can include a CMML description in an MP4 container and interpret this CMML description to extract fragments based on a name.

Finally, it is important to remark that coding formats supporting fragment extraction in the compressed domain are not enough: the right encoding parameters also need to be enabled to support this fragment extraction. For example, it is impossible to extract a spatial fragment from a JPEG2000 image in the compressed domain if that spatial fragment is not coded independently from the rest of the image.

Best regards,

Davy

--
Davy Van Deursen

Ghent University - IBBT
Department of Electronics and Information Systems
Multimedia Lab
URL: http://multimedialab.elis.ugent.be
Received on Saturday, 1 November 2008 21:52:43 UTC