Re: video use-case from Silvia Pfeiffer on 2008-10-06 (public-media-fragment@w3.org from October 2008)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 7 Oct 2008 10:54:33 +1100
To: "Dave Singer" <singer@apple.com>
Cc: "Media Fragment" <public-media-fragment@w3.org>
Message-ID: <2c0e02830810061654h6149fba9wc20cfb354022c358@mail.gmail.com>
On Tue, Oct 7, 2008 at 10:26 AM, Dave Singer <singer@apple.com> wrote:
> At 9:16  +1100 7/10/08, Silvia Pfeiffer wrote:
>>>  For example, in a media file that has an index and is in
>>>  time-order, a user-agent wanting a time-based subset may be able
>>>  to use byte-range requests to get the index, and then the part(s)
>>>  of the file containing the desired media. (We do this today for
>>>  MP4 and MOV files).
>>
>> Yes, byte-ranges are possible. However, the Web server is the only
>> component in the system that knows about converting a time offset to a
>> byte range. Therefore, you first have to tell the server, through a
>> URI reference, which subsegment you would like to have. The server
>> can then convert that to byte ranges and return to the UA which byte
>> range it has to request, and then we can do a normal byte-range
>> request on the full URI.
>>
>> When you say that you do this today for MP4 and MOV files, how do you
>> communicate the fragment to the Web server?
>
> MP4 and MOV files have tables in the moov atom which give complete time and
> byte-offset indexing for every video and audio frame.  Atoms are also sized.
>  You can gamble, ask for 1K at the start of the file; if it's laid out for
> incremental playback, the moov atom will be the first or second atom;
>  you've got its size now, and can download the rest.  If it wasn't, you have
> the size of those atoms and can skip past them and ask for the next.  Once
> you have the moov atom, you know exactly what bytes you need to go anywhere
> in time (and yes, even sync points are marked, so you know how far to back
> up if someone does a random access).  If video and audio are interleaved in
> time order, the data you need will be all contiguous.

This still does not solve the client-server problem. Say a UA wants
to play back a MOV file from second 45 to second 88. The UA does not
know how to map that to a byte offset and therefore to a byte-range
request. The UA has to ask the server for this information. The server
can derive the byte mapping for that time range from the local MOV
file by analysing the tables in the moov atom as you described. Then
it can tell the UA this information, and the UA can in turn do a
byte-range request.
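
As a rough sketch of that server-side mapping, assuming a simplified
index of (time, byte offset, sync point) entries: the table below is
invented for illustration and not parsed from a real moov atom, and the
names INDEX, FILE_SIZE and time_range_to_byte_range are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index: one entry every 5 seconds for a 100-second file,
# as (start_time_seconds, byte_offset, is_sync_point). A real server
# would derive something like this from the tables in the moov atom.
INDEX = [(5.0 * k, 4096 + 250_000 * k, k % 4 == 0) for k in range(20)]
FILE_SIZE = 4096 + 250_000 * 20


def time_range_to_byte_range(index, file_size, start, end):
    """Return the inclusive (first_byte, last_byte) covering [start, end)."""
    times = [t for t, _, _ in index]
    # Last entry that begins at or before the requested start time.
    i = max(bisect_right(times, start) - 1, 0)
    # Back up to the previous sync point so decoding can begin cleanly.
    while i > 0 and not index[i][2]:
        i -= 1
    first_byte = index[i][1]
    # First entry that begins at or after the end time; we need
    # everything before it (or up to the end of the file).
    j = bisect_left(times, end)
    next_offset = index[j][1] if j < len(index) else file_size
    return first_byte, next_offset - 1


first, last = time_range_to_byte_range(INDEX, FILE_SIZE, 45, 88)
print(f"Range: bytes={first}-{last}")
```

The UA would then send the resulting Range header in an ordinary
byte-range request on the full URI. Backing up to a sync point means
the returned range can start earlier than the requested time.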

What we are talking about with a temporal media fragment request
through a URI is the very first step: the UA needs to request from the
Web server the media fragment.


> This is, of course, very difficult for VBR un-indexed files.  Pretty easy,
> of course, for CBR files.

As long as it is all handled by the Web server, the process is no
different. The mapping to byte ranges may be arbitrarily complicated
- the UA and the network don't care, they just handle whatever they
are told by the Web server.
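
For the constant-bit-rate case the mapping really is just arithmetic.
A minimal sketch, assuming a fixed bit rate and an optional fixed-size
file header (the function name, its parameters and the 128 kbit/s
figure are made up for illustration):

```python
import math


def cbr_time_to_byte_range(start_s, end_s, bitrate_bps, header_bytes=0):
    """Map [start_s, end_s) seconds to an inclusive byte range (CBR only)."""
    bytes_per_second = bitrate_bps / 8
    first_byte = header_bytes + math.floor(start_s * bytes_per_second)
    last_byte = header_bytes + math.ceil(end_s * bytes_per_second) - 1
    return first_byte, last_byte


# Seconds 45 to 88 of a hypothetical 128 kbit/s stream.
first, last = cbr_time_to_byte_range(45, 88, 128_000)
print(f"Range: bytes={first}-{last}")
```

For VBR files without an index there is no such formula, which is why
the server has to do the work in whatever way suits the format.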



>>>  I think it would be fairly easy, however, to do server-based
>>>  time-trimming of, for example, RTSP/RTP-based streams.
>>
>> It also works for HTTP.
>
> Ah, this is easy(er) for incremental file formats without any 'directory'
> like the moov atom, much harder for MP4 and MOV files. But really it's only
> truly easy for fixed bit-rate files.

As described above, it should work for any file format, I think. There
may be an issue with needing to return multiple byte ranges, and with
a file header that cannot be returned as-is and needs to be corrected.
Other than that it should work.
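
The multiple byte range part at least is easy to express in HTTP. A
small sketch, assuming the server has told the UA which inclusive byte
ranges it needs (the helper name and the example offsets are
hypothetical); it does not address the problem of correcting a header:

```python
def format_range_header(ranges):
    """Merge overlapping or adjacent inclusive byte ranges and format
    them as the value of an HTTP Range header."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return "bytes=" + ",".join(f"{a}-{b}" for a, b in merged)


# A file header plus one media segment further into the file.
print("Range:", format_range_header([(2004096, 4504095), (0, 4095)]))
```

A server answering such a request with several ranges would normally
reply with a multipart/byteranges body, which the UA then has to
reassemble.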


>>>  I rather think the same would be true for mpeg-2 streams.  (This
>>>  would be query-based).  It would also be easy for a client to
>>>  interpret fragments and do the corresponding seek request(s) over
>>>  RTSP.  I am unclear on the ownership of fragment identifiers in
>>>  RTSP, however.
>>
>> The meaning of fragment identifiers is defined by the media type and
>> not by the protocol.
>
> For any protocol?  What is the 'media type' in RTSP?  The only thing that
> comes back with any kind of recognizable types are (a) the description
> format (e.g. SDP) and (b) the individual streams (video/mpeg4, for example,
> is the MIME name of the RTP mpeg4 payload format).  Which one 'owns' the
> fragment on the bundle that the URL represents?

The media type is the MIME type of the resource that is returned. In
RTSP we don't actually have real "video" resources. We only have codec
streams. So, the meaning of "fragment" in RTSP is not the same as that
for FTP or HTTP, for example. However, I think RTSP already has
mechanisms for doing fragment addressing, since it was a main part of
the standardisation of RTSP. I don't think we need to change that. We
should learn from it and, I think, consider means of harmonising in
the end.


Cheers,
Silvia.
Received on Monday, 6 October 2008 23:55:11 UTC