Re: video use-case from Dave Singer on 2008-10-06 (public-media-fragment@w3.org from October 2008)

From: Dave Singer <singer@apple.com>
Date: Mon, 6 Oct 2008 16:26:51 -0700
To: Media Fragment <public-media-fragment@w3.org>
Message-Id: <p06240810c5104d85fe85@[10.0.1.8]>
At 9:16  +1100 7/10/08, Silvia Pfeiffer wrote:
>Hi Dave, all,
>>  The fact that they are interpreted by the UA does NOT mean
>>  that the entire resource is automatically downloaded and then the fragment
>>  interpreted, however.
>
>This is a point of contention. By definition of URI fragments, the
>fragment identifier is not being communicated to the Web server.
>Instead, it is only interpreted locally by the UA. I have had this
>discussion extensively with the URI working group as we were trying to
>formalise temporal URIs
>(http://lists.w3.org/Archives/Public/uri/2003Jul/0016.html).
>
>Therefore, using only URIs and HTTP, there is no other means of
>communicating the fragment specification to the server than using the
>"?" query component of URIs.
>
>
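
To make the fragment/query distinction above concrete, here is a minimal sketch of what a UA would put on the HTTP request line for each kind of URL. The host example.com and the "t=10,20" time syntax are made up for illustration; no such syntax has been agreed on this list.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path-and-query a UA would send to the server.

    The fragment (everything after '#') is dropped: it stays in the UA.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


# Hypothetical URLs; "t=10,20" is an assumed time-range syntax.
print(request_target("http://example.com/video.ogv#t=10,20"))
print(request_target("http://example.com/video.ogv?t=10,20"))
```

Only the second form lets the server see the requested time range.
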
>>  For example, in a media file that has an index and is
>>  in time-order, a user-agent wanting a time-based subset may be able to use
>>  byte-range requests to get the index, and then the part(s) of the file
>>  containing the desired media. (We do this today for MP4 and MOV files).
>
>Yes, byte-ranges are possible. However, the Web server is the only
>component in the system that knows about converting a time offset to a
>byte range. Therefore, you have to first communicate with a URI
>reference to the server which subsegment you would like to have, the
>server can then convert that to byte ranges, return to the UA which
>byte range it has to request, and then we can do a normal byte-range
>request on the full URI.
>
>When you say that you do this today for MP4 and MOV files, how do you
>communicate the fragment to the Web server?

MP4 and MOV files have tables in the moov atom which give complete 
time and byte-offset indexing for every video and audio frame.  Atoms 
are also sized.  You can gamble, ask for 1K at the start of the file; 
if it's laid out for incremental playback, the moov atom will be the 
first or second atom;  you've got its size now, and can download the 
rest.  If it wasn't, you have the size of those atoms and can skip 
past them and ask for the next.  Once you have the moov atom, you 
know exactly what bytes you need to go anywhere in time (and yes, 
even sync points are marked, so you know how far to back up if 
someone does a random access).  If video and audio are interleaved in 
time order, the data you need will be all contiguous.
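
The atom walk described above could be sketched roughly as follows. This is an illustration, not what any shipping player does: read_range is a hypothetical stand-in for an HTTP byte-range request, and the code fetches 16 bytes per atom header rather than gambling on a 1K first read. It relies only on the basic atom layout (32-bit big-endian size, four-character type, size 1 meaning a 64-bit size follows, size 0 meaning "to end of file").

```python
import struct
from typing import Callable, Optional, Tuple

# read_range(start, length) -> bytes is a hypothetical stand-in for an
# HTTP byte-range request covering bytes start .. start+length-1.
ReadRange = Callable[[int, int], bytes]


def find_moov(read_range: ReadRange, file_size: int) -> Optional[Tuple[int, int]]:
    """Return (offset, size) of the top-level moov atom, or None if absent."""
    offset = 0
    while offset + 8 <= file_size:
        header = read_range(offset, 16)  # enough for a 64-bit size field
        if len(header) < 8:
            return None
        size, kind = struct.unpack(">I4s", header[:8])
        if size == 1:
            # Extended form: the real size is a 64-bit field after the type.
            if len(header) < 16:
                return None
            (size,) = struct.unpack(">Q", header[8:16])
        elif size == 0:
            # Atom runs to the end of the file.
            size = file_size - offset
        if size < 8:
            return None  # malformed atom; stop rather than loop forever
        if kind == b"moov":
            return offset, size
        offset += size  # skip past this atom and look at the next header
    return None
```

Once the moov atom's offset and size are known, one more range request fetches it whole, and its sample tables give the byte ranges for any time.
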

This is, of course, very difficult for VBR un-indexed files.  Pretty 
easy, of course, for CBR files.
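
For the CBR case the arithmetic is simple enough to write down. A sketch, assuming a truly constant bit rate and a known fixed header size (header_bytes is a hypothetical parameter); it ignores frame boundaries, so a real client would still have to align to the nearest frame start.

```python
from typing import Tuple


def cbr_byte_range(start_s: float, end_s: float, bitrate_bps: int,
                   header_bytes: int = 0) -> Tuple[int, int]:
    """Map a time interval to an inclusive byte range for a CBR stream."""
    first = header_bytes + int(start_s * bitrate_bps / 8)
    last = header_bytes + int(end_s * bitrate_bps / 8) - 1
    return first, last


# 128 kbit/s is 16000 bytes per second, so seconds 10..20 map to:
first, last = cbr_byte_range(10, 20, 128_000)
print(f"Range: bytes={first}-{last}")  # Range: bytes=160000-319999
```
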

>
>
>>  Query identifiers follow a "?" and are interpreted by the server. The syntax
>>  and semantics are defined by the server you are using.  To the UA, the
>>  result is "the resource".  I'm not aware of servers that can do time-based
>>  trimming of media files, but it's certainly possible that they exist.
>
>Annodex does this. Metavid is making extensive use of this:
>http://metavid.org/wiki/Main_Page

doh!  of course, my mistake.

>
>>  I think it would be fairly easy, however, to do server-based time-trimming
>>  of, for example, RTSP/RTP-based streams.
>
>It also works for HTTP.

Ah, this is easy(er) for incremental file formats without any 
'directory' like the moov atom, much harder for MP4 and MOV files. 
But really it's only truly easy for fixed bit-rate files.

>
>>   I rather think the same would be
>>  true for mpeg-2 streams.  (This would be query-based).  It would also be
>>  easy for a client to interpret fragments and do the corresponding seek
>>  request(s) over RTSP.  I am unclear on the ownership of fragment identifiers
>>  in RTSP, however.
>
>The meaning of fragment identifiers is defined by the media type and
>not by the protocol.

For any protocol?  What is the 'media type' in RTSP?  The only thing 
that comes back with any kind of recognizable types are (a) the 
description format (e.g. SDP) and (b) the individual streams 
(video/mpeg4, for example, is the MIME name of the RTP mpeg4 payload 
format).  Which one 'owns' the fragment on the bundle that the URL 
represents?

>
>It's a shame that we have to deal with every
>protocol-media-type-combination individually to solve the URI
>addressing of media fragments, but that is the world we find ourselves
>in.

sigh, yes.
-- 
David Singer
Apple/QuickTime
Received on Monday, 6 October 2008 23:28:25 UTC