Re: Byte ranges and time ranges from Yves Lafon on 2008-11-13 (public-media-fragment@w3.org from November 2008)

From: Yves Lafon <ylafon@w3.org>
Date: Wed, 12 Nov 2008 20:00:45 -0500 (EST)
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
cc: Media Fragment <public-media-fragment@w3.org>
Message-ID: <Pine.LNX.4.64.0811121943440.22222@ubzre.j3.bet>

On Tue, 11 Nov 2008, Silvia Pfeiffer wrote:

> document.getElementsByTagName("video")[0].currentTime=30
>
> Assuming that part of the file had not been downloaded, what happens
> is as follows:
>
> * the request goes from the browser to liboggplay
> * liboggplay makes an educated guess at the byte offset that relates
> to this time offset based on the file size and the average bitrate of
> the file (which were received as first information about the video)

So the first handshake is there to get the beginning of the audio 
resource, to figure out the average bitrate and the size. Is it done using 
a range request, or does it start downloading the whole resource, with 
subsequent requests then served directly from the browser cache?

> * liboggplay hands back the byte ranges to the browser
> * the browser makes a read request on these byte ranges via http
> (using these functions
> http://hg.mozilla.org/mozilla-central/file/5dfdad637696/content/media/video/src/nsMediaStream.cpp)
> * the server returns those byte ranges and the browser hands them back
> to liboggplay, which determines from the received granulepositions
> what time they relate to
> * if the requested time is not hit, liboggplay makes a better estimate
> on byte ranges and another http byte range request is sent, etc. until
> the right byte ranges are returned
>
> This is amazingly the exact same seeking algorithm that is used by
> media players that seek to time offsets in Ogg files. The only
> difference is that the seeking algorithm is now implemented over HTTP
> rather than local file I/O. If the guesses are good, less than 10
> round trips are necessary, Chris says. He also says that the delay
> introduced through these roundtrips is barely visible in the browser.
> He has tested with Wikimedia and other content and it works reliably.
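
For readers who want to see the loop described above, here is a minimal 
sketch in Python. It is not liboggplay's code. The names FILE_SIZE, 
DURATION_S, simulated_time_at and seek_by_bisection are made up for 
illustration, and simulated_time_at merely stands in for "issue one HTTP 
byte-range request and read the timestamp from the granulepos found 
there". The first guess assumes a constant average bitrate; later guesses 
use plain bisection, which is an assumption about how the estimate is 
refined.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for a remote file: 10 MB, 10 minutes, with a
# non-constant bitrate so that the first guess is deliberately wrong.
FILE_SIZE = 10_000_000
DURATION_S = 600.0


def simulated_time_at(offset: int) -> float:
    """Pretend to fetch bytes at `offset` and decode the time found there."""
    return DURATION_S * (offset / FILE_SIZE) ** 2


def seek_by_bisection(
    target_s: float,
    file_size: int,
    duration_s: float,
    read_time_at: Callable[[int], float],
    tolerance_s: float = 0.5,
    max_round_trips: int = 32,
) -> Tuple[int, int]:
    """Return (byte_offset, round_trips) for a position near target_s.

    Each call to read_time_at counts as one round trip to the server.
    """
    lo, hi = 0, file_size  # byte window known to contain the target time
    # First guess: scale the target time by the average bytes per second.
    guess = int(file_size * target_s / duration_s)
    for trips in range(1, max_round_trips + 1):
        guess = min(max(guess, lo), hi - 1)  # keep the probe inside the window
        t = read_time_at(guess)
        if abs(t - target_s) <= tolerance_s:
            return guess, trips
        if t < target_s:
            lo = guess + 1  # target lies after this probe
        else:
            hi = guess  # target lies before this probe
        if hi <= lo:
            return guess, trips  # window exhausted, accept the last probe
        guess = (lo + hi) // 2  # refine: bisect the remaining window
    return guess, max_round_trips
```

Calling seek_by_bisection(30.0, FILE_SIZE, DURATION_S, simulated_time_at) 
returns a byte offset whose simulated time is within half a second of 
30 s, together with the number of probes that were needed. A real client 
would also have to locate an Ogg page boundary inside each returned range 
before it could read a granulepos.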

No, that's not exactly the same algorithm. One thing you take for granted 
is that when you open a file, you get a file descriptor, and even if you 
overwrite the file, the fd still points to the old content (unless the 
disk space is overwritten, or the inode table or whatever might 
interfere).
In HTTP you don't have that. So basically you need to send a conditional 
request for every subsequent interaction with the server to make it work 
reliably. Is it the case there?
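
To make "conditional request" concrete, here is a small sketch in Python 
of what each follow-up byte-range request could carry. The function names 
and the status labels are invented for illustration, and nothing here 
says what the browser actually sends. The idea is to remember the ETag 
from the first response and send it back in If-Range, so that the server 
answers 206 with the requested bytes only if the resource is unchanged, 
and 200 with the whole current entity otherwise.

```python
from typing import Dict, Optional


def range_request_headers(
    first_byte: int, last_byte: int, validator: Optional[str] = None
) -> Dict[str, str]:
    """Build headers for one byte-range request.

    `validator` is the ETag (or Last-Modified date) remembered from the
    first response; passing it makes the range request conditional.
    """
    headers = {"Range": f"bytes={first_byte}-{last_byte}"}
    if validator is not None:
        headers["If-Range"] = validator
    return headers


def interpret_status(status: int) -> str:
    """Classify the response to a conditional range request."""
    if status == 206:
        # Validator matched: these bytes come from the entity we started with.
        return "same-entity"
    if status == 200:
        # Full body returned: the resource changed, or the server ignores
        # Range altogether. Earlier offsets and estimates must be dropped.
        return "restart"
    if status == 416:
        # The requested range lies outside the current entity.
        return "out-of-range"
    return "error"
```

For example, range_request_headers(1000, 1999, '"abc123"') yields a Range 
header of bytes=1000-1999 plus If-Range: "abc123". Treating a 200 as 
"restart" is the design choice that replaces the guarantee a local file 
descriptor gives for free.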

> Chris also says that if he can get a cooperative server which can do
> the time-byte-mapping that we are discussing, he'd rather use the
> seeking support on the server. However, I find it amazing that it is
> working so well even without such server support!

Yes, finding a solution _now_ doesn't mean that the solution is optimal :)

> I think we can draw some conclusions from this:
>
> * when loading up a video, a couple of roundtrips don't make much of a
> difference; thinking about this further, I actually think this is the
> case because we are used to Web pages that take a long time to load
> because they have to download many resources and cannot get them all
> in parallel; we are also used to videos sitting in the browser
> buffering because the bandwidth is not big enough; in comparison two
> roundtrips for video are really nothing

Well, there is work to deliver HTTP over SCTP, allowing for far more 
parallelism in resource fetching. Also, what seemed "good enough latency" 
ten years ago is now unacceptable. Note also that it depends on a lot of 
things, including the network latency and speed.

> * asking for byte ranges alone can work.

As well as defining sub-URIs for each second and retrieving an index of 
all seconds->links relations :)

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves
Received on Thursday, 13 November 2008 01:00:55 UTC