[whatwg] Restarting the media element resource fetch algorithm after "load" event

On Thu, 08 Oct 2009 12:10:01 +0200, Robert O'Callahan  
<robert at ocallahan.org> wrote:

> http://www.whatwg.org/specs/web-apps/current-work/#loading-the-media-resource
>
> In the resource fetch algorithm, after we reach the NETWORK_LOADED state
> in step 3 which indicates that all the data we need to play the resource
> is now available locally, we end the resource fetch algorithm. However,
> in Gecko we have a media cache which might discard blocks of media data
> after we've reached the NETWORK_LOADED state (to make room for data for
> other loading resources). This means we might have to start fetching the
> resource again later. The spec does not seem to allow for this. Do we
> need to change our behavior, or does the spec need to change to
> accommodate our behavior? I'd prefer not to change our behavior since I
> think to follow the spec we'd need to pin the entire resource permanently
> in the cache after we reached NETWORK_LOADED, which could be highly
> suboptimal in some situations.

The spec notes that "Some resources, e.g. streaming Web radio, can never  
reach the NETWORK_LOADED state." In my understanding, you mustn't go to  
NETWORK_LOADED if you can't guarantee that the resource will remain in  
cache. Browsers with clever caching or small caches simply won't send a  
load event most of the time.
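
A minimal sketch of that reading, in TypeScript. The names (CacheState,
mayEnterNetworkLoaded) and the notion of a "pinnable" byte budget are
assumptions made purely for illustration; neither the spec nor any browser
defines them:

```typescript
// Hypothetical model of a media cache; none of these names come from the spec.
interface CacheState {
  resourceBytes: number | null; // total size, or null for an unbounded stream
  bytesFetched: number;         // bytes of the resource currently held locally
  pinnableBytes: number;        // bytes the cache can promise never to evict
}

// Enter NETWORK_LOADED (and so fire "load") only if every byte is local
// and the cache can guarantee that it stays local.
function mayEnterNetworkLoaded(c: CacheState): boolean {
  if (c.resourceBytes === null) return false; // e.g. streaming Web radio
  const complete = c.bytesFetched >= c.resourceBytes;
  const canPin = c.pinnableBytes >= c.resourceBytes;
  return complete && canPin;
}
```

Under this model a browser with a small or aggressively shared cache would
return false for most large resources, which is the "won't send a load
event most of the time" case.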

> Another issue is that it's not completely clear to me what is meant by
> "While the user agent might still need network access to obtain parts of
> the media resource
> <http://www.whatwg.org/specs/web-apps/current-work/#media-resource>..."
> What if there is data in the resource that we don't need in order to
> play through normally, but which might be needed in some special
> situations (e.g., enabling subtitles, or seeking using an index), and we
> optimize to not load that data unless/until we need it? In that case
> would we never reach NETWORK_LOADED?

As I understand it, NETWORK_LOADED means that all bytes of the resource  
have been loaded, regardless of whether they will be used or not. Are  
there any formats that would actually allow not downloading parts of the  
resource in a meaningful way? Subtitles and indexes are too small to be  
worth skipping, and multiplexed audio/video tracks can hardly be skipped  
without zillions of HTTP Range requests. It seems to me that kind of thing would  
have to be done either with a server side media fragment request (using  
the 'track' dimension) or with an external audio/video track somehow  
synced to the master track (much like external subtitles).
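
To illustrate the Range request point, here is a rough TypeScript sketch.
The Chunk type and rangeHeaderForTrack are hypothetical names, and the
chunk layout is assumed to be known and sorted by offset, which a real
client would first have to learn from the container's index:

```typescript
// Hypothetical description of one interleaved chunk in a muxed file.
interface Chunk {
  track: string;  // e.g. "video" or "audio"
  offset: number; // byte offset of the chunk within the resource
  length: number; // chunk size in bytes
}

// Build the Range header value needed to fetch only one track, merging
// chunks that happen to be byte-adjacent. Returns null if nothing matches.
function rangeHeaderForTrack(chunks: Chunk[], track: string): string | null {
  const ranges: Array<[number, number]> = [];
  for (const c of chunks) {
    if (c.track !== track || c.length <= 0) continue;
    const start = c.offset;
    const end = c.offset + c.length - 1; // Range end positions are inclusive
    const last = ranges[ranges.length - 1];
    if (last && last[1] + 1 === start) {
      last[1] = end; // extend the previous range
    } else {
      ranges.push([start, end]);
    }
  }
  if (ranges.length === 0) return null;
  return "bytes=" + ranges.map(([s, e]) => `${s}-${e}`).join(",");
}
```

With tightly interleaved audio and video almost nothing merges, so the
number of ranges grows with the number of chunks.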

> In general NETWORK_LOADED and the "load" event seem rather useless and
> dangerous IMHO. If you're playing a resource that doesn't fit in your
> cache then you'll certainly never reach NETWORK_LOADED, and since authors
> can't know the cache size they can never rely on "load" firing. And if
> you allow the cache discarding behavior I described above, authors can't
> rely on data actually being present locally even after "load" has fired.
> I suspect many authors will make invalid assumptions about "load" being
> sure to fire and about what "load" means if it does fire. Does anyone
> have any use cases that "load" actually solves?

I agree, sites that depend on the load event will likely break randomly  
for file sizes that only barely fit into the cache of the browser they  
were tested with. If browsers are conservative with bandwidth and only  
send the load event when the whole resource really is held locally, I  
think we will have less of a problem, however. Note that the load event  
isn't strictly needed; waiting for a progress event with loaded==total  
would achieve the same thing. Aesthetically, however, I think it would be  
strange to not have the load event.
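
The loaded==total check could look roughly like this in TypeScript. The
ProgressLike interface and isFullyLoaded are names made up for this
sketch, and it assumes the progress event carries lengthComputable,
loaded and total:

```typescript
// Minimal shape of the progress event fields this check relies on.
interface ProgressLike {
  lengthComputable: boolean; // false when the total size is unknown
  loaded: number;            // bytes received so far
  total: number;             // expected total bytes
}

// True only when the total is known and every byte has arrived.
function isFullyLoaded(e: ProgressLike): boolean {
  return e.lengthComputable && e.total > 0 && e.loaded === e.total;
}
```

A page would call this from its progress listener in place of waiting for
"load". It says nothing about whether the data stays in the cache
afterwards.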

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Thursday, 8 October 2009 05:32:43 UTC