Re: timing model of the media resource in HTML5

On Feb 1, 2010, at 4:19 AM, Silvia Pfeiffer wrote:

> On Fri, Jan 29, 2010 at 12:39 AM, Philip Jägenstedt <philipj@opera.com> wrote:
> 
>> On Wed, 27 Jan 2010 12:57:51 +0100, Silvia Pfeiffer
>>> If we buried the track information in a javascript API, we would
>>> introduce an additional dependency and we would remove the ability to
>>> simply parse the Web page to get at such information. For example, a
>>> crawler would not be able to find out that there is a resource with
>>> captions and would probably not bother requesting the resource for its
>>> captions (or other text tracks).
>> 
>> Surely, robots would just index the resources themselves?
> 
> Why download binary data of indeterminate length when you can already
> get it out of the text of the Web page? Surely, robots would prefer to
> get that information directly out of the Webpage and not have to go
> and download gazillions of binary media files that they have to decode
> to get information about them.
> 
> Right now, everybody who sees a video element in a HTML5 page simply
> assumes that it consists of a video and a audio track and has no other
> information in it. This is fine in the default case and in the default
> case no extra resource description is probably necessary. But when we
> actually do have a richer source, we need to expose that.
> 
  This argument leads down a very slippery slope. If it is crucial to include caption information in markup for spiders, what about other media file metadata that a crawler might want to read - intrinsic width and height, duration, encoding format, file size, bit rate, frame rate, etc, etc, etc? Robots may prefer to have all of this in the page so they don't have to load and parse the file, but I don't think it is necessary or appropriate.

eric

Received on Monday, 1 February 2010 16:59:50 UTC