- From: Philip Jägenstedt <philipj@opera.com>
- Date: Tue, 02 Feb 2010 17:19:59 +0100
- To: "Silvia Pfeiffer" <silviapfeiffer1@gmail.com>
- Cc: "Eric Carlson" <eric.carlson@apple.com>, "HTML Accessibility Task Force" <public-html-a11y@w3.org>, "Ken Harrenstien" <klh@google.com>
On Tue, 02 Feb 2010 14:21:20 +0100, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> On Wed, Feb 3, 2010 at 12:17 AM, Philip Jägenstedt <philipj@opera.com> wrote:
>> On Tue, 02 Feb 2010 14:08:36 +0100, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>
>>> On Wed, Feb 3, 2010 at 12:03 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>>>
>>>> On Tue, Feb 2, 2010 at 7:19 PM, Philip Jägenstedt <philipj@opera.com> wrote:
>>>>>
>>>>> On Mon, 01 Feb 2010 23:30:19 +0100, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>>>>
>>>>>> On Tue, Feb 2, 2010 at 3:59 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>>>>>>>
>>>>>>> On Feb 1, 2010, at 4:19 AM, Silvia Pfeiffer wrote:
>>>>>>>
>>>>>>> On Fri, Jan 29, 2010 at 12:39 AM, Philip Jägenstedt <philipj@opera.com> wrote:
>>>>>>>
>>>>>>> On Wed, 27 Jan 2010 12:57:51 +0100, Silvia Pfeiffer
>>>>>>>
>>>>>>> If we buried the track information in a JavaScript API, we would
>>>>>>> introduce an additional dependency and we would remove the ability
>>>>>>> to simply parse the Web page to get at such information. For
>>>>>>> example, a crawler would not be able to find out that there is a
>>>>>>> resource with captions and would probably not bother requesting the
>>>>>>> resource for its captions (or other text tracks).
>>>>>>>
>>>>>>> Surely, robots would just index the resources themselves?
>>>>>>>
>>>>>>> Why download binary data of indeterminate length when you can
>>>>>>> already get it out of the text of the Web page? Surely, robots
>>>>>>> would prefer to get that information directly out of the Web page
>>>>>>> and not have to go and download gazillions of binary media files
>>>>>>> that they have to decode to get information about them.
>>>>>>>
>>>>>>> Right now, everybody who sees a video element in an HTML5 page
>>>>>>> simply assumes that it consists of a video and an audio track and
>>>>>>> has no other information in it. This is fine in the default case
>>>>>>> and in the default case no extra resource description is probably
>>>>>>> necessary. But when we actually do have a richer source, we need to
>>>>>>> expose that.
>>>>>>>
>>>>>>> This argument leads down a very slippery slope. If it is crucial to
>>>>>>> include caption information in markup for spiders, what about other
>>>>>>> media file metadata that a crawler might want to read - intrinsic
>>>>>>> width and height, duration, encoding format, file size, bit rate,
>>>>>>> frame rate, etc, etc, etc? Robots may prefer to have all of this in
>>>>>>> the page so they don't have to load and parse the file, but I don't
>>>>>>> think it is necessary or appropriate.
>>>>>>
>>>>>> Not quite.
>>>>>>
>>>>>> It is a difference if you are a web crawler that wants to collect
>>>>>> captions or one that wants to collect such file metadata. For file
>>>>>> metadata, you are bound to always be successful when parsing the
>>>>>> header of a binary file. So, I agree there with you.
>>>>>>
>>>>>> But if you are only keen on captions, you are bound to often parse
>>>>>> useless information if you have to download the media file header. A
>>>>>> hint inside the markup that there are captions/subtitles there and
>>>>>> that it is useful to parse the file - and then parse it fully - is
>>>>>> very relevant.
>>>>>
>>>>> Even if all browser vendors should agree that this is useful and
>>>>> implemented the suggested track markup, it will only be used by
>>>>> authors in very rare situations -- when they want to populate the
>>>>> browser's context menu before HAVE_METADATA. As most videos that have
>>>>> multiple audio/video/text tracks won't be marked up as such in HTML,
>>>>> robots will still have to download the headers of all videos to see
>>>>> if they have captions. If they want to index the captions (not just
>>>>> the fact that they exist), they'll also have to download the whole
>>>>> file.
>>>>
>>>> I still believe it's useful to expose the tracks in a media file to
>>>> the browser and to automated tools without having to use JavaScript to
>>>> get to them or having to download the media data and decode the
>>>> headers.
>>>>
>>>> But I don't think any browser vendors will want to implement it at
>>>> this stage, so I concede.
>>>>
>>>> Let's instead focus on getting the JavaScript API right and get to a
>>>> state where we can at least make use of such multitrack media files.
>>>>
>>>> I have put Eric's proposal with some slight changes (replace "type"
>>>> with "role" in the examples, added a "role" attribute, added a "name"
>>>> attribute, added a namedItem accessor):
>>>> http://www.w3.org/WAI/PF/HTML/wiki/Media_MultitrackAPI
>>>>
>>>> I'd say everyone should feel free to edit that page as they see fit,
>>>> but leave a comment on the mailing list as to why the changes were
>>>> necessary.
>>>
>>> Philip: you mentioned
>>> http://www.w3.org/TR/mediaont-api-1.0/#webidl-for-api . Do you think
>>> the track elements should have some of these characteristics, too, and
>>> expose them?
>>
>> I quite like Eric's suggestion of exposing this interface both on the
>> media element and on each track. The interface isn't as good as it
>> could be yet (e.g. throwing NoValue isn't going to fly, just return
>> undefined) but I do think we can reuse this and implement as much of it
>> as possible.
>>
>> I've already sent some feedback to the Media Annotations WG, but a lot
>> more is needed if we want to use this.
>
> Is there a similar API for images that we could compare it with to
> evaluate?

Not that I know of; for images, no metadata at all is exposed except the width and height.

-- 
Philip Jägenstedt
Core Developer
Opera Software
Received on Tuesday, 2 February 2010 16:20:59 UTC