Re: timing model of the media resource in HTML5

Thought I should share the first feedback that I got:

A first comment I got from actual HTML5 video element implementers at
two browser vendors: browsers aren't designed to retrieve and
synchronise multiple separate audio and video files.

For example: dealing with network latency, buffering, servers going
offline, and seeking; cross-domain security issues; specifying
behaviour when resources aren't available or don't have the same
duration; events for things like stalling on individual files; etc.
These are all regarded as too hard to solve with current browser
technology.

So, it seems to me that for this version of HTML we have to restrict
ourselves to multi-track audio-visual resources that are provided in a
single file. Otherwise we may end up with a specification that nobody
will implement, and nobody wins with that.


A second comment was that the <source> elements are currently regarded
as mutually exclusive alternatives, so the approach of treating them
as tracks that complement each other won't work. I might need to write
a new specification where the manifest sits inside the <source>
elements.
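
To make that concrete, here is a purely hypothetical sketch of what
markup with the manifest inside the <source> elements could look like:
the <source> elements stay mutually exclusive alternatives, and each
one carries its own list of companion tracks as children. Note that
<source> is currently an empty element, so this would require a
parsing change, and the <track> child element with its attributes is
my invention for illustration only:

  <video>
    <source src='video.ogv' type='video/ogg'>
      <!-- hypothetical manifest for the Ogg alternative -->
      <track src='video.ogv?track=auddesc[en]' type='audio/ogg'
             lang='en' role='auddesc'>
      <track src='caption_ja.ttaf' type='application/ttaf+xml'
             lang='ja' role='caption'>
    </source>
    <source src='video.mp4' type='video/mp4'>
      <!-- equivalent manifest for the MP4 alternative -->
    </source>
  </video>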


Cheers,
Silvia.

On Wed, Nov 25, 2009 at 12:31 PM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:
> Hi all,
>
> Just to follow up on this already very intensive reading material, I
> have now written a more concrete post that takes the ideas from the
> first post and applies them to the video element.
>
> See http://blog.gingertech.net/2009/11/25/manifests-exposing-structure-of-a-composite-media-resource/
> .
>
> It is a long post, so you will need to have some patience, but I would
> appreciate feedback.
>
> The idea this time is to extend the usefulness of the existing
> "source" elements, rather than introducing new elements such as
> "itext", to provide the functionality of multi-track media resources
> (even if they are virtual media resources in the sense defined in the
> first blog post). An example would be:
>
>  <video>
>    <source src='video.ogv' type='video/ogg' media='desktop' lang='en'
>                     role='media' >
>    <source src='video.ogv?track=auddesc[en]' type='audio/ogg' lang='en'
>                     role='auddesc' >
>    <source src='audiodesc_de.oga' type='audio/ogg' lang='de'
>                     role='auddesc' >
>    <source src='video.mp4?track=caption[en]' type='application/ttaf+xml'
>                     lang='en' role='caption' >
>    <source src='video.ogv?track=caption[de]'
>                     type='text/srt; charset="ISO-8859-1"'
>                     lang='de' role='caption' >
>    <source src='caption_ja.ttaf' type='application/ttaf+xml' lang='ja'
>                     role='caption' >
>    <source src='signvid_ase.ogv' type='video/ogg; codecs="theora"'
>                     media='desktop' lang='ase' role='sign' >
>    <source src='signvid_gsg.ogv' type='video/ogg; codecs="theora"'
>                     media='desktop' lang='gsg' role='sign' >
>    <source src='signvid_sfs.ogv' type='video/ogg; codecs="theora"'
>                     media='desktop' lang='sfs' role='sign' >
>  </video>
>
> which is a composite virtual media resource with two audio description
> tracks, three caption tracks and three sign language video tracks.
>
> The new post raises some issues with that approach, and I am looking
> for feedback on how to potentially solve them.
>
> Best Regards,
> Silvia.
>
>
> On Mon, Nov 23, 2009 at 1:02 PM, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com> wrote:
>> Hi all,
>>
>> I'd like to start discussions about accessibility in media elements
>> for HTML5 by going all the way back and answering the fundamental
>> question that Dick Bulterman posed at the recent (well, not so recent
>> any more) Video Accessibility workshop. He stated that HTML5 doesn't
>> have a timing model for its media elements, and that we need to have
>> a discussion about that timing model.
>>
>> To start off this discussion, I have written a blog post that explains
>> where I think things are at. It has turned out to be a rather long
>> blog post, so I'd rather not copy and paste it into the discussion
>> here. You can read it at
>> http://blog.gingertech.net/2009/11/23/model-of-a-time-linear-media-resource/
>> .
>>
>> If you disagree/agree/want to discuss any of the things I stated
>> there, please copy the relevant paragraph and quote it into this
>> thread, so we can all know what we are discussing. (I guess Google
>> Wave would come in handy here...)
>>
>> As a three-sentence summary:
>> Basically, I believe that the 90% use case for the Web is that of a
>> time-linear media resource. Any more complex needs that require
>> multiple timelines can be realised using JavaScript and the APIs to
>> audio and video that we still need to define, which will expose
>> companion tracks to the Web page and therefore to JavaScript. I
>> don't believe there will be many use cases that such a combination
>> cannot satisfy, but if there are, one can always fall back to the
>> "object" tag and external plugins to render an Adobe Flash,
>> Silverlight or SMIL experience.
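>>
>> To illustrate the kind of JavaScript synchronisation I have in mind,
>> here is a minimal sketch that keeps a companion audio description
>> roughly in line with a video, using only what the current HTML5
>> media API already offers (the element IDs, file names and the 0.3
>> second drift threshold are invented for the example):
>>
>>   <video id="v" src="video.ogv" controls></video>
>>   <audio id="ad" src="audiodesc_en.oga"></audio>
>>   <script>
>>     var v  = document.getElementById("v");
>>     var ad = document.getElementById("ad");
>>     // mirror the play/pause state on the companion track
>>     v.addEventListener("play",  function() { ad.play();  }, false);
>>     v.addEventListener("pause", function() { ad.pause(); }, false);
>>     // re-align whenever the tracks drift apart by more than 0.3s
>>     v.addEventListener("timeupdate", function() {
>>       if (Math.abs(ad.currentTime - v.currentTime) > 0.3)
>>         ad.currentTime = v.currentTime;
>>     }, false);
>>   </script>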
>>
>> BTW: talking about SMIL - I would be very curious to find out
>> whether somebody has tried implementing SMIL in HTML5 and JavaScript
>> yet. I think much of what a SMIL file defines should now be
>> presentable in a Web browser using existing HTML5 and JavaScript
>> constructs. It would be an interesting exercise and I'd be curious
>> to hear if somebody has tried it and where they found limitations.
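>>
>> As a small taste of what such an exercise could look like, a SMIL
>> <seq> of two clips can be approximated with one video element and
>> the "ended" event; a rough sketch (the file names are invented):
>>
>>   <video id="player" src="clip1.ogv" autoplay></video>
>>   <script>
>>     // emulate SMIL <seq>: when clip1 ends, load and play clip2
>>     var player = document.getElementById("player");
>>     player.addEventListener("ended", function() {
>>       player.src = "clip2.ogv";  // setting src triggers a new load
>>       player.play();
>>     }, false);
>>   </script>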
>>
>> Best Regards,
>> Silvia.
>>
>

Received on Wednesday, 25 November 2009 02:53:19 UTC