[whatwg] Video proposals

(Oops, this is a re-send of an email I sent only to Ian Hickson.  I keep
pressing the wrong reply button :-(  )

On 3/15/07, Ian Hickson <ian at hixie.ch> wrote:
> In the meantime, here's replies to the comments I got.
Wow.  Nice.


> On Wed, 28 Feb 2007, Anne van Kesteren wrote:
> > Opera has some internal experimental builds with an implementation of a
> > <video> element. The element exposes a simple API (for the moment) much
> > like the Audio() object:
> >
> >   play()
> >   pause()
> >   stop()
> >
> > The idea is that it works like <object> except that it has special
> > <video> semantics much like <img> has image semantics. In markup you
> > could prolly use it as follows:
> >
> >  <figure>
> >    <video src=news-snippet.ogg>
> >      ...
> >    </video>
> >    <legend>HTML5 in BBC News</legend>
> >  </figure>
> >
> > I attached a proposal for the element and as you can see there are still
> > some open issues. The element and its API are of course open for debate.
> > We're not enforcing this upon the world ;-)
>
> I have added such an element and its corresponding API (influenced by the
> other feedback received) to the specification. Thank you for the proposal
> and implementation experience!

What are the events?  I scanned the spec for events:
begin (can this be caught by script? is that what "at the <video>
element" means?)
progress
stalled
stopped
load
abort

Is this accurate?

I noticed the stop() method is used both to stop playback and to abort
any pending download.  Is this a good idea?  Wouldn't it be simpler to
add abort() to explicitly stop the download?

Are timeouts guaranteed to be in sync with the video?  For example:
// assume myvid.position = 0;
myvid.play();
setTimeout(function() { alert(myvid.position); }, 10);

What happens?

I have a hunch most authors will care more about the time remaining
until the end than about the time elapsed since position 0.  I don't
have any evidence for this, and don't know why I think it.  Except for...

> ON PLAYLISTS
>
> On Mon, 30 Oct 2006, Shadow2531 wrote:
> >
> > The handler should also support some type of playlist like
> > <http://www.xspf.org/>.
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > #3: Playlists.  (A single video file just won't cut it.)
>
> These were the only requests for playlists. Could you elaborate on the use
> cases for playlists? What are the needs for playlists?
>
I don't see a need for this.  If other things are reasonable, we can
implement continuous playback or a playlist using something like:

// could make a microformat to describe playlists... perhaps XOXO with an
// extra classname.  playlist could be parsed from the dom, or provided
// via some other mechanism.
var playlist = ['one', 'two', 'three'];
var current = 0;
var mainVideo = document.getElementById('myvideo');
var dummyVideo = document.getElementById('dummyvideo');
window.onload = function() {
  mainVideo.src = playlist[current];
};

// would *REALLY* prefer a finished event or something to tell the
// difference between the user watching the video completely (and it
// having stopped) and script or the user pressing a stop button.
// because otherwise, the code to tell if we've stopped looks something like:
mainVideo.stopEvent = function(callback) {
  var video = this;
  // wait for the remaining playing time (assuming length and position
  // are in milliseconds), then only report if we are still playing.
  setTimeout(function() {
    if(video.state == PLAYING) {
      callback.apply(video);
    }
  }, video.length - video.position);
};
// which is pretty undesirable, as it still doesn't always do what you
// would expect.
// (eg. if the time to finish changes because of a seek, or the video's src
// changes or something...)
// so assuming a finished event existed...
mainVideo.finished = onFinishedMain;

function onFinishedMain() {
  // uh... does the cache kick in, or do we need to clone and delete? :->
  mainVideo.src = dummyVideo.src;

  // uh.. if the src changes, does my event stick, or do I need to reassign it?
  //mainVideo.finished = onFinishedMain;
  current++;
}

function queueNextVideo() {
  if(nearlyComplete(mainVideo)) {
    // have the next one start downloading...
    dummyVideo.src = playlist[current+1];
  }
  if(current < playlist.length) {
    setTimeout(queueNextVideo, mainVideo.length*.20);
  }
}

function nearlyComplete(video) {
  if(video.state == PLAYING) {
    if(video.position/video.length > .80) {
      return true;
    }
  }
  return false;
}

Something like that would be pretty common, I think.  I suspect most
script authors will want to know when a video is nearing completion,
and when the user has finished watching it.  It's also a common
technique for authors to buffer images in the background, and then
swap them in when needed.  (BTW, one thing that is really annoying is
not knowing when resources [like images or scripts] failed to load.)

OHHH I see, there is a "played" range, and you can use /that/ to
tell if the user has seen the whole thing or not...  still seems a bit
tricky to figure it out, if all you want is an event for "we've just
finished watching the whole thing now, thanks!"
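
For what it's worth, here is a rough sketch of what that check might
look like.  I am assuming the played range can be read as a list of
[start, end] pairs in the same units as the video's length; that
representation and the helper name playedEverything are made up for
illustration, and the spec may well expose it differently.

```javascript
// Hypothetical helper: returns true if the played ranges, taken together,
// cover the whole video from 0 to length.
// ASSUMPTION: ranges is an array of [start, end] pairs; this shape is not
// taken from the spec.
function playedEverything(ranges, length) {
  // sort a copy by start time so gaps can be found in a single pass
  var sorted = ranges.slice().sort(function(a, b) { return a[0] - b[0]; });
  var coveredUpTo = 0;
  for (var i = 0; i < sorted.length; i++) {
    if (sorted[i][0] > coveredUpTo) {
      return false; // there is an unplayed gap before this range
    }
    if (sorted[i][1] > coveredUpTo) {
      coveredUpTo = sorted[i][1];
    }
  }
  return coveredUpTo >= length;
}
```

Even in this form it is more work than listening for a single event,
which is really my point.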

What happens to the played range after we use seek()?

>
> ON FEATURES
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > #5: When to pre-fetch and when NOT to pre-fetch videos (and "download"
> > it at the last possible minute).
>
> Could you elaborate on this?

My previous example makes a lame attempt at pre-fetching.  If you
change the src attribute to a previously dereferenced resource, how does
the cache behave?  Will we need to clone and delete nodes instead?


> > The frame capturing would be cool (and useful).
>
> Could you elaborate on the use case for this? Since the author will have
> the complete data on his end, there doesn't seem much use for actual frame
> capture on the client.

The only thing I can think of is the initial screen cap shown (or is
this different?).  Perhaps for the image to show, you can just use an
<img> in the content of the <video>, like:
<video ...>
<img class="screencap" src="screencap" />
</video>

>
> > .loop, .startpos
> > loop = false | true
> > autostart = true | false
> > startpos = 0 | specified pos
>
> Could you elaborate on the use cases for these?
Can't these be done in script?
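
Something like the following sketch is what I have in mind.  It assumes
a video object with a writable position, a play() method, and the
"finished" hook I imagined earlier in this mail; the function name
applyPlaybackOptions and the options object are hypothetical, not
anything in the spec.

```javascript
// Hypothetical helper that emulates loop / autostart / startpos in script.
// ASSUMPTION: video.position is writable, video.play() exists, and assigning
// video.finished registers a callback for the end of playback.
function applyPlaybackOptions(video, options) {
  var startpos = options.startpos || 0;
  video.position = startpos;
  if (options.loop) {
    // when playback finishes, jump back to the start position and replay
    video.finished = function() {
      video.position = startpos;
      video.play();
    };
  }
  if (options.autostart) {
    video.play();
  }
}
```

Usage would be along the lines of
applyPlaybackOptions(mainVideo, { loop: true, startpos: 0 });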

>
> On Thu, 1 Mar 2007, Nicholas Shanks wrote:
> >
> > You may want to consider aspect ratio too:  ratio="preserve" being
> > default, ratio="1.333" could indicate 4:3 or get tricky and accept
> > "16:9" for precision reasons.
>
> Wouldn't we simply always want to use the authored size?
Do videos encode what size they are best displayed in?  I hate
entering height and width for images.

>
> On Thu, 1 Mar 2007, Benjamin Hawkes-Lewis wrote:
> >
> > Interesting. I just wanted to ask for a bit more detail on how this
> > works in practice and what it can be used for. How would this support
> > audio descriptions, captions, and subtitles? e.g. Can the captions be
> > displayed to match user preferences for fonts and so forth and exposed
> > to accessibility frameworks? Might it support any form of hyperfilm
> > (e.g. clicking on something in the film like one can click on parts of a
> > Flickr photograph, changing perspective etc) or is it intended only for
> > traditional linear video? (These capabilities look like potential
> > advantages of SMIL.)
>
> Are you requesting these features? Or just curious as to whether they are
> supported in Opera's implementation?

Timely captions could also probably be implemented in script using
clever timeouts and sniffing for the playback state.  They could
probably be represented declaratively using semantic HTML techniques, as
well.

<div id="captions">
<a href="#myvideoelement">video</a>
<div id="view"></div>
<dl>
<dt>0</dt><dd>Fred: Watch Jane playing the piano...</dd>
<dt>400</dt><dd>*music*</dd>
</dl>
</div>
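
The script side could then be a small lookup, sketched below.  I am
assuming the <dt>/<dd> pairs have already been read into a list of
{time, text} objects sorted by time, in the same units as the video's
position; the name captionAt is made up for illustration.

```javascript
// Hypothetical helper: returns the caption text that applies at the given
// playback position, or '' if no caption has started yet.
// ASSUMPTION: cues is sorted by time and uses the same units as position.
function captionAt(cues, position) {
  var text = '';
  for (var i = 0; i < cues.length; i++) {
    if (cues[i].time > position) {
      break; // this cue and all later ones are still in the future
    }
    text = cues[i].text;
  }
  return text;
}
```

A timer could call this with the video's current position and write the
result into the "view" div.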


Thanks,
Ben

On 3/15/07, Ian Hickson <ian at hixie.ch> wrote:
>
> Wow, what a lot of feedback on video! I've added a <video> element, with
> basic features, but really what we need is feedback from video experts.
>
> In the meantime, here's replies to the comments I got. I haven't quoted
> all the e-mails, since many said the same thing or went in circles (well,
> they did! sorry!), but if I missed anything, let me know, and I'll address
> it separately.
>
>
> ON THE NEED FOR A <video> ELEMENT
>
> On Mon, 30 Oct 2006, Maciej Stachowiak wrote:
> >
> > The main advantages for distinguished elements would be:
> >
> > 1) Better semantics. A search engine indexing documents to find "most
> > popular videos" or the like would be able to see from the source
> > document what is embedded as a video rather than having to guess based
> > on the type or URL an <object> points to. Similarly, screen readers
> > would know that a <video> element might still be partially accessible to
> > a [deaf] user whereas <audio> would not.
> >
> > 2) Potential to define a useful common API for controlling timed media;
> > right now each plugin exposes its own different API if it exposes one at
> > all.
>
> Agreed.
>
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > #2: Video players.  (This would be embedding some kind of video screen
> > in a webpage... possibly with a play/pause button, stop button, etc.)
> >
> > #4:(Static or animated) thumbnails to videos.
>
> I agree that we should provide the building blocks to build video players.
> I don't understand what you mean by #4 above though.
>
>
> On Wed, 28 Feb 2007, Anne van Kesteren wrote:
> >
> > Opera has some internal experimental builds with an implementation of a
> > <video> element. The element exposes a simple API (for the moment) much
> > like the Audio() object:
> >
> >   play()
> >   pause()
> >   stop()
> >
> > The idea is that it works like <object> except that it has special
> > <video> semantics much like <img> has image semantics. In markup you
> > could prolly use it as follows:
> >
> >  <figure>
> >    <video src=news-snippet.ogg>
> >      ...
> >    </video>
> >    <legend>HTML5 in BBC News</legend>
> >  </figure>
> >
> > I attached a proposal for the element and as you can see there are still
> > some open issues. The element and its API are of course open for debate.
> > We're not enforcing this upon the world ;-)
>
> I have added such an element and its corresponding API (influenced by the
> other feedback received) to the specification. Thank you for the proposal
> and implementation experience!
>
>
> On Wed, 28 Feb 2007, James Justin Harrell wrote:
> >
> > Can't such an API be provided for <object> elements that reference video?
>
> Feedback from browser vendors is that overloading <object> is hard. The
> poor state of <object> implementations tends to support this argument.
>
>
> On Sun, 4 Mar 2007, Maik Merten wrote:
> >
> > * Video support in browsers is important IMO. Otherwise the web may more
> > and more slip into dependency on Flash or similar formats ("We have to
> > use Flash anyway for video, so why not make the whole site with
> > Flash?").
>
> Agreed.
>
>
>
> ON FALLBACK
>
> On Mon, 30 Oct 2006, Shadow2531 wrote:
> >
> > I think the <video> element should support fallback content like
> > <object>
>
> Agreed.
>
>
> On Tue, 6 Mar 2007, Elliotte Harold wrote:
> > Maik Merten wrote:
> > >
> > > Well, I guess everybody here will hate me for proposing it... and I
> > > think it's ugly... but well...
> > >
> > > <video>
> > > Perhaps a verbose description of what can be seen here?
> > > <novideo>
> > > D'oh, your browser is outdated... let's embed an <object> here
> > > </novideo>
> > > </video>
> >
> > I don't think we need a novideo element. This would work:
> >
> > <video>
> >   <p>
> >     Complete marked up transcript of the video.
> >   </p>
> > </video>
> >
> > This is much more accessible and great for search engine optimization.
>
> Agreed.
>
>
>
> ON SYNTAX
>
> On Mon, 30 Oct 2006, Shadow2531 wrote:
> >
> > I think *maybe* no attributes should count as params (only
> > param elements).
>
> Well, if we design our own element for a specific purpose (video) then we
> know what the parameters are, so we can use attributes.
>
>
> > In general, make <video> so there's only one way to do something. That
> > way you don't get:
> >
> > <video file="this"></video>
> >
> > on some pages and
> >
> > <video>
> >    <param name="file" value="this">
> > </video>
> >
> > on others.
>
> Agreed.
>
>
> On Wed, 1 Nov 2006, Charles Iliya Krempeaux wrote:
> >
> > Simplifying [the object element's type attribute] to allow type="video"
> > would make life alot easier on web developers IMO.  And alot of times,
> > when I asked web developers to do this, I didn't care what the subtype
> > was... I only cared whether it was a "video" or not.
>
> Wouldn't this be better served by just having specific elements like <img>
> and <video> that mostly ignore MIME types?
>
>
>
> ON PLAYLISTS
>
> On Mon, 30 Oct 2006, Shadow2531 wrote:
> >
> > The handler should also support some type of playlist like
> > <http://www.xspf.org/>.
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > #3: Playlists.  (A single video file just won't cut it.)
>
> These were the only requests for playlists. Could you elaborate on the use
> cases for playlists? What are the needs for playlists?
>
>
>
> ON FEATURES
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > #5: When to pre-fetch and when NOT to pre-fetch videos (and "download"
> > it at the last possible minute).
>
> Could you elaborate on this?
>
>
> > #6: JavaScript API for "playing", etc video.
> > #7: Scrubbing through video
>
> Agreed.
>
>
> > #8: Alternate versions.
>
> Could you elaborate on this?
>
>
> > As I'm going to mention more in my list... I'd recommend that web developers
> > can create their [own] UIs... create their own Video Players.
>
> Agreed.
>
>
> > The frame capturing would be cool (and useful).
>
> Could you elaborate on the use case for this? Since the author will have
> the complete data on his end, there doesn't seem much use for actual frame
> capture on the client.
>
>
> > Also... when implementing UIs, it's useful to have a "toggle()"
> > procedure. Something that makes it "pause" if it is "playing".  And
> > makes it "play" if it is "pausing".  Without this you have to keep track
> > of the state of the player.
>
> Interesting. I'll bear this in mind.
>
>
> On Thu, 1 Mar 2007, Shadow2531 wrote:
> >
> > [long list of desired features]
>
> I took your suggestions into account when designing the API. I got feedback
> from a number of people (including some off-list from people who didn't
> want to express their interest publicly), some of which was contradictory,
> so the proposed API doesn't have everything you asked for. Let me know if
> there's anything that you think is missing that you really wanted.
>
>
> > .loop, .startpos
> > loop = false | true
> > autostart = true | false
> > startpos = 0 | specified pos
>
> Could you elaborate on the use cases for these?
>
>
> On Thu, 1 Mar 2007, Nicholas Shanks wrote:
> >
> > You may want to consider aspect ratio too:  ratio="preserve" being
> > default, ratio="1.333" could indicate 4:3 or get tricky and accept
> > "16:9" for precision reasons.
>
> Wouldn't we simply always want to use the authored size?
>
>
> On Thu, 1 Mar 2007, Benjamin Hawkes-Lewis wrote:
> >
> > Interesting. I just wanted to ask for a bit more detail on how this
> > works in practice and what it can be used for. How would this support
> > audio descriptions, captions, and subtitles? e.g. Can the captions be
> > displayed to match user preferences for fonts and so forth and exposed
> > to accessibility frameworks? Might it support any form of hyperfilm
> > (e.g. clicking on something in the film like one can click on parts of a
> > Flickr photograph, changing perspective etc) or is it intended only for
> > traditional linear video? (These capabilities look like potential
> > advantages of SMIL.)
>
> Are you requesting these features? Or just curious as to whether they are
> supported in Opera's implementation?
>
>
> On Thu, 1 Mar 2007, Benjamin Hawkes-Lewis wrote:
> >
> > Isn't it important that content authors know whether there will or won't
> > be an automatic UI provided, so that end users don't end up being
> > presented with two (possibly conflicting, certainly confusing) UIs?
> > That's why I suggested using an attribute to control For most use-cases,
> > I suspect the minimum functionality would not only be more than enough,
> > but superior than anything the content producer would put together. This
> > would actually make it a lot easier for ordinary HTML authors to put
> > video on the web. If we could mandate captioning and audio description
> > exposure by UAs it would make putting video on the web in an accessible
> > manner much easier too. Which would be great, as it currently seems to
> > be a somewhat complicated task.
>
> Could you elaborate on the captioning aspect?
>
> Regarding the idea of default UI, I agree that it would be useful on the
> long run. The problem is one of feature creep; with just the API we have
> already added a lot, adding a UI on top of that is asking for
> interoperability problems. Baby steps are probably wise here.
>
> I've made the spec allow a UI if it doesn't interfere with an
> author-provided one.
>
>
> On Thu, 1 Mar 2007, Spartanicus wrote:
> >
> > I strongly dislike audio and/or video that automatically downloads and
> > starts playing automatically, so much so that I've disabled media player
> > plugins altogether. Both audio and video files are often considerable in
> > size. I don't want my web browser to start making noise unless I've
> > explicitly chosen to play audio, this should not be the result of simply
> > loading a web page. I'd favour a spec requirement that a UA must offer
> > users a configuration option not to automatically download and start
> > audio and/or video and let the user decide first.
>
> I have made sure the spec handles your use case.
>
>
> On Mon, 5 Mar 2007, Kornel Lesinski wrote:
> >
> > I think it's a good idea to provide API for controlling and monitoring
> > video playback and specifying that it should be possible to overlay HTML
> > elements on top of a movie. Probably one of the reasons for adoption of
> > Flash as a movie container is ability to create custom players, which
> > are consistent with websites' UI/branding, can add advertisements and
> > other features.
>
> Agreed.
>
>
>
> ON THE CODEC
>
> On Mon, 30 Oct 2006, Charles Iliya Krempeaux wrote:
> >
> > One of the biggest problems with video on the web (and probably video on
> > the Internet in general) right now is that there is no universally
> > supported video format.
> >
> > [good reasons why we need one codec]
> >
> > Having said all that, I believe that whatever video format is chosen
> > can NOT be encumbered.  By patents or anything else. [...]
> >
> > Given this, I would suggest Ogg Theora be the natively supported video
> > format common to all browsers.  It's designed from the beginning to be
> > unencumbered.  And implementations for it already exist under licenses
> > that should make everyone happy.
>
> A number of other people said similar things about Ogg Theora.
>
> For now, the spec says that UAs SHOULD support Theora for video and Vorbis
> for audio, and SHOULD support the Ogg container format (it's not a MUST
> because some vendors may have legal reasons why they can't or won't
> support it, and there's no point making them non-conforming when they have
> no choice in the matter).
>
>
> On Thu, 1 Mar 2007, Shadow2531 wrote:
> >
> > I think it'd be cool if the video element *just* supported theora.
>
> Supporting only one encoding is not going to fly: you can't stop browser
> vendors from adding features; and you want to allow the standard to evolve
> over time.
>
>
> On Tue, 31 Oct 2006, Lachlan Hunt wrote:
> >
> > Defining which video format for browsers to support is out of scope of
> > the WHATWG and HTML5.
>
> It doesn't have to be out of scope (HTML5 is assuming CSS and JS, for
> instance).
>
>
> On Fri, 2 Mar 2007, Gervase Markham wrote:
> >
> > I think there's a strong driver for uptake. As I understand it, all
> > these video-sharing sites are paying mountains of cash to
> > Adobe/Macromedia for the backend software licences to support Flash
> > video streaming. If they could have 15 or 20% fewer servers doing that,
> > and stream to Firefox using Theora instead, the cost saving would be an
> > incentive for them to change their site. Particularly if we implemented
> > <video> in a way which gave them all the capabilities the flash player
> > has - e.g. fast forward, rewind, seek etc.
> >
> > Of course, I don't know how those costs compare to the bandwidth bill.
>
> Henri cited my boss earlier in this thread as saying that YouTube uses
> Flash over Ogg Theora primarily due to bandwidth concerns. So...
>
>
> On Fri, 2 Mar 2007, Elliotte Harold wrote:
> >
> > But there's one capability of Flash I don't want to give them: the
> > ability to block users from easily downloading, editing, and reusing the
> > content.
> >
> > You may be right, and I hope you are, but I suspect content hoarding may
> > be important enough to them to justify the extra 15% or 20% cost.
>
> Google Video allows original format download if the uploader enabled it,
> so it seems that this isn't a feature that is necessarily desired.
>
>
> On Fri, 2 Mar 2007, Magnus Gasslander wrote:
> >
> > We need to be 100% sure that the format is patent free (no more GIF).
>
> It is unclear to me how we could do this.
>
>
> On Thu, 1 Mar 2007, Spartanicus wrote:
> >
> > Another current common frustration amongst authors is how to get file
> > based media files to play before they've been fully downloaded. This is
> > currently achieved by using text based redirector files containing the
> > url to the actual media file, but these redirector formats have only
> > been defined for a limited number of media formats. That would suggest
> > that a UA could by default employ progressive downloading.
>
> I have ensured the spec mentions this.
>
>
> On Sun, 4 Mar 2007, Maik Merten wrote:
> >
> > * Browser makers should negotiate on one base format. This format should
> > be free and available on all platforms. I don't say formats that need
> > patent licensing are evil by-itself, but I'm pretty sure Debian and
> > Fedora would have to remove video support from their browsers if that
> > functionality would depend on a format that needs such licensing. To my
> > knowledge only Ogg Vorbis+Theora are performing well enough and are
> > usually accepted to be "safe" and open.
>
> It's not clear they actually are performing well enough. But yes.
>
>
> > * I don't think the spec should require implementations to only support
> > one format. It should require at least one base format (see above) and
> > allow optional formats to keep track of codec development and to keep
> > political minds calm. I doubt Microsoft would ever implement a <video>
> > element if they weren't allowed to support their own formats as well (it
> > may be hard enough for them to support any base format not being theirs
> > anyway).
>
> Agreed.
>
>
>
> ON AUDIO
>
> On Mon, 5 Mar 2007, Elliotte Harold wrote:
> >
> > If we add a video element, should we for the same reasons add an audio
> > element? If not, why not?
> >
> > It seems to me these two cases are similar enough to justify similar
> > treatment. Is there any distinction between the two that would suggest
> > audio is inappropriate while video is appropriate or vice versa?
>
> (Other people made similar comments or mentioned an <audio> element in
> passing.)
>
> We already have an Audio API, I'm not sure it makes much sense to have an
> <audio> element. What's the use case? Audio is either asynchronous or
> orthogonal to the presentation in most media. We need a <video> _element_
> because in visual media, visual content has a place relative to the other
> content; but in aural media, aural content under the control of the page
> itself does not have such placement (if it's music, e.g., it can be played
> in the background, and if it's content, then you would alternate between
> playing it and playing the content of the rest of the page; in neither
> case would you simply treat the content of the media as inserted into the
> playback stream with no ability to pause it independently of the main
> document content).
>
>
>
> ON SMIL
>
> On Tue, 31 Oct 2006, Bjoern Hoehrmann wrote:
> >
> > And there I thought <video> had already been introduced in 1998.
>
> Actually the SMIL <video> element is more akin to the HTML <object>
> element than the proposal here. (SMIL <video> is defined to be
> semantically equivalent to SMIL <ref>.)
>
>
> On Wed, 28 Feb 2007, Bjoern Hoehrmann wrote:
> >
> > May I suggest Opera does not implement features that are incompatible
> > with SMIL, the SMIL implementation in Internet Explorer, and SVG for no
> > extraordinarily good reason?
>
> Could I ask you to reply to the various replies you received in response
> to the above comment? I can't really use your feedback without
> understanding it.
>
>
> On Tue, 6 Mar 2007, Charles McCathieNevile wrote:
> >
> > At which point you start heading back to "object". It seems we should
> > either take the SMIL approach and make special containers for each kind
> > of media (how many kinds? What is a flash video that has interactive
> > bits? Or an SVG that is mostly video with a few interaction choices? Or
> > interactive SVG with some audio?), or fix object...
>
> Actually the SMIL approach only has one kind of object, it just has many
> names. (As far as I can tell, at least.)
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>

Received on Friday, 16 March 2007 00:27:12 UTC