Re: Roll-up captions in WebVTT

On Tue, Apr 10, 2012 at 3:22 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> > and credits seem like they would be handled differently than
> > roll-ups.
>
> I don't see why credits would be different. They are scrolling text
> over a certain timeline.
>

Scrolling credits should be represented with presentational markup, whereas
roll-ups should be handled with semantic markup.

That is, credits should probably be a cue that says "this cue appears below
the screen, then scrolls above the top of the screen over the course of 5
seconds", which is presentational.  Roll-ups should happen by cues
indicating which side of the screen they're on, and the UA figuring out the
rest.
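
A rough sketch of the semantic approach, assuming each cue carries only a
"top"/"bottom" hint and the UA decides how to present it.  The Cue fields and
the visible_lines helper are hypothetical and are not taken from any WebVTT
draft; letting roll-up mode ignore cue end times is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    side: str     # hypothetical semantic hint: "top" or "bottom"
    text: str


def visible_lines(cues, now, side, max_lines=3, roll_up=True):
    """Return the lines a UA might show on one side of the screen at `now`."""
    started = [c for c in cues if c.side == side and c.start <= now]
    started.sort(key=lambda c: c.start)
    if roll_up:
        # Roll-up (assumed behaviour): keep the most recent lines that have
        # started; older ones scroll off regardless of their end time.
        return [c.text for c in started[-max_lines:]]
    # Pop-on: show only the cues whose own time range covers `now`.
    return [c.text for c in started if now < c.end]
```

The same cue data feeds both modes, so a user preference against roll-ups only
changes the roll_up argument and never invalidates the markup.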

> > Karaoke is a whole beast by itself (per-word timing for
> > highlighting fragments as the song is sung)
>
> Per-word timing is already solved with the timestamps <00:00:00.000>,
> so you can already do Karaoke nicely with the kind=subtitle, with
> these timestamps, and with the ::before and ::after CSS
> pseudo-selectors. It's therefore just a question whether you want text
> lines scrolling or whether you want them fixed.
>

Karaoke is often much more complex than that; for example, often words are
highlighted smoothly, with slower words taking longer to become fully
highlighted.  (I'm not a fan of this sort of thing--it's just
distracting--but it's common.)  I wouldn't go near that sort of stuff.
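
To illustrate the gap, here is a minimal sketch of smooth highlighting built
on top of inline cue timestamps.  The timestamps only say when each word
starts; the linear interpolation within a word is an assumption added for
illustration, and split_timed_words and highlight_fraction are hypothetical
names rather than part of WebVTT.

```python
import re

# Inline cue timestamp such as <00:00:01.500> or <01:500> style mm:ss.ttt.
_TS = re.compile(r"<(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})>")


def _seconds(match):
    hours, minutes, seconds, millis = match.groups()
    return (int(hours or 0) * 3600 + int(minutes) * 60
            + int(seconds) + int(millis) / 1000)


def split_timed_words(cue_start, cue_end, payload):
    """Split a cue payload into (start, end, text) segments at timestamps."""
    segments = []
    pos, start = 0, cue_start
    for match in _TS.finditer(payload):
        text = payload[pos:match.start()].strip()
        stamp = _seconds(match)
        if text:
            segments.append((start, stamp, text))
        pos, start = match.end(), stamp
    tail = payload[pos:].strip()
    if tail:
        segments.append((start, cue_end, tail))
    return segments


def highlight_fraction(segment, now):
    """Fraction (0.0 to 1.0) of a segment highlighted at `now` (assumed linear)."""
    start, end, _ = segment
    if end <= start:
        return 1.0 if now >= start else 0.0
    return min(1.0, max(0.0, (now - start) / (end - start)))
```

A slower word simply has a longer segment, so its highlight fills more slowly;
anything beyond that (easing, per-syllable timing) would need data the
timestamps do not carry.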


> > it's an interesting case, at
> > least.  FWIW, I agree with Ian that trying to "faithfully represent"
> > legacy content doesn't seem worthwhile in and of itself, even if that
> > happens to be inconvenient to people stuck in contracts.
>
> A best effort should be made where all the features are at least
> possible, even if not 100% identical. At the moment even a
> near-faithful representation requires copying of text or a special
> separate JS rendering approach.
>

Only if the features are legitimately useful.  If a feature is a bad idea,
then "but TV captions do it, so we need to too" is a poor case.

> > But regarding "in the future"--we can always add new features in the
> > future.
>
> With that statement you have just excluded YouTube from moving to
> using WebVTT for HTML5 captions. And even though YouTube should not be
> the only use case that we regard, it certainly is the biggest caption
> user online, so excluding them seems counterproductive.
>

If YouTube wants a feature, they should come on the list and talk about
their use cases.  Web standards aren't developed based on "(some big
website) is threatening to not use our format unless we give them this".

> Interesting. We should explore that further.
> Note that in CEA 708 we have 9 actual locations for rendering captions
> on video. Might be that 4 are sufficient for grouping. What about the
> center?
>

I'd leave it out unless there are use cases for it, since it can always be
added later.

> > I don't find explicitly grouping cues together objectionable.  What I
> > don't like is the idea of markup that says "these cues should be
> > rendered as roll-up captions".
>
> I'm providing semantic markup that opens up the possibility to render
> in a roll-up way, not a prescription to rendering them as roll-up.


Several proposals do prescribe it, e.g. "Cue classes", "Cue setting: Explicit
Movement Indications", and "Cue settings: introduce transitions".  Those are the
ones I'm strongly against (at least for the roll-up use case), because
they're much more likely to fail in different rendering conditions, cause
breakage if they're ignored due to user preferences, and they don't work
consistently (if you *do* want roll-ups, you always want them, not just
when the author decided to make it possible).  The more semantic
approaches, like marking captions with "top" or "bottom", will be much more
robust.

-- 
Glenn Maynard

Received on Tuesday, 10 April 2012 22:20:09 UTC