Re: Absolute region positioning (was Re: Alternative approach to scrolling, with demos) from Christian Vogler on 2014-05-05 (public-texttracks@w3.org from May 2014)

From: Christian Vogler <christian.vogler@gallaudet.edu>
Date: Mon, 5 May 2014 09:17:26 -0400
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: Philip Jägenstedt <philipj@opera.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>
Message-ID: <CAHVQVp3bHE2q8+5c2ztpZ-OijLTacjxUOxnHr=5pHRmdfTac4w@mail.gmail.com>

Just quickly jumping in from the standpoint of someone who depends on
captions for accessing videos: I think there is some leeway in what can be
considered identical to TV captions - the big needs are positioning for
speaker identification, positioning to avoid obscuring important portions
of the image, colors, fonts, etc. The operative wording in the FCC rules is
actually that Internet captions have to be of at least the same quality as
on TV, but it is perfectly fine if some improvement in WebVTT actually
results in better-than-TV captioning. So, I'd not try to look at
pixel-perfect identity, but rather at intent.

I have not been following this discussion very closely, because of
end-of-the-semester craziness, and so I am not fully up to speed, but what
strikes me about this thread is the same concern that Silvia just
expressed: that we may be over-engineering. I understand that this
discussion came about because we want to unify the rendering paths, but
that the proposed approach may have some ramifications for rendering
CEA608/708 cues?

I wonder if it would be helpful to work through the problem cases that are
driving this discussion and look at them with a fresh eye, to determine if
they are really a problem per se, or if these are things that no end user
will care about.

I'll also add that one thing I do care about is semantics. I.e., if a
particular scrolling area or caption color identifies a particular speaker,
an author should be able to make that kind of association in the WebVTT
file, and it should not be merely an artifact of the rendering.

Best wishes
Christian


On Mon, May 5, 2014 at 8:50 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> On Mon, May 5, 2014 at 10:16 PM, Philip Jägenstedt <philipj@opera.com>
> wrote:
> > On Mon, May 5, 2014 at 1:38 PM, Silvia Pfeiffer
> > <silviapfeiffer1@gmail.com> wrote:
> >> On Mon, May 5, 2014 at 9:17 PM, Philip Jägenstedt <philipj@opera.com> wrote:
> >>> On Tue, Apr 8, 2014 at 9:21 AM, Silvia Pfeiffer
> >>> <silviapfeiffer1@gmail.com> wrote:
> >>>> On Sun, Mar 30, 2014 at 4:41 PM, Philip Jägenstedt <philipj@opera.com> wrote:
> >>>>>
> >>>>> == Absolute positioning and scrolling ==
> >>>>>
> >>>>> Demo: http://people.opera.com/philipj/2014/03/vttscroll/absolute.html
> >>>>>
> >>>>> Finally, an idea for how scrolling might work with absolutely
> >>>>> positioned cues. You simply position all the cues at the point where
> >>>>> scrolling should begin, and they'll scroll up from there.
> >>>>
> >>>> I've attached a vtt file for you with two "regions". The cues should
> >>>> each get limited to their "region".
> >>>>
> >>>> You can see for yourself how this breaks down with 2 regions present.
> >>>>
> >>>> I think it's quite clear that the concept of cue groups codified in
> >>>> regions is a more appropriate approach than building ad-hoc groups of
> >>>> captions by trying to group them based on them overlapping each other.
> >>>
> >>> For simplicity in the demo, I scrolled all the cues by the same amount
> >>> without checking if they actually overlap, which of course falls apart
> >>> when you have two groups of cues. If implemented properly, I don't see
> >>> that cues scrolling in different parts of the video would be a
> >>> problem. If scrolling is implemented as overlap avoidance, the situation
> >>> has to be handled anyway.
> >>>
> >>> How are overlapping regions handled? Not at all AFAICT, some text
> >>> would simply be obscured.
> >>
> >> Indeed, we're not currently handling overlapping regions. I don't
> >> think we need to either, because not even CEA708 has an overlap
> >> avoidance algorithm. Instead, it specifies z-index (called "priority")
> >> so the author can determine which cue sits in front if overlap
> >> happens.
> >>
> >> I don't mind that actually, because it provides an alternative to the
> >> default cues which do overlap avoidance.
> >
> > I do mind. Why should we accept this deficiency just because CEA708
> > has it? Unless the video is entirely filled with text (unlikely) there
> > will be a way to show all of it without overlap, so why not deal with
> > the situation since WebVTT already has overlap avoidance?
>
> I don't see it as a deficiency. I actually think it's a strength
> knowing exactly where one's content ends up being rendered on screen
> without the browser interfering with positioning. What if I'd like to
> author my cues such that they overlap? Maybe it's because it's the
> lesser evil than overlapping other content on the screen? I really
> think we may be over-engineering the overlap avoidance issues if we're
> trying to avoid overlap at all cost.
>
> Cheers,
> Silvia.
>
> NB: Incidentally, following CEA708 rendering as exactly as possible is
> in line with what the US law prescribes, namely identical rendering
> between TV and Online. Though, to be honest, there is a line of "rough
> identity" that I am prepared to accept and I would call overlap
> avoidance an optimisation of sorts, which is why I'm not really
> accepting this as an argument.
>
>


-- 
Christian Vogler, PhD
Director, Technology Access Program
Department of Communication Studies
SLCC 1116
Gallaudet University
http://tap.gallaudet.edu/
VP: 202-250-2795
Received on Monday, 5 May 2014 13:17:50 UTC