Re: Support for advanced caption features (inc rollup) from Silvia Pfeiffer on 2012-12-11 (public-texttracks@w3.org from December 2012)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 11 Dec 2012 17:47:59 +1100
To: Ian Hickson <ian@hixie.ch>
Cc: public-texttracks@w3.org, Loretta Guarino Reid <lorettaguarino@google.com>
Message-ID: <CAHp8n2my+8FnTh4j7WFD-CSK_xea+wUrw6-MOnHg4Dw55CxOSw@mail.gmail.com>
Hi Ian, all,

I continue to seek feedback from browser vendors about supporting
CEA608/708 features, for example through the "Region" proposal.

However, in this email, I wanted to reply to a few statements made in this
thread that may influence decisions.


On Wed, Dec 5, 2012 at 9:23 AM, Ian Hickson <ian@hixie.ch> wrote:

> On Wed, 5 Dec 2012, Silvia Pfeiffer wrote:
> >
> > It is not blindly accepted, but provided by somebody who has been
> > involved in formulating the CVAA. Please go and check with the YouTube
> > lawyers if you do not take my word on this.
>
> Can you at least cite what paragraph of the CVAA you think requires what
> you describe?
>

I am not a lawyer and what I am saying now is not authorized by YouTube or
Google, so do not take my word for it but rather check with a real lawyer.

The key section of interest of the CVAA seems to be the following, FCC
79.103:
http://www.hallikainen.com/FccRules/2012/79/102/index.php

It talks about "all digital apparatus for video display" requiring caption
support, which includes set-top boxes, tablets, or other devices that use
browser rendering engines to display video. Particularly relevant is
section (c) that lists the technical capabilities that are expected to be
supported. This explicitly states pop-on, roll-up and paint-on captions, as
well as caption window colors (in contrast to caption text and caption text
background colors) and a bunch of other features that are all basically
taken from CEA708.
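
As a rough illustration of how the three presentation modes differ, here is a minimal Python sketch. It is not taken from the FCC text or from CEA708; the names RollUpWindow, pop_on and paint_on are hypothetical, and the row count is an arbitrary assumption.

```python
from collections import deque


class RollUpWindow:
    """Hypothetical model of a roll-up caption window with a fixed row count."""

    def __init__(self, rows: int = 3) -> None:
        # A deque with maxlen drops the oldest row when a new one arrives,
        # which mimics the top line scrolling out of the window.
        self.lines = deque(maxlen=rows)

    def add_line(self, text: str) -> list[str]:
        """Append a new bottom row and return the rows currently visible."""
        self.lines.append(text)
        return list(self.lines)


def pop_on(text: str) -> list[str]:
    """Pop-on: the whole caption appears at once (a single display state)."""
    return [text]


def paint_on(text: str) -> list[str]:
    """Paint-on: the caption is revealed word by word as it arrives."""
    words = text.split()
    return [" ".join(words[: i + 1]) for i in range(len(words))]
```

Each function returns the successive display states a viewer would see, so the difference between the modes shows up as a difference in the returned lists.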


> I'm not objecting to roll-up, I'm objecting to browsers (and
> especially WebVTT) supporting features for faithful representation of
> other caption formats, especially caption formats that have
> constraints that do not apply on the Web.
>

The request to have captions from TV be represented faithfully online is
not only something that the CVAA has put in legalese. It has also been a
requirement I have heard from many caption users and has been argued for on
this list and on bugs before.

Here's a quote from a deaf Googler:
".. as a deaf consumer, I consider it mandatory and non-negotiable that I
be able to view content online in the same way that it's currently
presented to me on broadcast TV."

This explicitly includes the 608/708 features such as rollup and window
background colors.

This user need has been rejected several times here as a use case, so
unfortunately we now have to use US law as an argument rather than user
requirements.



> > Most of those features have been developed as a reaction to user
> > requirements, which are no different on the Web than they are on TV.
>
> The requirements are similar (though not even remotely identical, for
> example there's no way to drag and drop cues on a TV),


Indeed - captions on the Web should allow for more features than on TV,
because we have a highly interactive medium. However, right now captions on
TV allow more powerful display features than WebVTT.

For example, the FCC explicitly talks about the concept of a "caption
window" whose background color can be changed and made opaque as opposed to
just the background on the caption text. This is a concept that WebVTT does
not support right now and that the "Region" spec introduces.
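
To make the distinction between the two backgrounds concrete, here is a small Python sketch. It is only an assumed data model for illustration: the class CaptionWindow and its field names are hypothetical and are not the syntax of the "Region" proposal, of WebVTT, or of CEA708.

```python
from dataclasses import dataclass


@dataclass
class CaptionWindow:
    """Hypothetical caption window; field names are illustrative only."""

    window_background: str        # colour of the whole window box
    text_background: str          # colour drawn directly behind the text
    window_opacity: float = 1.0   # 1.0 is fully opaque, 0.0 fully transparent

    def background_at(self, covered_by_text: bool) -> str:
        # The text background paints over the window background where there
        # is text; elsewhere in the window only the window background shows.
        return self.text_background if covered_by_text else self.window_background
```

With only a text background, the area of the window not covered by text has no colour of its own; the separate window background is what fills that area.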


> but the constraints
> are vastly different (for example, live TV historically couldn't have a
> two-second delay loop and users care more about syncing captions to the
> picture than readable captions, but on the Internet a two-second delay is
> a non-issue even for live streams and so we can get readable captions and
> still get the sync right).
>

I will address live captioning - in particular captions for live video
conferencing - in a different thread because it requires further new
features that we've not considered or discussed before, so it's a less
mature request. Thus, I want a separate discussion for it.


> > In fact, the only feature that is TV-specific is to avoid an overscan
> > area and we already support this in the spec:
> > https://www.w3.org/Bugs/Public/show_bug.cgi?id=16864 .
>
> That is by far not the only TV-specific feature of TV captions.
>
> The moving captions is one example (as discussed above); others include
> monospaced captions, having to fit captions into limited bandwidth, the
> lack of user input mechanisms, having to put the captions on the video,
> etc etc etc.
>

You may not be aware, but CEA708 includes 8 different fonts, some of which
are not monospaced. Also the user has a means (even if limited) of changing
fonts, colors and transparency, so there is an input mechanism.

Given that TV captions have to deal with many more technical limitations
than captions on the Web, it is all the more surprising that they still
allow encoding of more caption features than WebVTT.


> > I could register bugs on the other issues (such as a separate styling of
> > text background to cue background; moving captions in the UA), but as
> > long as the big things are not solved (such as rollup; fixed
> > positioning), it seems moot to start worrying about the smaller issues
> > that fall out as a result of solving the big issues.
>
> If you want use cases considered, file bugs.
>

The mailing list is also an acceptable place to discuss use cases, in
particular since every bug that has been opened on roll-up captions was
rejected.

We've had these discussions on list and carried out extensive analysis over the
last year [1] [2] [3]. I picked one of the proposed solutions [4] and wrote
a conversion document for CEA608/708 features [5]. Since a solution in
WebVTT was continuously rejected, I designed a solution that is an optional
extension to existing features and yet simple to use. I made a JavaScript
implementation to demonstrate its use and put it forward on this list for
discussion with other browser vendors.

This thread here and now is a discussion of use cases plus a demonstration
of the possibilities when these use cases are addressed. I would prefer if
you picked it up, went over the spec and the use cases with a critical eye,
and added the requested use cases into WebVTT. I don't actually care if it
ends up being called "Region" or something else - we just need support for
these features.

Since you have already rejected the need for these features, I am asking
for input from other browser vendors if they want support for these
features, too. Maybe if their feedback is positive, it will help convince
you that these requirements are necessary. I'd much prefer if the spec was
developed consistently in your style.

Best Regards,
Silvia.

[1] http://www.w3.org/community/texttracks/wiki/Caption_Model
[2] http://www.w3.org/community/texttracks/wiki/RollupCaptions
[3] http://www.w3.org/community/texttracks/wiki/708Features
[4] http://www.w3.org/community/texttracks/wiki/MultiCueBox
[5] http://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/608toVTT.html
Received on Tuesday, 11 December 2012 06:48:53 UTC