Re: discussion of video transcript - issue 194 from Benjamin Hawkes-Lewis on 2012-05-10 (public-html-a11y@w3.org from May 2012)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Thu, 10 May 2012 20:58:39 +0100
To: John Foliot <john@foliot.ca>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>, Judy Brewer <jbrewer@w3.org>
Message-ID: <CAEhSh3f+BrSdbtRXVgth4oUNp4yRkOB7eJoRMJopmSp+BQcyHg@mail.gmail.com>

On Thu, May 10, 2012 at 5:23 PM, John Foliot <john@foliot.ca> wrote:
> Benjamin Hawkes-Lewis wrote:
>>
>> <transcript>
>> <video id=v1 src=video.mp4></video>
>> <p>This is an on-page transcript.</p>
>> </transcript>
>
> Yech. From an authoring perspective, that is a very ugly and confusing pattern to propose, even if it would technically be viable. It seems to suggest that the video is a child of the Transcript, which is both false and *really* confusing.

Is it really any weirder to make a <video> element descend from a
<transcript> element than to make a form field element descend from a
<label> element? I doubt it.

At any rate, I think it's less confusing for authors to adopt the
<label> association model wholesale than to adopt it only partially,
even if that leads to some weirdness.

In practice, I expect a user agent algorithm that associated a
<transcript> or <label> with the nearest <video> element or form field
would work better than an algorithm that relied only on a @for/@id
pairing.
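
To make the comparison between the two association strategies concrete, here is a minimal Python sketch. It rests on several assumptions that are not in the proposal: "nearest" is read as "first <video> descendant in document order" (mirroring how <label> finds a descendant control), the function names are hypothetical, and the sample markup (including the dangling for="missing" case) is invented for illustration. It is not how any user agent actually implements this.

```python
import xml.etree.ElementTree as ET

# Invented sample markup, modelled on the example quoted above. Attribute
# values are quoted only so that the XML parser accepts them.
DOC = """
<body>
  <transcript>
    <video id="v1" src="video.mp4"></video>
    <p>This is an on-page transcript.</p>
  </transcript>
  <video id="v2" src="other.mp4"></video>
  <transcript for="v2"><a href="other.html">Transcript</a></transcript>
  <transcript for="missing"><video id="v3" src="third.mp4"></video></transcript>
</body>
"""


def find_by_for_id(root, transcript):
    """Strategy A (hypothetical name): follow the @for/@id pairing only."""
    target = transcript.get("for")
    if target is None:
        return None
    for video in root.iter("video"):
        if video.get("id") == target:
            return video
    return None  # @for points at an id that no <video> carries


def find_nearest_descendant(transcript):
    """Strategy B (hypothetical name): first <video> descendant, if any.

    Assumption: 'nearest' means the first descendant in document order.
    """
    return next(transcript.iter("video"), None)


root = ET.fromstring(DOC)
transcripts = root.findall("transcript")

for index, transcript in enumerate(transcripts):
    by_id = find_by_for_id(root, transcript)
    nearest = find_nearest_descendant(transcript)
    print(
        index,
        "for/id:", None if by_id is None else by_id.get("id"),
        "nearest:", None if nearest is None else nearest.get("id"),
    )
```

In this invented sample, each strategy succeeds where the other fails: the wrapping pattern needs no id at all, while the sibling pattern needs a correct @for value. Whether real-world content looks like this is exactly the open question about the corpus.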

> most transcripts will not be "on-page" due to their size and volume of content.

Why should we believe this?

On-page transcripts are not uncommon, bandwidth is getting cheaper,
transcripts tend to be concise, and text compresses well.

Even a movie-length transcript will gzip down to 200K or less (for
example, the script for Star Wars: A New Hope is about 75K). By
comparison, the initial load of a YouTube page and its resources
consumes about 10MB.
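
For anyone who wants to check figures like these against their own transcripts, the following Python sketch measures raw and gzipped sizes. The helper name gzip_size and the placeholder dialogue are assumptions made for illustration; the placeholder is highly repetitive, so its compression ratio says nothing about real transcripts and should not be read as evidence for the numbers above.

```python
import gzip


def gzip_size(text):
    """Return (raw_bytes, gzipped_bytes) for a transcript given as a string.

    Hypothetical helper: a real measurement would pass in the full text of
    an actual transcript rather than the placeholder used below.
    """
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Placeholder text only, NOT a real transcript. Repeating one line makes it
# compress far better than genuine dialogue would.
sample = "SPEAKER ONE: This line stands in for a line of dialogue.\n" * 500

raw_bytes, packed_bytes = gzip_size(sample)
print(f"raw: {raw_bytes} bytes, gzipped: {packed_bytes} bytes")
```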

We don't really need to reach consensus about what the majority case
will be; I think it's fair to say at least some publishers would
prefer to link to transcripts, and at least some publishers would
prefer to inline transcripts.

However, the proposal already supports providing a link via the
<transcript> element.

> They will most likely be external documents in (hopefully) HTML, but also quite possibly PDF, Doc/Open Office, perhaps even Daisy - all viable and possible options today.

I don't think we need to support non-HTML transcripts, but the
proposal already supports linking them via <a> or transcluding them
with <iframe>.

> This seems to insist that the link to the transcript be an on-screen text link, which many of us feel is too simplistic a response.

I don't understand how this is a response to what I was saying. I'm
talking about improving the algorithm for associating a <label> (and a
<transcript> element designed along the same model) with its target,
to make authoring easier and (hopefully) to handle the existing corpus
of content better.

> If you want to take your analogy of <label> further, I ask you - how often do we currently see
> <label class="CSS_off_screen_because_onscreen_is_ugly">?
>
> (I will suggest the answer is Very Often)

It would be easy to use <transcript> as proposed, but have the
transcript hidden in a <details> element until revealed by pressing a
transcript button in the publisher-provided controls.

> My concern is that using such a blunt work-around is hardly elegant

In what sense is the proposal a "work-around"?

> , and once again might suffer from the orphaned link-to-transcript problem when authors cut and paste a <video> all the stuff </video> block - remembering that what *they* want is the video first.

Ordinary users tend to include video in their web content today by
copying and pasting a provided widget from a video-sharing site. It's
really easy for video site publishers to provide video widgets that
preserve transcripts: they just need to provide an iframe to a page
that inlines, transcludes, or links to the transcript.

> My preference then would be that any solution discussed sees the vehicle for including the transcript be included inside the opening and closing <video> 'tags' (which @transcript, <track>, and apparently <transcript> would provide).

That would not pave the cowpath of including either a visible
transcript or a visible control to open the transcript.

It would also have a poorer backwards compatibility story, since
currently browsers hide the content inside <video>.

> From the 'elegance' perspective, my preference would also be for something that could easily be incorporated into both the native as well as author-scripted controls that HTML5 offers. This would consistently follow a precedent already established by media players today, that expose a "CC" button inside of the controls bar whenever Closed Captions are available. While this would by no means restrict other presentation/delivery patterns, I would go so far as to suggest it might be recommended as the default way of doing so.

I have no problem with allowing UAs to expose a "View Transcript"
command on top of the semantics in the proposal. This command could be
exposed as a button and/or a context menu item. We could suggest that
UAs that do implement such UI open <details> or <dialog> elements,
and follow hyperlinks, as appropriate. It would certainly make sense
to expose it as an accessible action in the accessibility API mapping.
I don't quite share the Proposal's scepticism about providing native
visual UI here (for example, I can imagine a UI pulling the transcript
into a full-screen view), but I like the fact that the Proposal does
not depend on such UI.

Probably any proposal that, like this one, provides an unambiguous
association between a transcript and its video could be worked into
both native and scripted controls.

--
Benjamin Hawkes-Lewis
Received on Thursday, 10 May 2012 19:59:29 UTC