Re: @longdesc scope (was: HTML Media Transcript, Issue-194: Are we done?) from Leif Halvard Silli on 2012-07-10 (public-html-a11y@w3.org from July 2012)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Tue, 10 Jul 2012 14:17:45 +0200
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: John Foliot <john@foliot.ca>, David Singer <singer@apple.com>, Chaals McCathieNevile <w3b@chaals.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-ID: <20120710141745963149.8332a2a9@xn--mlform-iua.no>

Silvia Pfeiffer, Tue, 10 Jul 2012 09:39:49 +0200:

> I've tried to argue this line of thought before, too.
> 
> I've come to the conclusion that a transcript is indeed one type of
> long description

+1

> and likely sufficient as a text replacement for
> video. However, there are other types of long descriptions that people
> may also want to publish and the @longdesc (@aria-describedat)
> attribute is more appropriate for those. I personally don't think
> those other types of long descriptions are necessary, since if you
> have a full-text transcript (or better even a collated transcript [1])
> you get all the information that you need - and summaries are usually
> published somewhere else on the page, such as in a description
> section, so @aria-describedby is more appropriate there. But I've come
> to accept that there may not always be a transcript and such other
> type of long description may be easier to author and publish then.

> [1] http://www.w3.org/TR/WCAG10-CORE-TECHS/#collated-transcripts

As I just said to Laura,[1] that section of the WCAG 1.0 techniques 
seemingly operates with a 'transcripts from sounds only' concept. 
However, the preceding section about 'Visual information and motion' 
makes it clear that a collated transcript may include transcriptions 
from an auditory description track. It goes on to say that "Auditory 
descriptions are used primarily by people who are blind". However, for 
someone reading a transcript, it usually doesn't matter who it was made 
for, as one usually does not consume the transcript in parallel with 
the running video, but as an independent piece of art.

It seems that WCAG 1.0 takes an ideal approach: The important visual 
things are represented as audio too, and then all the relevant tracks - 
the main sound track and the describing auditory track - are joined 
into a collated transcript.

I suppose it takes this ideal approach in order to emphasize that, for 
the blind, there should be an auditory description track. And I agree 
that this is ideal and practical when there is such a descriptive 
auditory track.

However, the important point about such a transcript cannot be that 
it is, descriptions included, a transcript in the "how it was made" 
sense. Thus, if there is no - and will not be - an auditory 
descriptive track, then I see only benefit in adding descriptions 
directly to the transcript.

This might, however, mean that you cannot hand over 100% of the 
transcription job to a transcription office, as this would require the 
office to consume the video as artwork in order to clarify which visual 
events need to be added to the transcript.

However, I think the emphasis on transcripts being transcripts, also 
in the how-it-was-made sense, probably has two good effects:
 1) It emphasizes the need for accessible audio track(s);
 2) It removes "creativity" from the process: Almost only the
    things that fit within the video go into the transcript.
A little bit of this attitude could probably be added with success to 
the process of creating textual alternatives for images too.

[1] 
http://www.w3.org/mid/20120710113121173690.938ca0f1@xn--mlform-iua.no
-- 
Leif H Silli

Received on Tuesday, 10 July 2012 12:18:22 UTC