Re: CP, ISSUE-30: Link longdesc to role of img [Was: hypothetical question on longdesc] from Dave Singer on 2012-03-22 (public-html-a11y@w3.org from March 2012)

From: Dave Singer <singer@apple.com>
Date: Thu, 22 Mar 2012 08:17:58 -0700
To: Janina Sajka <janina@rednote.net>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, John Foliot <john@foliot.ca>, Sean Hayes <Sean.Hayes@microsoft.com>, "'"'xn--mlform-iua@målform.no'"'" <xn--mlform-iua@xn--mlform-iua.no>, rubys@intertwingly.net, laura.lee.carlson@gmail.com, mjs@apple.com, Paul Cotton <Paul.Cotton@microsoft.com>, public-html-a11y@w3.org, public-html@w3.org
Message-id: <9DE2779D-B1C8-43E5-8E00-72B21860BBC7@apple.com>

On Mar 22, 2012, at 7:07 , Janina Sajka wrote:

> Silvia Pfeiffer writes:
>> On Thu, Mar 22, 2012 at 6:05 PM, John Foliot <john@foliot.ca> wrote:
>>> Silvia Pfeiffer wrote:
>>>> 
>>>> <track> is a timed resource. Neither transcript, nor description, nor
>>>> posterdescription are timed - they cannot be parsed into cues and
>>>> displayed time-synchronously over the video. You cannot misuse the
>>>> track element in this way.
>>> 
>>> Hmmm... I don’t recall seeing anywhere where it states that @kind="metadata"
>>> was required to be a timed resource - is that specifically stated somewhere?
>> 

I think that it was intended to be clear that a 'track' is a timed resource, and if that's not explicitly stated, it should be.

>> The whole concept of text tracks is built around timed cues.
>> 
> In retrospect perhaps calling them "text" tracks is an unfortunate
> misnomer. "Timed" tracks might have been better, less readily
> misunderstood.

I think "timed" tracks is a tautology; tracks are timed.

The point of the word "text" there is not that the content of the track is encoded as text, but that the presentation of the timed material is textual. Tracks can have (at least) 3 'types', but they are always timed:
* their encoding type; for example, for video this is the codec type;
* their presentation type: video, audio, text; 'meta-data' is a presentation type that says the data is not presented directly, but is suitably interpreted by the UA;
* their functional type, what we call "kind": what role they play (primary video, sign-language video, and so on).

To a very large extent I think these are orthogonal. For example, I can remember ASCII art; one could imagine ASCII video (a succession of ASCII art images), which would be of presentation-type video but encoded as text.
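
As a rough illustration of that orthogonality, here is a minimal sketch in Python. It is not taken from any spec: the names Track and Presentation, and the encoding and kind strings, are all hypothetical labels chosen for this example. It only shows that the three 'types' can vary independently of one another.

```python
from dataclasses import dataclass
from enum import Enum


class Presentation(Enum):
    """Presentation type: how the timed material reaches the user."""
    VIDEO = "video"
    AUDIO = "audio"
    TEXT = "text"
    METADATA = "meta-data"  # not presented directly; interpreted by the UA


@dataclass(frozen=True)
class Track:
    """A timed track described along three independent axes (hypothetical model)."""
    encoding: str               # encoding type, e.g. a codec name (labels here are made up)
    presentation: Presentation  # presentation type
    kind: str                   # functional type: the role the track plays


# "ASCII video": encoded as text, but presented as video.
ascii_video = Track(encoding="ascii-text", presentation=Presentation.VIDEO,
                    kind="primary video")

# Captions: also encoded as text, and presented as text.
captions = Track(encoding="ascii-text", presentation=Presentation.TEXT,
                 kind="captions")

# Sign language today: an ordinary video of someone signing.
signing = Track(encoding="some-video-codec", presentation=Presentation.VIDEO,
                kind="sign-language video")
```

Here ascii_video and captions share an encoding but differ in presentation, while ascii_video and signing share a presentation but differ in both encoding and kind.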

> I suppose the term "text" tracks is a holdover from the early days when
> some of the HTML 5 people hadn't yet grokked the extensive range of
> alternative media required to support accessibility, i.e. a "sign
> language translation" is not a "text" track, though a caption certainly
> is.

I would expect a sign-language track to be a video track. I am not aware of a system for encoding sign language directly, so at the moment (e.g. the BBC sign-language from a few years ago) sign language is a video of someone signing. I can certainly imagine a custom 'sign-language codec' that allows an avatar to do the signing.

> I recall our meeting at Stanford some years ago. Those of us from
> accessibility who were just joining the HTML 5 work effort found
> ourselves rather concerned at how disability support was being called
> "captioning," causing us to start insisting on a requirements gathering
> phase.

I presented at least the following timed needs:
* captioning
* repetitive stimulus avoidance
* color-blindness and sensitivity
* need for high or low contrast video
* clear audio
* audio description of video
* sign language
and the following un-timed ones:
* short and long alternative text
* transcripts

and other people spoke of the need for text-to-speech and braille output. My recollection is that the landscape was rather well sketched out and by no means confined to captions.

Dave Singer
Multimedia and Software Standards, Apple

singer@apple.com

Received on Thursday, 22 March 2012 15:19:02 UTC