Re: Accessibility of <audio> and <video>

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sun, 7 Sep 2008 16:36:32 +0300
Cc: HTML WG <public-html@w3.org>
Message-Id: <AB1E016A-51BE-4E6A-9335-E68C944BDF18@iki.fi>
To: Eric Carlson <eric.carlson@apple.com>

On Sep 6, 2008, at 00:40, Eric Carlson wrote:

> On Sep 5, 2008, at 12:02 AM, Henri Sivonen wrote:
>
>> On Sep 4, 2008, at 18:29, Eric Carlson wrote:
>>
>>> On Sep 4, 2008, at 1:17 AM, Henri Sivonen wrote:
>>>
>>>> On Sep 4, 2008, at 01:13, Dave Singer wrote:
>>>>
>>>>> 2.1.2 Configuring
>>>>> Sometimes, similarly, the media format itself can carry optional  
>>>>> features. An example might be the 3GPP file format (or any file  
>>>>> format from that family, such as MP4) with a text track in 3GPP  
>>>>> Timed Text format. Enabling this track (and thereby causing it  
>>>>> to be presented) may be a way to satisfy a need within a single  
>>>>> media file.
>>>>
>>>> It seems to me that for captioning, an off-by-default track  
>>>> within the main file is preferable over burned-in open captions,  
>>>> because tracks within the main file travel better, compress  
>>>> better (and transferring the captions even when not needed is not  
>>>> burdensome in terms of relative network bandwidth) and make video  
>>>> more searchable.
>>>>
>>> I am not sure if you are suggesting otherwise, but a 3GPP Timed  
>>> Text track is exactly what you describe: a relatively small text  
>>> track carried within the media file. It may or may not be enabled  
>>> by default; that is a decision that is made at authoring time.
>>
>> Yes, I mean 3GPP Timed Text in the MP4 context. Possibly Kate in an  
>> Ogg context, but that isn't sorted out yet.
>>
>> What kind of metadata about captions vs. subtitles, on-by-default  
>> vs. off-by-default and language can MP4 contain about a 3GPP  
>> track? Does QuickTime expose in the API whatever the file format  
>> can express here?
>>
>  A 3GPP text track is just a text track. It can be used for  
> "subtitles" or "closed caption" text, but that is up to the media  
> producer and/or consumer. Any type of track can be enabled or  
> disabled by default, and can be tagged with a language code.

Nice. That seems to be good enough.

What software do people use for authoring 3GPP Timed Text captioning  
or subtitles?

> QuickTime movie (.mov) and Apple MPEG-4 (.m4v) files can have Closed  
> Caption tracks, which carry CEA-608 data with timing, style, screen  
> position and other information.

I presume this is for supporting existing production workflows with  
iTunes TV shows and movie rentals. Is it not feasible or desirable to  
convert 608 automatically to 3GPP Timed Text?

>>>>> I would guess that content providers would opt for alternative  
>>>>> files in this case, because additional audio tracks show up on  
>>>>> the bandwidth bill if served even when not needed.
>>>>
>>> This is not necessarily true. Even for progressive download files,  
>>> some media-subsystems only read the parts of a file necessary for  
>>> the presentation.
>>
>> How does this work?
>>
>  In several different ways:
>
> - In some contexts, the QuickTime IO sub-system downloads media data  
> only when it is requested by the media handling sub-system (allowing  
> for latency of course), so only data that will be presented is loaded.

Does this happen when all the media data is muxed into one MP4 file  
served over HTTP?
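
To illustrate what I am asking about: with interleaved chunks in a single  
file, skipping a disabled track would presumably mean issuing HTTP  
byte-range requests only for the chunks of enabled tracks. A minimal  
sketch in Python follows. The chunk table, the track names and the  
max_gap threshold are all made up for illustration and say nothing  
about how QuickTime actually behaves.

```python
from typing import List, Set, Tuple

# Hypothetical chunk table entry: (track name, byte offset, byte length).
Chunk = Tuple[str, int, int]


def ranges_for_enabled(chunks: List[Chunk], enabled: Set[str],
                       max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) byte ranges covering enabled tracks only.

    Ranges separated by at most max_gap bytes are merged, since a few
    wasted bytes may be cheaper than an extra request.
    """
    wanted = sorted((off, off + length - 1)
                    for track, off, length in chunks
                    if track in enabled and length > 0)
    merged: List[Tuple[int, int]] = []
    for first, last in wanted:
        if merged and first - merged[-1][1] - 1 <= max_gap:
            # Adjacent, overlapping or close enough: extend the last range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def range_header(ranges: List[Tuple[int, int]]) -> str:
    """Format the ranges as an HTTP Range header value."""
    return "bytes=" + ",".join(f"{a}-{b}" for a, b in ranges)
```

Whether this saves anything in practice depends on how finely the  
tracks are interleaved: small chunks would mean many small ranges.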

> - A QuickTime movie doesn't have to be self-contained, but can  
> reference media in external files. Even very old versions of the  
> QuickTime browser plug-in (circa 1997) don't download data from  
> external files unless it is in an enabled track.

Does this apply or could this apply to either MP4 or Ogg (as opposed  
to .mov)?

> - Media data in streamed movies (e.g. RTSP) is never downloaded until  
> it is needed.

Does Safari support rtsp URIs to MPEG-4 family streams in <video>?  
What about Ogg family streams if XiphQT is installed?

>>>>> We therefore also need the ability to apply the same preferences  
>>>>> used for selection, to configuring the file. Note that not all  
>>>>> media sub-systems will offer the user-agent such an API; that is  
>>>>> acceptable; for media files associated with those systems, the  
>>>>> files are not configurable and selection must be used instead.
>>>>
>>>> This seems alarming. Does at least one of QuickTime, GStreamer or  
>>>> DirectShow lack such an API? If one of those lacks such an API,  
>>>> can such an API be put in place in a timely manner?
>>>>
>>>> It seems to me that if automatic selection isn't reliable,  
>>>> content providers will shy away from an automatic selection system.
>>>>
>>> I believe David is pointing out that not all sub-systems have this  
>>> capability to emphasize that it is crucial for content authors to  
>>> be able to structure their markup so one of several files is  
>>> selected based on the user's stated preferences.
>>
>> QuickTime, GStreamer and DirectShow seem to be the subsystems that  
>> make or break the proposal. Is this a problem with those three?
>>
>  I don't understand how the (in)abilities of these sub-systems make  
> or break the proposal. David's proposal allows for sub-systems that  
> support alternates within a media file as well as those that do not.  
> If a media format/sub-system does not allow alternates, the content  
> author using that format can create alternate files and instruct the  
> UA to select among them with alternate <source> elements.

In practice, requiring the author to create alternatives with text  
tracks and with a rasterization of text burned into the video data  
will very likely lead to a lack of author confidence in the  
reliability of text tracks. This would be bad in many ways. One  
possible bad outcome is authors opting for single-implementation  
technology like Flash in order to get predictable results.
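
To make the concern concrete, here is roughly the logic I understand  
the proposal to imply. This is only a sketch in Python. The Source  
record, its fields and the can_configure flag are hypothetical names of  
mine, not anything from the proposal or from a real media framework.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Source:
    url: str
    has_caption_track: bool = False   # off-by-default text track in the file
    burned_in_captions: bool = False  # captions rasterized into the video


def pick_source(sources: List[Source], wants_captions: bool,
                can_configure: bool) -> Tuple[Optional[Source], bool]:
    """Return (chosen source, whether to enable its caption track).

    can_configure says whether the media sub-system lets the UA enable
    tracks inside a file. Sources are tried in document order.
    """
    if not wants_captions:
        for s in sources:
            if not s.burned_in_captions:
                return s, False
        # Only burned-in video exists; its captions cannot be turned off.
        return (sources[0], False) if sources else (None, False)
    if can_configure:
        for s in sources:
            if s.has_caption_track:
                return s, True
    for s in sources:
        if s.burned_in_captions:
            return s, False
    # No usable captions: fall back to the first source, uncaptioned.
    return (sources[0], False) if sources else (None, False)
```

With a sub-system that cannot configure tracks, the user gets captions  
only if the author also supplied the burned-in file, which is exactly  
the duplication I expect authors to resent.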

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Sunday, 7 September 2008 13:37:19 GMT
