Re: Tech Discussions on the Multitrack Media (issue-152) from Mark Watson on 2011-03-02 (public-html@w3.org from March 2011)

From: Mark Watson <watsonm@netflix.com>
Date: Wed, 2 Mar 2011 14:47:08 -0800
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
CC: Eric Carlson <eric.carlson@apple.com>, Sean Hayes <Sean.Hayes@microsoft.com>, David Singer <singer@apple.com>, public-html <public-html@w3.org>
Message-ID: <B701CBCD-0BA7-4C86-A5DC-28A1071DE60F@netflix.com>
On Mar 2, 2011, at 1:49 PM, Silvia Pfeiffer wrote:

> On Thu, Mar 3, 2011 at 4:22 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>> Hi Sean -
>> 
>> On Mar 2, 2011, at 5:37 AM, Sean Hayes wrote:
>> 
>>> 
>>> There seem to be two possible interpretations of
>>>     <track kind="descriptions" srclang="en" label="English Audio Description">
>>>          <source src="audesc.ogg" type="audio/ogg">
>>>          <source src="audesc.mp3" type="audio/mpeg">
>>>      </track>
>>> One is that the audesc media is a complete replacement for the audio in the <source> element with the main soundtrack and descriptions pre-mixed, the second that the audesc is only the descriptions, and is intended to be mixed with the <source> audio at the client. Similar comments apply to the sign language option; and might even, if the captions option were of type video, apply to kind captions too where it is a replacement video with the captions burned in, or a smaller video with burned in captions to be placed PiP.
>>> 
>>> Do you intend that the 'kind' attribute be able to distinguish between these cases, or would it be better if there were a separate '@combinator={replace|add}' attribute that allows the author to mark these cases?
>>> 
>>  This is an interesting question but it isn't specific to our proposal, it applies to any multi-track solution.
>> 
>>  The UA can't know if an external track augments the main resource or is a replacement for an internal track, so I think the page author must be able to signal it in the markup. I am not wild about "combinator", but I don't have a better suggestion just yet.
>> 
>> eric
> 
> 
> Actually, when we have replacement and the replacement resource
> changes the timing of the whole virtually multitracked resource, it
> makes no sense to have that as part of this construct, since it
> influences all tracks. The timing of all tracks would change and thus
> other resources for all tracks would need to be loaded. For this case,
> a complete replacement <video> element would be much better IMHO. I am
> in particular thinking about extended audio description recordings
> here.
> 
> However, the case where we, e.g. want to replace an audio track with a
> dubbed version in another language, or a video track with a video
> track that has open captions, is much harder.  How common is this case
> though? Internationalization is generally done on the Web by creating
> a whole drop-in replacement site. Open captioned video should be
> discouraged anyway and people should be encouraged to spend the time
> re-authoring the captions in a way such that search engines and other
> tools can make use of the text. Audio description tracks can be
> authored without the original audio and thus be an "overlay" (i.e. an
> addition).
> 
> Overall, I think that the requirements for "addition" are much more
> common than for "replacement". Since replacement of resources can
> already be achieved through JavaScript, I wonder if there is really a
> need for inclusion of a markup solution for this. Is it really on the
> 80% side of the 80:20 use cases?
> 

I am not an expert in accessibility requirements, but I can imagine the following cases where you might want to replace a single track without modifying the timing of any other track (accessibility experts, please correct me if any of these don't make sense):
- audio with descriptions, where the audio mixing has been changed compared to the original audio (e.g. lower the volume on the background music during the descriptions)
- audio with dialog enhancement (i.e. different mixing again)
- video with open captions (yes, to be discouraged, but we have some today and would like to offer it to users)
- video with sign language already embedded (I have often seen the signer overlaid on a corner of the video with the main video still visible behind, i.e. not PIP. Doing this with an additional signing track requires an alpha channel in that signing track plus alpha blending, and I am not sure that is universally available).
- high-contrast video, or video without e.g. strobe sequences, which some people need to avoid (I know there is a term for this kind of video but can't remember it).
- video with a different content rating (e.g. some scenes modified or blacked out) - though I assume it would be more common to delete scenes, which would change the timing. Also, the environments in which it makes sense to expose this choice over the JS API are not straightforward.

Some of these may not be used often, but that is still quite a few examples.

I don't see any problem with the default usage (additional or replacement) being implicit in the "kind" value. So we might define separate "kinds" for audio description tracks which are intended to be additional vs replacement (e.g. "descriptions" and "descriptions-alt"). Audio tracks in different languages are replacements by default, as are the other examples above. So there could be separate "kind" values for video with embedded sign language vs video with just the sign language (intended to be displayed PIP etc.).

When you say "replacement can already be achieved through JavaScript", this doesn't change the requirement for the JavaScript to find out what kinds of tracks it has available so it can decide whether to turn them on or off. Again, everything is fine if the replacement/additional nature of the track is implicit in the kind.
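
As a rough sketch of the idea, here is how a script might use an implicit kind-to-mode mapping when the user selects a track. Everything here is an assumption for illustration: only "descriptions" and "descriptions-alt" come from the suggestion above, while the other kind names, the plain track objects ({ kind, type, enabled }) and both function names are hypothetical and not part of any proposal or existing API.

```javascript
// Hypothetical table: the add/replace nature of a track is implied by its kind.
// Only "descriptions" and "descriptions-alt" are taken from the discussion;
// the remaining kind names are invented for this sketch.
const KIND_MODE = {
  "descriptions": "add",         // descriptions only, mixed with the main audio
  "descriptions-alt": "replace", // main audio pre-mixed with descriptions
  "sign": "add",                 // signer only, e.g. displayed PIP
  "sign-alt": "replace",         // main video with the signer embedded
  "dub": "replace",              // audio in a different language
};

// Return "add" or "replace" for a known kind, or null for an unknown kind.
function modeForKind(kind) {
  return Object.prototype.hasOwnProperty.call(KIND_MODE, kind)
    ? KIND_MODE[kind]
    : null;
}

// Compute the enabled state of every track after the user picks one.
// A replacement track switches off all other tracks of the same media type;
// an additional (or unknown-kind) track leaves the other tracks unchanged.
function selectTracks(tracks, chosenIndex) {
  const chosen = tracks[chosenIndex];
  const mode = modeForKind(chosen.kind);
  return tracks.map((track, i) => {
    if (i === chosenIndex) return true;
    if (mode === "replace" && track.type === chosen.type) return false;
    return track.enabled;
  });
}

// Example track list (plain objects standing in for real track descriptors).
const tracks = [
  { kind: "main", type: "audio", enabled: true },
  { kind: "main", type: "video", enabled: true },
  { kind: "descriptions", type: "audio", enabled: false },
  { kind: "descriptions-alt", type: "audio", enabled: false },
];

console.log(selectTracks(tracks, 2)); // [ true, true, true, false ]
console.log(selectTracks(tracks, 3)); // [ false, true, false, true ]
```

The point is only that the script never needs a separate attribute: knowing the kind is enough to decide what to turn on or off.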

...Mark


> Cheers,
> Silvia.
> 
>
Received on Wednesday, 2 March 2011 22:50:53 UTC