Re: Track kinds from David Singer on 2011-05-06 (public-html-a11y@w3.org from May 2011)

From: David Singer <singer@apple.com>
Date: Fri, 06 May 2011 16:50:26 -0700
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: Mark Watson <watsonm@netflix.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-id: <5BFED7C8-9CAF-472A-B30D-0ED7D90BDBD7@apple.com>

On May 6, 2011, at 16:25, Silvia Pfeiffer wrote:

> On Sat, May 7, 2011 at 9:16 AM, Mark Watson <watsonm@netflix.com> wrote:
>> 
>> On May 6, 2011, at 2:18 PM, Silvia Pfeiffer wrote:
>> 
>> On Sat, May 7, 2011 at 5:22 AM, Mark Watson <watsonm@netflix.com> wrote:
>> 
>> I also have a procedural question: do we consider that we have received the
>> liaison from 3GPP (mentioned on the above page)? Are we going to answer it?
>> 
>> There should be an email by a 3GPP member with all the relevant
>> requests to the WG list. I haven't seen one such.
>> 
>> I doubt the 3GPP staff will post to the list. They will send the liaison to
>> the chair of the group, possibly via W3C staff members, and expect them to
>> distribute it to the list. I imagine it is stuck somewhere in that path.
>> 
>> Unless that arrives,
>> there is nothing to reply to.
>> 
>> We can see that it has been sent from their minutes and we can see the text
>> of the document in their archives, so we can at least prepare for its
>> arrival ;-)
> 
> 
> Do poke the chairs to forward it, or ask them to reply to it. It
> doesn't seem right for us to just go ahead and try to represent the
> HTML WG in a reply to the 3GPP WG.

I just poked Philippe, since I had suggested the liaison be sent to him so that he could forward it.

> 
> 
>> One question they ask is whether we will define a URN to identify the
>> space of kind values defined by W3C. One advantage of doing that is that
>> these kinds are then immediately supported in 3GPP and MPEG adaptive
>> streaming manifests, which means that there *is* a media container
>> supporting those kinds (perhaps addressing one of the editor's concerns
>> about new kind values).
>> 
>> I don't think HTML needs these values as anything else but short
>> strings that we now have.
>> 
>> Agreed.
>> 
>> If 3GPP and MPEG need them as URNs
>> 
>> No, they need a URN that refers to the codepoint space containing these
>> values. But I notice W3C doesn't appear to have a top-level URN space. It
>> could use urn:fdc:w3.org:2011....
> 
> Seems to make sense to me... would such a URN be useful for anything else?
> 
> 
>> The 3GPP and DASH specs allow you to tag a track with any number of "Roles"
>> - so it can be tagged with roles from this W3C space (if anyone defines a
>> URN) as well as equivalent roles from other spaces if necessary for some
>> application.
>> 
>> - and they probably have a swag
>> more that they have proposed and will use
>> 
>> Not really - they'd prefer W3C to define some container-format-independent
>> ones, as I understand it.
>> 
>> - it makes a lot more sense
>> to me for them to define these themselves.
>> 
>> They seem to think the opposite ;-) In particular it's been stated at MPEG
>> that the W3C HTML a11y group has more a11y expertise than the MPEG group.
>> And there is nothing container-format-specific about this concept of track
>> kinds. The abstract kinds and their definitions need to be defined somewhere
>> with responsibility for all containers or for none, not in the groups
>> focussed on particular containers. The containers (and HTML) just need to
>> define how those kinds are labeled in their syntax.
> 
> Well, MPEG caters for a lot more than just the Web and the kinds
> required elsewhere may need to be a lot more diverse. So, I agree that
> they do well in looking to us for a11y related kinds and it's great
> that they want to pick up the ones that we define here, but I highly
> doubt that will be the end of their list.
> 
> 
>> Some of the kinds that
>> media containers will expose may just end up creating getLabel() text
>> rather than be exposed in getKind(). Ogg already seems to have a few
>> of those.
>> 
>> Agreed, but I think this is a symptom of having media container formats lead
>> the definition of kinds: they define kinds which are either not very
>> well-defined or not universally applicable and so we decide not to expose
>> these over HTML because we want the HTML interface to be clean and
>> well-defined. The best way to achieve that is to define these things in a
>> container-independent place and ask the containers to align.
> 
> 
> Well, we don't need anyone to align. All we need is a mapping, which
> your wiki page provides perfectly. If the names are identical, so much
> the better. It's ok when not everything that a container can contain is
> exposed to HTML. Only what gets exposed needs to be compatible. I
> think we're on the best way there anyway. :-)
> 
> Cheers,
> Silvia.
> 
> 
>> 
>> Finally, we discussed the "commentary" kind here at Netflix and in the end
>> we are happy to have it dealt with simply as "alternative". I do think
>> though that in principle there could be other (UI-related) reasons for
>> exposing a new track kind than triggering default behavior or application of
>> user preferences. This is certainly the case for accessibility use-cases
>> where the UI to enable/disable a particular track could usefully be tailored
>> to the intended users of that track (for example, enabling/disabling tracks
>> intended for the blind or those with low vision should ideally not involve
>> complex visual UI elements).
>> 
>> OK. I think we'll cross that bridge when we get to it.
>> 
>> Cheers,
>> Silvia.
>> 
>> 
>> ...Mark
>> 
>> 
>> On May 3, 2011, at 10:12 PM, Silvia Pfeiffer wrote:
>> 
>> I understand the problem of additive/alternative tracks, too, and have
>> tried to approach it with markup before. However, I think this is
>> making something that is supposedly simple much too difficult. The
>> ultimate choice of active tracks has got to be left to the user. For
>> this reason, I think @kind (or getKind()) should only ever expose what
>> content is available in the track, but there should not be an
>> automatic choice made by the browser. It's up to the user to
>> activate/deactivate the correct tracks.
>> 
>> Before we dive into anything more complex, we should get some
>> experience with an implementation of multitrack and the roles. I don't
>> think we will have much to go by for making a decision beforehand.
>> 
>> Cheers,
>> Silvia.
>> 
>> 
>> On Wed, May 4, 2011 at 2:02 PM, Mark Watson <watsonm@netflix.com> wrote:
>> 
>> So, if we are looking for a generic approach, where a track can have
>> multiple "roles", then I think the correct logic is indeed to pick the
>> fewest number of tracks which fulfill the intersection of the desired roles
>> and the available roles such that no role is fulfilled more than once. You
>> need a priority list of roles to drop from the desired list if that isn't
>> possible (which would mean some badly authored content, but has to be dealt
>> with). It may be a mouthful, but I think it would be reasonably
>> straightforward to implement.
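
The generic rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: tracks are modelled as (name, set-of-roles) pairs, a track is usable only if all of its roles are wanted, and the names choose_tracks, exact_cover and drop_order are hypothetical, not taken from any spec or implementation. The search is brute force, which is acceptable for the handful of tracks a presentation typically has.

```python
from itertools import combinations


def exact_cover(tracks, target):
    """Return the fewest track names whose roles cover `target` exactly once each."""
    # Assumption: a track carrying any role outside `target` is not usable.
    candidates = [(name, roles) for name, roles in tracks if roles and roles <= target]
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            union = set().union(*(roles for _, roles in combo))
            total = sum(len(roles) for _, roles in combo)
            # Equal union plus equal total size means no role is fulfilled twice.
            if union == target and total == len(target):
                return [name for name, _ in combo]
    return None


def choose_tracks(tracks, desired, drop_order):
    """Pick tracks for `desired`, dropping roles in `drop_order` if no cover exists."""
    available = set().union(*(roles for _, roles in tracks))
    wanted = set(desired) & available
    while wanted:
        cover = exact_cover(tracks, wanted)
        if cover is not None:
            return cover
        # Badly authored content: give up the least important remaining role.
        for role in drop_order:
            if role in wanted:
                wanted.discard(role)
                break
        else:
            return []
    return []


tracks = [("a", {"main"}), ("c", {"captions"}), ("ac", {"main", "captions"})]
print(choose_tracks(tracks, {"main", "captions"}, ["captions", "main"]))
```

With the three tracks shown, the single combined track is chosen over the pair, because the search tries smaller sets first.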
>> 
>> However, I'm still not sure a generic approach is necessary. A "simpler"
>> approach is to say every track has a single role. But for some applications
>> (like audio descriptions) there are two distinct role values defined - an
>> additive one and an alternative one. The problem is addressed at a semantic
>> level - i.e. people implement support for audio descriptions - and they know
>> what these are and how to handle them - rather than trying for a generic
>> descriptor matching algorithm.
>> 
>> Regarding Repetitive Stimulus Safe, I guess that since most content is
>> unfortunately not labeled one way or the other the default assumption has to
>> be up to the user themselves. i.e. that user preferences associated with
>> this aspect should support required, preferred and don't care. In a really
>> generic approach every role may have a status from { require, prefer, don't
>> care, prefer not, require not }.
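
To make the five statuses concrete, here is a minimal sketch, assuming tracks are (name, set-of-roles) pairs and that preferences are a mapping from role to status. The names Pref and rank_tracks and the role label "stimulus-safe" are hypothetical. "Require" and "require not" act as hard filters; "prefer" and "prefer not" only affect ordering.

```python
from enum import Enum


class Pref(Enum):
    REQUIRE = "require"
    PREFER = "prefer"
    DONT_CARE = "don't care"
    PREFER_NOT = "prefer not"
    REQUIRE_NOT = "require not"


def rank_tracks(tracks, prefs):
    """Drop tracks that violate a hard preference; order the rest by soft score."""
    ranked = []
    for name, roles in tracks:
        score = 0
        acceptable = True
        for role, pref in prefs.items():
            present = role in roles
            if pref is Pref.REQUIRE and not present:
                acceptable = False
            elif pref is Pref.REQUIRE_NOT and present:
                acceptable = False
            elif pref is Pref.PREFER and present:
                score += 1
            elif pref is Pref.PREFER_NOT and present:
                score -= 1
        if acceptable:
            ranked.append((score, name))
    # Stable sort: ties keep their authored order.
    ranked.sort(key=lambda item: -item[0])
    return [name for _, name in ranked]


tracks = [("unlabelled", {"main"}), ("safe", {"main", "stimulus-safe"})]
print(rank_tracks(tracks, {"stimulus-safe": Pref.REQUIRE}))
print(rank_tracks(tracks, {"stimulus-safe": Pref.PREFER}))
```

Note that under "require" an unlabelled track is excluded outright, which matches the point that unlabelled content cannot be assumed safe.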
>> 
>> Again, this suggests that a generic approach might be over-ambitious - who
>> says some new role doesn't come along next week with a sixth user-preference
>> status of "required unless role Y present" or similar ... I think maybe the
>> UA needs to understand what these things are and act appropriately.
>> 
>> ...Mark
>> 
>> On May 3, 2011, at 11:55 AM, David Singer wrote:
>> 
>> 
>> On May 2, 2011, at 16:55 , Mark Watson wrote:
>> 
>> 
>> I think it's evidence that there is something to be solved.
>> 
>> I'd prefer a solution where adding a track to an existing presentation
>> didn't require me to change the properties of existing tracks, though, since
>> there is an error waiting to happen in that case.
>> 
>> 
>> Yes.  This idea made some sense when it was the tracks in a multiplex (e.g.
>> MP4 file), perhaps makes sense when all the tracks are annotated in the
>> markup (e.g. in HTML5 or DASH MPD) but makes much less sense when some
>> tracks are in a multiplex and some are added in the markup - a track added
>> in the markup might need the annotations in a multiplex changed, ugh.
>> 
>> So, thinking out loud here.
>> 
>> Assume the user has a set of roles that they would kinda like to experience.
>> The default is 'main, supplementary', I think, or something like.
>> 
>> Now, we have a set of tracks, each of which satisfies some roles.  Let's
>> ignore tracks we have discarded because they are the wrong mime type, codec,
>> language, etc., and focus just on this selection mechanism.  What is the
>> right simple way to get the set of tracks?
>> 
>> It's easy to 'go overboard' and treat this as a very general problem of
>> finding the minimal set of tracks that will span a set of desired roles.  I
>> don't think anyone will author *for the same language*
>> 
>> track - main
>> track - captions
>> track - main + captions
>> 
>> so an algorithm designed to pick only (3) instead of (1 + 2) for the
>> main+captions desiring user is probably overkill.
>> 
>> 'enable the tracks whose roles are a subset of the desired roles, and
>> disable the rest' may be too simple, unless tracks are ordered from the
>> most-labelled to the least-labelled.
>> 
>> So, audio-description replacing the main audio:
>> 
>> track - main + description
>> track - main
>> 
>> Audio description adding to the main audio:
>> 
>> track - main
>> track - description
>> 
>> The same works for all the adaptations that might require re-authoring or
>> might be achievable with an additional track (captions, burned in or
>> separate, for example).
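
The ordered subset rule and the two audio-description layouts above can be sketched as below. This is a rough illustration, not a proposal for spec text. It assumes tracks are (name, set-of-roles) pairs, and it adds one assumption that the rule leaves implicit: once a role is covered by an enabled track, later tracks offering that role again are left disabled. The name select_tracks and the track names are hypothetical.

```python
def select_tracks(tracks, desired):
    """Enable tracks whose roles are a subset of `desired`, most-labelled first."""
    covered = set()
    enabled = []
    # sorted() is stable, so tracks with equal label counts keep authored order.
    for name, roles in sorted(tracks, key=lambda track: -len(track[1])):
        # Assumption: skip a track if any of its roles is already covered.
        if roles and roles <= desired and not (roles & covered):
            enabled.append(name)
            covered |= roles
    return enabled


# Audio description replacing the main audio.
replacing = [("described_mix", {"main", "description"}), ("plain", {"main"})]
# Audio description adding to the main audio.
additive = [("plain", {"main"}), ("description_only", {"description"})]

print(select_tracks(replacing, {"main", "description"}))
print(select_tracks(replacing, {"main"}))
print(select_tracks(additive, {"main", "description"}))
```

In the replacing layout the described mix wins for a user who wants descriptions and the plain track wins otherwise; in the additive layout both tracks are enabled together.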
>> 
>> 
>> Where this fails is when the 'base content' is good enough for both the
>> plain user and the user who desires more roles.  The obvious case here (Mark
>> will laugh) is repetitive-stimulus-safeness;  we have to assume unlabelled
>> content is unsafe, but much content is naturally safe and can be labelled as
>> such.
>> 
>> David Singer
>> Multimedia and Software Standards, Apple Inc.
>> 
> 

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Friday, 6 May 2011 23:50:55 UTC