Re: Proposal from HbbTV from Nigel Megitt on 2014-09-30 (public-inbandtracks@w3.org from September 2014)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Tue, 30 Sep 2014 10:10:11 +0000
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
CC: Alexander Adolf <alexander.adolf@condition-alpha.com>, "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>, Jon Piesing <Jon.Piesing@tpvision.com>
Message-ID: <D0503E15.12432%nigel.megitt@bbc.co.uk>


On 28/09/2014 22:10, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com> wrote:

>On Thu, Sep 25, 2014 at 10:20 PM, Nigel Megitt <nigel.megitt@bbc.co.uk>
>wrote:
>>> More to the point though: a text track with 0 reported cues is
>>> indistinguishable from a text track where all cues failed parsing.
>>> Thus it's not obvious whether that will be a usable track or not.
>>> It's therefore not really a text track, but something special that
>>> the platform hasn't considered yet.
>>
>> Is that exactly correct? Let's look at the mode of the text track and
>> its readiness state:
>>
>> According to
>> http://dev.w3.org/html5/spec-preview/media-elements.html#text-track-mode :
>>
>> "Disabled
>>
>> Indicates that the text track is not active. Other than for the
>> purposes of exposing the track in the DOM, the user agent is ignoring
>> the text track. No cues are active, no events are fired, and the user
>> agent will not attempt to obtain the track's cues."
>>
>> And at 
>>http://dev.w3.org/html5/spec-preview/media-elements.html#text-track :
>>
>> "The text tracks of a media element are ready if all the text tracks
>> whose mode was not in the disabled state when the element's resource
>> selection algorithm last started now have a text track readiness state
>> of loaded or failed to load."
>>
>> And at
>> http://dev.w3.org/html5/spec-preview/media-elements.html#text-track-failed-to-load :
>>
>> "Failed to load
>>
>> Indicates that the text track was enabled, but when the user agent
>> attempted to obtain it, this failed in some way (e.g. URL could not be
>> resolved, network error, unknown text track format). Some or all of the
>> cues are likely missing and will not be obtained."
>>
>> Taken together these suggest to me that it's legitimate to create a text
>> track and set it deliberately to mode="disabled" without loading cues,
>> or to set it to, say, "showing" and proceed as though it is "ready" even
>> though its readiness state is "failed to load", specifically in this
>> case because the text track format is unknown. That at least provides a
>> mechanism to control a media object that can dereference the text track
>> object into something concrete in the media that it can present, which
>> is what's needed here.
>
>Thanks for walking through this. You are explaining my point very
>well, but you need to keep reading. The HTML spec says:
>
>"Whenever a text track's text track readiness state changes to either
>loaded or failed to load, the user agent must remove it from any list
>of pending text tracks that it is in."
>
>Thus, a track that has 'failed to load' is one that is ignored by the
>browser and cannot display any cues.
>
>But that's not even what is going to happen for an in-band track where
>the UA renders all the cues, but doesn't expose them as TextTrackCue
>objects. Here's what happens there: the track will be 'disabled'.
>Then, when it is selected, the UA will go into 'loaded' state once it
>has parsed all the data and loaded into internal memory. It won't
>reach 'failed to load' because the data was able to be obtained and
>loaded with no fatal errors. However, since it doesn't expose cues,
>the @cues TextTrackCueList in the TextTrack object of the video element
>will have 0 cues. Thus, if the JavaScript developer checks on how many
>cues are being rendered and at what times, they will see "0" and have
>to assume that the browser has failed to parse any cues. The only
>reasonable conclusion for the JS developer is to assume that the
>loading of all cues failed and thus the track is not usable.

That's an unreasonable assumption: if that were the case then the state
should be 'failed to load'. If it's not clear already then we should make
it so, i.e. that for a 'loaded' track with an empty cue list the
assumption is that the cues were parsed but there are no cues exposed,
either because the track actually contained no cues or because the cues
that were present were not exposed.
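
To illustrate the distinction, here is a minimal sketch of how a script
might interpret the combination of readiness state and exposed cue count.
The TrackSnapshot shape and the interpretTrack function are hypothetical
names used only for this example. It also assumes the script can learn the
readiness state by some means, which the TextTrack object itself does not
provide directly.

```typescript
// Readiness state names follow the HTML spec text quoted above.
type Readiness = "not loaded" | "loading" | "loaded" | "failed to load";

// Hypothetical snapshot of what a script knows about one text track.
interface TrackSnapshot {
  readiness: Readiness;
  exposedCueCount: number; // length of the cue list as seen by script
}

type Interpretation =
  | "pending"
  | "unusable"
  | "cues exposed"
  | "loaded, no cues exposed";

// Zero exposed cues only means "unusable" when loading actually failed.
function interpretTrack(t: TrackSnapshot): Interpretation {
  if (t.readiness === "failed to load") return "unusable";
  if (t.readiness !== "loaded") return "pending";
  return t.exposedCueCount > 0 ? "cues exposed" : "loaded, no cues exposed";
}
```

With this reading, a track in the 'loaded' state with 0 cues is treated as
usable but opaque to script, rather than as a failure.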


>>> It's likely better exposed as a video track with burnt-in captions.
>>> I'd recommend that's how it would be shown in the track list. When
>>> activated, both the default video track and the captions track would
>>> then be rendered.
>>
>> This pushes the interface complexity somewhere else, but not somewhere
>> helpful! I'd argue that the spec should get as close as possible to
>> matching the media element model and using text tracks for this purpose
>> is better than not doing so.
>
>Why is it not helpful? From the JS and user's point of view, that's
>exactly what such a track is: a video track with burnt in captions.
>Since it's now exposed in the list of video tracks, it can be selected
>and activated. That's all that's required for such a track. That's as
>useful as it gets, isn't it?

See the comments others have made (including you) later in the thread
about the relationship between video and audio tracks and text tracks.


>> By the way, I agree that exposing data provides interesting
>> opportunities for developers, where possible. At least creating the
>> text tracks provides the location for where such data might go, in case
>> an implementation wants to put it somewhere; hiding the tracks away
>> behind a 'burnt in video' would effectively block that.
>
>What do you mean by "where such data might go"? If the UA renders the
>data, it can only render it within the video viewport, so for all
>intents and purposes, it is video data.

I mean 'in the text track cue list', if not in a different subclass of
TextTrack that offers some other data structure. I wouldn't assume that UA
rendering can only result in pixels being drawn in the video viewport: for
example there could be connections to other display or rendering devices.

Kind regards,

Nigel


>
>Regards,
>Silvia.
Received on Tuesday, 30 September 2014 10:10:49 UTC