RE: HTML WG Note publication of sourcing in-band media resources

On 28 May 2014 10:26, "Paul Cotton" <Paul.Cotton@microsoft.com> wrote:
>
> From Silvia's response to this week's WG Weekly agenda:
>
> http://lists.w3.org/Archives/Public/public-html-wg-announce/2014AprJun/0016.html
>
> >> 7. Any other business
> >>
> >> a) HTML extension spec for sourcing in-band tracks
> >> http://lists.w3.org/Archives/Public/public-html-admin/2014May/0030.html
> >
> >The particular question I have for this is: how are we going to get it
> >published under a w3.org URL?
> >
> >We have contributed the spec to the W3C GitHub account at
> >https://github.com/w3c/HTMLSourcingInbandTracks
> >so it is available at
> >http://rawgit.com/w3c/HTMLSourcingInbandTracks/master/index.html
>
> The Chairs and Team provided our initial response in:
> http://lists.w3.org/Archives/Public/public-html-admin/2014May/0030.html
> in which we recommended publishing this material using the model adopted
> for the "Media Source Extensions Byte Stream Format Registry".

That's what we are working towards, except this is done through GitHub and
not through the W3C dvcs server, which is why I asked. It takes me one
commit to put an FPWD onto it, too, if that helps. Would you like me to do
so?

A discussion and reply on Friday would be sufficient.

Regards,
Silvia.

> The Chairs and Team are now evaluating Bob's response to our initial
> response:
> http://lists.w3.org/Archives/Public/public-html-admin/2014May/0034.html
> and the fact that the Media TF have an open bug on how to update such a
> Registry:
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=25581
>
> Unfortunately, due to other commitments, the Chairs and Team have not been
> able to discuss this matter since Bob sent us his response. We hope to do
> this no later than early next week.
>
> /paulc
> HTML WG co-chair
>
> Paul Cotton, Microsoft Canada
> 17 Eleanor Drive, Ottawa, Ontario K2E 6A3
> Tel: (425) 705-9596 Fax: (425) 936-7329
>
>
> -----Original Message-----
> From: Kilroy Hughes
> Sent: Monday, May 19, 2014 12:02 PM
> To: Silvia Pfeiffer; Philip Jägenstedt
> Cc: Jerry Smith (WINDOWS); Bob Lund; Paul Cotton; public-html-admin@w3.org;
> Pierre-Anthony Lemieux
> Subject: RE: HTML WG Note publication of sourcing in-band media resources
>
> The ISO Base Media File Format Part 30 (ISO/IEC 14496-30) defines
> subtitle tracks (which are inclusive of captions, SDH, description,
> translation, graphics such as glyphs and signing, etc.).
>
> It doesn't say anything about Kinds, or have a similar field in the
> standard track header and sample description.
>
> Both TTML and WebVTT storage are defined.
> I know TTML has generic metadata tags, but not a specific method of
> identifying presentation objects such as <p> and <div> according to Kind;
> nor any standardized concept of sub-track.
> You would know better if WebVTT content and readers conform to a
> sub-track or Kind tagging method corresponding to two HTML tracks in the
> same text file/track.
>
> In the case of DASH streaming timed text and graphics subtitles (ISO/IEC
> 23009-1) stored as Part 30 movie fragments, the manifest (Media
> Presentation Description, MPD) may include optional Role Descriptor
> elements that are intended to function like Kind to describe Adaptation
> Sets that result in tracks when streamed in an HTML5 browser using MSE.
> The DASH standard was completed before the W3C Kind specification, so it
> defines a slightly different vocabulary than that eventually settled on by
> W3C.  It also allows multiple Role Descriptors because an Adaptation Set
> (track) may fit multiple descriptions, such as "Main" or "Alternate" and
> "Description" or "Translation".  The Role descriptor uses a URI/URN to
> identify the vocabulary and syntax contained in the descriptor, so it is
> extensible beyond the vocabulary defined in the DASH standard.
>
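The Role-to-Kind relationship described above could be sketched roughly as
follows. This is only an illustration: the scheme URI and the "caption" /
"subtitle" values are the ones from the DASH role scheme, but the mapping
table, the function name text_track_kind, and the fallback kind are
assumptions made for this example and are not taken from any specification.

```python
# Sketch: derive an HTML text track kind from the Role Descriptors of a
# DASH Adaptation Set. The mapping below is an assumption for illustration.
DASH_ROLE_SCHEME = "urn:mpeg:dash:role:2011"

# Hypothetical correspondence between DASH role values and HTML kinds.
ROLE_TO_TEXT_KIND = {
    "caption": "captions",
    "subtitle": "subtitles",
}


def text_track_kind(roles, default="subtitles"):
    """Return an HTML text track kind for a list of (scheme, value) roles.

    Roles from an unrecognized scheme are skipped rather than guessed at,
    since the descriptor vocabulary is extensible. The default kind is an
    assumption, not something the DASH or HTML specs prescribe.
    """
    for scheme, value in roles:
        if scheme != DASH_ROLE_SCHEME:
            continue
        kind = ROLE_TO_TEXT_KIND.get(value)
        if kind is not None:
            return kind
    return default


# An Adaptation Set may carry several Role Descriptors at once.
roles = [(DASH_ROLE_SCHEME, "main"), (DASH_ROLE_SCHEME, "caption")]
print(text_track_kind(roles))
```

Because several Role Descriptors may apply to one Adaptation Set, the sketch
returns the first one that has a known HTML counterpart; a real mapping would
need a rule for conflicting roles.
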
> An additional Accessibility Descriptor is specified in the DASH MPD schema
> to allow automatic selection of audio, video, and TTML tracks for users
> with visual, hearing, cognitive, etc. impairments.  A URI/URN can be
> selected that labels these tracks with identifiers established by
> regulation, broadcast TV, etc., such as "SDH" for Subtitles for Deaf and
> Hard of hearing.  Even if a player does not recognize the particular
> URI/URN or descriptive term used in this Descriptor, it can make a default
> selection when a user preference setting indicates an impairment, based on
> the presence of the Accessibility Descriptor, language attribute, etc.  The
> track may also have a Descriptor indicating "alternate" or similar, but that
> would not be very useful for someone who is visually impaired or a standard
> player that would like to find an audio description track.
>
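The default selection described above, where a player relies on the mere
presence of an Accessibility Descriptor without understanding its
vocabulary, might look something like this. The dictionary layout, the
function name pick_for_accessibility, and the example URN are all
hypothetical; only the selection idea comes from the paragraph above.

```python
# Sketch: choose an Adaptation Set using only the *presence* of an
# Accessibility Descriptor plus the language attribute. Data layout and
# names are assumptions made for this example.
def pick_for_accessibility(adaptation_sets, language, user_needs_accessible):
    """Return the preferred Adaptation Set dict, or None if no language match.

    Each set is a dict with 'lang' (str) and 'accessibility' (list of
    descriptors, possibly empty). The descriptor contents are never parsed.
    """
    candidates = [s for s in adaptation_sets if s["lang"] == language]
    if user_needs_accessible:
        flagged = [s for s in candidates if s["accessibility"]]
        if flagged:
            return flagged[0]
    return candidates[0] if candidates else None


sets = [
    {"id": "plain", "lang": "ja", "accessibility": []},
    {"id": "sdh", "lang": "ja",
     "accessibility": [("urn:example:accessibility", "SDH")]},
]
print(pick_for_accessibility(sets, "ja", user_needs_accessible=True)["id"])
```
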
> Selection of an Adaptation Set and a Representation contained in it for
> adaptive streaming involves evaluating attributes that identify codec,
> video resolution or audio track configuration, language, frame rate,
> bitrate, etc. in addition to the Role or Kind. An Adaptation Set contains
> perceptually equivalent content, but possibly multiple Representations that
> are encoded differently to enable rapid switching to compensate for
> variation in network throughput.  The intent is that Media Segments
> adaptively selected and sequenced from different Representations within an
> Adaptation Set will appear to be a continuous track on playback, so they
> share the same Role Descriptor.  Although it is possible, it is unlikely
> that a Subtitle Adaptation Set will contain more than one Representation.
>
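Switching between Representations within one Adaptation Set, as described
above, can be illustrated with a very simple throughput rule. This is a
minimal sketch and not how any particular player works: the function name
pick_representation and the dictionary layout are assumptions, and real
players also weigh codec, resolution, buffer level, and so on.

```python
# Sketch: pick a Representation from one Adaptation Set based on measured
# network throughput. 'bandwidth' is in bits per second.
def pick_representation(representations, throughput_bps):
    """Return the highest-bandwidth Representation that fits the throughput.

    Falls back to the lowest-bandwidth Representation when none fits, so
    playback can continue at reduced quality.
    """
    if not representations:
        raise ValueError("Adaptation Set has no Representations")
    affordable = [r for r in representations
                  if r["bandwidth"] <= throughput_bps]
    if affordable:
        return max(affordable, key=lambda r: r["bandwidth"])
    return min(representations, key=lambda r: r["bandwidth"])


video = [
    {"id": "low", "bandwidth": 500_000},
    {"id": "mid", "bandwidth": 1_500_000},
    {"id": "high", "bandwidth": 4_000_000},
]
print(pick_representation(video, throughput_bps=2_000_000)["id"])
```

Since every Representation in the set is perceptually equivalent and shares
the same Role Descriptor, the choice here does not affect which track the
page sees, only how it is encoded.
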
> A single AdaptationSet element (track) may be described by e.g. one
> Accessibility Descriptor and two Role Descriptors indicating that a TTML
> track was character coded Hiragana for children and blind readers of touch
> devices, and was descriptive, so also suitable for hearing impaired
> Japanese.  An alternative AdaptationSet (track) could be described by both
> Accessibility and Role descriptors to describe painted Kanji glyphs, more
> appropriate for adult hearing impaired readers, and more typical of the
> majority of the world's cursive writing systems and subtitles used on
> movies, video discs, and broadcast.
>
> Although there can be multiple descriptions of a track, there isn't
> provision for multiple "sub-tracks" within a single TTML (or WebVTT?)
> Adaptation Set or ISO Media track.
>
> There is one special case to consider, which is binary captions
> encapsulated in AVC/HEVC elementary streams.  A video track will act like
> two tracks when broadcast content containing e.g. CEA-608 or CEA-708 or
> Teletext, etc. is played on a device with the appropriate caption
> decoder(s).  These include iOS devices, game consoles, set-top boxes, TVs,
> etc.  It would be useful to identify if these broadcast captions are
> present and turn them on/off; but that may be in the scope of W3C groups
> working on tuner APIs, etc.
>
> Kilroy Hughes | Senior Digital Media Architect | Windows Azure Media
> Services | Microsoft Corporation
>
>
> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: Monday, May 19, 2014 5:12 AM
> To: Philip Jägenstedt
> Cc: Jerry Smith (WINDOWS); Bob Lund; Paul Cotton; public-html-admin@w3.org;
> Pierre-Anthony Lemieux
> Subject: Re: HTML WG Note publication of sourcing in-band media resources
>
> On Mon, May 19, 2014 at 10:02 PM, Philip Jägenstedt <philipj@opera.com>
> wrote:
> > On Mon, May 19, 2014 at 1:29 PM, Silvia Pfeiffer
> > <silviapfeiffer1@gmail.com> wrote:
> >> On Mon, May 19, 2014 at 7:22 PM, Philip Jägenstedt <philipj@opera.com>
> >> wrote:
> >
> >>> Finally, does ISO BMFF have SDH (subtitles for the deaf or
> >>> hard-of-hearing) as a separate flag from the subtitle and captions
> >>> kinds, or is it possible to assign an arbitrary number of kinds to a
> >>> track? Either way it doesn't sound like it maps 1:1 to the HTML
> >>> track kinds.
> >>
> >> That's what I tried to say: since the ISO BMFF 'SDH' track contains
> >> both 'SDH' and 'subtitles' cues, it should be mapped to both a
> >> @kind='captions' track and also a @kind='subtitles' track where the
> >> cues that are marked to be for SDH only are removed.
> >
> > Are the individual cues really marked with that metadata? If they
> > aren't, then exposing such a single track with kind 'captions' seems
> > like the correct mapping.
>
> I was under that impression, but I haven't been able to confirm this.
> Maybe somebody else with actual MPEG4 specs can confirm / refute that
> assumption?
>
> Cheers,
> Silvia.
>
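The mapping proposed in the quoted exchange above (one in-band 'SDH' track
exposed as both a 'captions' track and a 'subtitles' track with the
SDH-only cues removed) could be sketched like this. It rests entirely on
the unconfirmed assumption that individual cues carry such a marker; the
'sdh_only' flag and the function name split_sdh_track are hypothetical and
do not come from ISO BMFF or any other spec.

```python
# Sketch: expose one in-band SDH track as two HTML text tracks.
# ASSUMPTION: each cue carries a per-cue 'sdh_only' flag. Whether ISO BMFF
# actually marks cues this way was not confirmed in the thread.
def split_sdh_track(cues):
    """Return a dict with the cue lists for the two exposed track kinds.

    'captions' keeps every cue; 'subtitles' drops cues flagged SDH-only.
    Cues without the flag are treated as ordinary subtitle cues.
    """
    captions = list(cues)
    subtitles = [c for c in cues if not c.get("sdh_only", False)]
    return {"captions": captions, "subtitles": subtitles}


cues = [
    {"text": "Hello."},
    {"text": "[door slams]", "sdh_only": True},
    {"text": "Who's there?"},
]
tracks = split_sdh_track(cues)
print(len(tracks["captions"]), len(tracks["subtitles"]))
```

If cues turn out not to be marked individually, the split is not possible
and exposing a single track with kind 'captions' is the simpler mapping.
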

Received on Wednesday, 28 May 2014 00:45:47 UTC