- From: Bob Lund <B.Lund@CableLabs.com>
- Date: Tue, 3 Jun 2014 21:48:37 +0000
- To: Paul Cotton <Paul.Cotton@microsoft.com>, "public-html-admin@w3.org" <public-html-admin@w3.org>
- CC: Pierre-Anthony Lemieux <pal@sandflow.com>, Kilroy Hughes <Kilroy.Hughes@microsoft.com>, Philip Jägenstedt <philipj@opera.com>, "Jerry Smith (WINDOWS)" <jdsmith@microsoft.com>, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com>
Chairs and HTML Admin Group,

I was wondering if you've concluded your evaluation of my response.

Thanks,
Bob Lund

On 5/27/14, 6:26 PM, "Paul Cotton" <Paul.Cotton@microsoft.com> wrote:

>From Silvia's response to this week's WG Weekly agenda:
>http://lists.w3.org/Archives/Public/public-html-wg-announce/2014AprJun/0016.html
>
>>> 7. Any other business
>>>
>>> a) HTML extension spec for sourcing in-band tracks
>>> http://lists.w3.org/Archives/Public/public-html-admin/2014May/0030.html
>>
>>The particular question I have for this is: how are we going to get it
>>published under a w3.org URL?
>>
>>We have contributed the spec to the W3C github account at
>>https://github.com/w3c/HTMLSourcingInbandTracks
>>so it is available at
>>http://rawgit.com/w3c/HTMLSourcingInbandTracks/master/index.html
>
>The Chairs and Team provided our initial response in:
>http://lists.w3.org/Archives/Public/public-html-admin/2014May/0030.html
>in which we recommended publishing this material using the model adopted
>for the "Media Source Extensions Byte Stream Format Registry".
>
>The Chairs and Team are now evaluating Bob's response to our initial
>response:
>http://lists.w3.org/Archives/Public/public-html-admin/2014May/0034.html
>and the fact that the Media TF has an open bug on how to update such a
>Registry:
>https://www.w3.org/Bugs/Public/show_bug.cgi?id=25581
>
>Unfortunately, due to other commitments, the Chairs and Team have not been
>able to discuss this matter since Bob sent us his response. We hope to do
>so no later than early next week.
>
>/paulc
>HTML WG co-chair
>
>Paul Cotton, Microsoft Canada
>17 Eleanor Drive, Ottawa, Ontario K2E 6A3
>Tel: (425) 705-9596 Fax: (425) 936-7329
>
>
>-----Original Message-----
>From: Kilroy Hughes
>Sent: Monday, May 19, 2014 12:02 PM
>To: Silvia Pfeiffer; Philip Jägenstedt
>Cc: Jerry Smith (WINDOWS); Bob Lund; Paul Cotton;
>public-html-admin@w3.org; Pierre-Anthony Lemieux
>Subject: RE: HTML WG Note publication of sourcing in-band media resources
>
>The ISO Base Media File Format Part 30 (ISO/IEC 14496-30) defines
>subtitle tracks (which are inclusive of captions, SDH, description,
>translation, graphics such as glyphs and signing, etc.).
>
>It doesn't say anything about Kinds, or have a similar field in the
>standard track header and sample description.
>
>Both TTML and WebVTT storage are defined.
>I know TTML has generic metadata tags, but not a specific method of
>identifying presentation objects such as <p> and <div> according to Kind,
>nor any standardized concept of a sub-track.
>You would know better whether WebVTT content and readers conform to a
>sub-track or Kind tagging method corresponding to two HTML tracks in the
>same text file/track.
>
>In the case of DASH streaming timed text and graphics subtitles (ISO/IEC
>23009-1) stored as Part 30 movie fragments, the manifest (Media
>Presentation Description, MPD) may include optional Role Descriptor
>elements that are intended to function like Kind to describe Adaptation
>Sets that result in tracks when streamed in an HTML5 browser using MSE.
>The DASH standard was completed prior to the W3C Kind specification, so it
>defines a slightly different vocabulary than the one eventually settled on
>by the W3C. It also allows multiple Role Descriptors, because an Adaptation
>Set (track) may fit multiple descriptions, such as "Main" or "Alternate"
>and "Description" or "Translation".
The Role descriptor uses a URI/URN
>to identify the vocabulary and syntax contained in the descriptor, so it
>is extensible beyond the vocabulary defined in the DASH standard.
>
>An additional Accessibility Descriptor is specified in the DASH MPD schema
>to allow automatic selection of audio, video, and TTML tracks for users
>with visual, hearing, cognitive, etc. impairments. A URI/URN can be
>selected that labels these tracks with identifiers established by
>regulation, broadcast TV, etc., such as "SDH" for Subtitles for the Deaf and
>Hard of hearing. Even if a player does not recognize the particular
>URI/URN or descriptive term used in this Descriptor, it can make a
>default selection when a user preference setting indicates an impairment,
>based on the presence of the Accessibility Descriptor, language
>attribute, etc. A track may also have a Descriptor indicating "alternate" or
>similar, but that would not be very useful to someone who is visually
>impaired, or to a standard player trying to find an audio
>description track.
>
>Selection of an Adaptation Set, and of a Representation contained in it, for
>adaptive streaming involves evaluating attributes that identify codec,
>video resolution or audio track configuration, language, frame rate,
>bitrate, etc., in addition to the Role or Kind. An Adaptation Set contains
>perceptually equivalent content, but possibly multiple Representations
>that are encoded differently to enable rapid switching to compensate for
>variation in network throughput. The intent is that Media Segments
>adaptively selected and sequenced from different Representations within
>an Adaptation Set will appear to be a continuous track on playback, so
>they share the same Role Descriptor. Although it is possible, it is
>unlikely that a Subtitle Adaptation Set will contain more than one
>Representation.
>
>A single AdaptationSet element (track) may be described by, e.g.,
one
>Accessibility Descriptor and two Role Descriptors, indicating that a TTML
>track was character-coded Hiragana for children and blind readers of touch
>devices, and was descriptive, so also suitable for hearing-impaired
>Japanese readers. An alternative AdaptationSet (track) could be described by
>both Accessibility and Role descriptors as painted Kanji glyphs,
>more appropriate for adult hearing-impaired readers, and more typical of
>the majority of the world's cursive writing systems and of subtitles used
>on movies, video discs, and broadcast.
>
>Although there can be multiple descriptions of a track, there is no
>provision for multiple "sub-tracks" within a single TTML (or WebVTT?)
>Adaptation Set or ISO Media track.
>
>There is one special case to consider, which is binary captions
>encapsulated in AVC/HEVC elementary streams. A video track will act like
>two tracks when broadcast content containing, e.g., CEA-608, CEA-708, or
>Teletext captions is played on a device with the appropriate caption
>decoder(s). These include iOS devices, game consoles, set-top boxes, TVs,
>etc. It would be useful to identify whether these broadcast captions are
>present and to turn them on/off, but that may be in the scope of W3C
>groups working on tuner APIs, etc.
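[The Role-to-Kind correspondence Kilroy describes can be sketched in code. This is an illustrative mapping only: the scheme URI is the DASH `urn:mpeg:dash:role:2011` vocabulary, but the specific value-to-`@kind` pairings below are one possible choice, not anything normative from the thread.]

```javascript
// Sketch: map a DASH Role descriptor (schemeIdUri + value) to an
// HTML5 text track @kind. Unknown schemes or values fall back to
// "metadata", since a player cannot interpret them.
const DASH_ROLE_SCHEME = "urn:mpeg:dash:role:2011";

function roleToKind(schemeIdUri, value) {
  if (schemeIdUri !== DASH_ROLE_SCHEME) {
    return "metadata"; // extensible vocabularies we don't recognize
  }
  switch (value) {
    case "caption":
      return "captions";
    case "subtitle":
      return "subtitles";
    case "description":
      return "descriptions";
    default:
      return "metadata"; // e.g. "main", "alternate" carry no direct Kind
  }
}
```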
>
>Kilroy Hughes | Senior Digital Media Architect | Windows Azure Media
>Services | Microsoft Corporation
>
>
>-----Original Message-----
>From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>Sent: Monday, May 19, 2014 5:12 AM
>To: Philip Jägenstedt
>Cc: Jerry Smith (WINDOWS); Bob Lund; Paul Cotton;
>public-html-admin@w3.org; Pierre-Anthony Lemieux
>Subject: Re: HTML WG Note publication of sourcing in-band media resources
>
>On Mon, May 19, 2014 at 10:02 PM, Philip Jägenstedt <philipj@opera.com>
>wrote:
>> On Mon, May 19, 2014 at 1:29 PM, Silvia Pfeiffer
>> <silviapfeiffer1@gmail.com> wrote:
>>> On Mon, May 19, 2014 at 7:22 PM, Philip Jägenstedt <philipj@opera.com>
>>> wrote:
>>
>>>> Finally, does ISO BMFF have SDH (subtitles for the deaf or
>>>> hard-of-hearing) as a separate flag from the subtitle and captions
>>>> kinds, or is it possible to assign an arbitrary number of kinds to a
>>>> track? Either way, it doesn't sound like it maps 1:1 to the HTML
>>>> track kinds.
>>>
>>> That's what I tried to say: since the ISO BMFF 'SDH' track contains
>>> both 'SDH' and 'subtitles' cues, it should be mapped to both a
>>> @kind='captions' track and also a @kind='subtitles' track where the
>>> cues that are marked to be for SDH only are removed.
>>
>> Are the individual cues really marked with that metadata? If they
>> aren't, then exposing such a single track with kind 'captions' seems
>> like the correct mapping.
>
>I was under that impression, but I haven't been able to confirm it.
>Maybe somebody else with actual MPEG-4 specs can confirm / refute that
>assumption?
>
>Cheers,
>Silvia.
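[Silvia's proposed mapping of a single in-band 'SDH' track to two HTML text tracks can be sketched as follows. The per-cue `sdhOnly` flag is hypothetical: as the exchange above notes, it is unconfirmed whether ISO BMFF cues actually carry such metadata, and without it Philip's single `captions` track would be the correct exposure.]

```javascript
// Sketch of Silvia's suggestion: expose one in-band SDH track as
//   - a @kind='captions' track containing every cue, and
//   - a @kind='subtitles' track with the SDH-only cues removed.
// `sdhOnly` is a hypothetical per-cue marker (not confirmed in ISO BMFF).
function splitSdhTrack(cues) {
  return {
    captions: cues,                          // all cues, including sound cues
    subtitles: cues.filter((c) => !c.sdhOnly), // dialogue-only subset
  };
}
```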
Received on Tuesday, 3 June 2014 21:50:12 UTC