Re: [Draft 4] Transition request for VTT to Candidate rec. from Nigel Megitt on 2018-01-12 (public-tt@w3.org from January 2018)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Fri, 12 Jan 2018 19:04:52 +0000
To: David Singer <singer@apple.com>, TTWG <public-tt@w3.org>
CC: Philippe Le Hegaret <plh@w3.org>, Silvia Pfieffer <silviapfeiffer1@gmail.com>
Message-ID: <D67E9D3D.52810%nigel.megitt@bbc.co.uk>
In the section that describes the CG/WG working model, it would probably
be helpful for the transition to explain the current delta between the CG
and the WG versions of the document (which may be none). Feels to me that
this would be good practice in this CG/WG model, and we're forging the way
there. Happy to hear other views.

The SOTD section still mentions WD - there's standard boilerplate for CR
SOTD that just needs to be used - it can be left to the Editor I think to
do that.

The minutes don't include a resolution about the exit criteria so we will
need to make sure that is covered when we do resolve to transition to CR.
What we did say though was that we'd put the WPT results in a single
Implementation Report for simplicity of review.

On the MAUR requirements:


[CC-6] Is it worth pointing out that WebVTT does not support overlapping
regions? (not that such a requirement is explicitly called out in the
MAUR, or even exists, just that it is a constraint if I understand
correctly)

[CC-13] not sure how caption background colours can be kept visible when
there's no caption text visible? This may be a lack of understanding on my
part, but I'd be interested to know.

[CC-14] (paint on) and [CC-27] cannot both be delivered together if I've
read the responses correctly - in other words there is no model for
appending words to cues in a live environment.
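
To make the [CC-14] point concrete, here is a minimal Python sketch of how paint-on might be emulated inside a single cue with WebVTT's in-cue timestamp tags. The helper names (`format_timestamp`, `paint_on_cue`) and the sample words and timings are hypothetical, chosen only for illustration. The whole cue is still emitted as one unit, which is the constraint described above for live delivery.

```python
from __future__ import annotations


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def paint_on_cue(start: float, end: float,
                 timed_words: list[tuple[float, str]]) -> str:
    """Build one cue whose words are revealed progressively.

    timed_words holds (time, word) pairs. Every word after the first is
    preceded by an in-cue timestamp tag, so a renderer can treat the
    following text as not yet reached until that time.
    """
    parts = []
    for i, (t, word) in enumerate(timed_words):
        if i == 0:
            parts.append(word)
        else:
            parts.append(f"<{format_timestamp(t)}>{word}")
    header = f"{format_timestamp(start)} --> {format_timestamp(end)}"
    return header + "\n" + " ".join(parts)


# Hypothetical sample data: three words painted on over one second.
cue = paint_on_cue(1.0, 4.0, [(1.0, "Hello"), (1.5, "there,"), (2.0, "world")])
print("WEBVTT\n\n" + cue)
```

All the in-cue timestamps have to be known when the cue is written, so this sketch does not address appending words to a cue that has already been delivered.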

Kind regards,

Nigel
 


On 12/01/2018, 16:57, "singer@apple.com on behalf of David Singer"
<singer@apple.com> wrote:

>Hi guys
>
>Updated based on reviewing the minutes.
>
>To proceed further:
>"to proceed with the CR transition request (a) response from the
>commenter, or feb 15th, whichever is sooner, (b) the revised transition
>request and (c) a document prepared as a rec-track document (not CG
>report)" <https://www.w3.org/2018/01/10-tt-minutes.html#item50>
>
>* we need to be sure we've resolved WR comments and reached Feb 15th or
>consensus with the commenter;
>* we need a Rec track document prepared with an updated SOTD
>* and we need the final details in this transition request
>* and the formal resolution of the WG
>
>
>(Based on a recent TTML advancement request).
>
>* * * *
>
>Dear Director, Philippe,
>
>This is a Transition Request from the Timed Text WG for publication of a
>Candidate Recommendation of the "WebVTT: The Web Video Text Tracks
>Format".
>
>Transition details below.
>
>
>1. Boilerplate ...
>
>* Document title: WebVTT: The Web Video Text Tracks Format
>
>* Document URI:
>
>https://w3c.github.io/webvtt/ (currently formatted as a CG community
>report, but this is an editorial issue easily handled)
>
>
>Latest published version:
>
>https://w3c.github.io/webvtt/
>
>
>* Estimated publication date: TBD
>
>* Abstract:
>This specification defines WebVTT, the Web Video Text Tracks format. Its
>main use is for marking up external text track resources in connection
>with the HTML <track> element. WebVTT files provide captions or subtitles
>for video content, and also text video descriptions [MAUR], chapters for
>content navigation, and more generally any form of metadata that is
>time-aligned with audio or video content.
>
>* SotD:
>
>Work on this specification is being undertaken both in the Web Media Text
>Tracks Community Group as well as in the W3C Timed Text Working Group.
>The latter group works towards a W3C Recommendation for reference
>purposes with interoperability requirements, while the former is a Draft
>Community Group Report that continues to evolve.
>
>This document was published by the W3C Timed Text Working Group as a
>Working Draft. This document is intended to become a W3C Recommendation.
>If you wish to make comments regarding this document, please send them to
>public-tt@w3.org (subscribe, archives) with [webvtt] at the start of your
>email's subject. All comments are welcome.
>
>Publication as a Working Draft does not imply endorsement by the W3C
>Membership. This is a draft document and may be updated, replaced or
>obsoleted by other documents at any time. It is inappropriate to cite
>this document as other than work in progress.
>
>This document was produced by a group operating under the 5 February 2004
>W3C Patent Policy. W3C maintains a public list of any patent disclosures
>made in connection with the deliverables of the group; that page also
>includes instructions for disclosing a patent. An individual who has
>actual knowledge of a patent which the individual believes contains
>Essential Claim(s) must disclose the information in accordance with
>section 6 of the W3C Patent Policy.
>
>This document is governed by the 1 March 2017 W3C Process Document.
>
>For this specification to exit the CR stage, at least 2 independent
>implementations of every feature defined in this specification need to be
>documented in the implementation report which will include the
>WebPlatformTests results at https://wpt.fyi/webvtt. The WPT assumes a
>browser context and many implementations are not such, so input will be
>garnered from these other implementations manually; the implementation
>report may also be based on implementer-provided test results for the
>exit criteria test suite. The Working Group does not require that
>implementations are publicly available but encourages them to be so.
>
>The Working Group has not identified features "at risk" for this
>specification; there are some features not widely implemented yet but the
>group considers them important and not droppable.
>
>Substantive changes since FPWD
>
>see 
><https://www.w3.org/wiki/TimedText/WebVTT_Wide_Review#Substantive_changes_since_.28Second.29_Wide_Review_Review>
>
>2. Record of the WG's Decision to request the CR Transition:
>
>
>TBD
>
>
>3. Evidence that the document satisfies Group's Requirements:
>
>The media accessibility user requirements were defined for this
>specification in the Timed Text Working Group's charter at
>
>https://www.w3.org/2016/05/timed-text-charter.html
>
>"• Should address the Media Accessibility User Requirements."
>https://www.w3.org/TR/media-accessibility-reqs/
>
>
>[CC-1] Render text in a time-synchronized manner, using the media
>resource as the timebase master.
>- satisfied
>
>[CC-2] Allow the author to specify erasures, i.e., times when no text is
>displayed on the screen (no text cues are active).
>- satisfied
>
>[CC-3] Allow the author to assign timestamps so that one caption/subtitle
>follows another, with no perceivable gap in between.
>- satisfied
>
>[CC-4] Be available in a text encoding.
>- satisfied
>
>[CC-5] Support positioning in all parts of the screen - either inside the
>media viewport but also possibly in a determined space next to the media
>viewport. This is particularly important when multiple captions are on
>screen at the same time and relate to different speakers, or when
>in-picture text is avoided.
>- satisfied, but the captioning area has to be part of a media viewport.
>It's not legal to paint outside one's viewport.
>
>[CC-6] Support the display of multiple regions of text simultaneously.
>- satisfied
>
>[CC-7] Display multiple rows of text when rendered as text in a
>right-to-left or left-to-right language.
>- satisfied
>
>[CC-8] Allow the author to specify line breaks.
>- satisfied
>
>[CC-9] Permit a range of font faces and sizes.
>- satisfied
>
>[CC-10] Render a background in a range of colors, supporting a full range
>of opacity levels.
>- satisfied
>
>[CC-11] Render text in a range of colors. The user should have final
>control over rendering styles like color and fonts; e.g., through user
>preferences. 
>- satisfied
>
>[CC-12] Enable rendering of text with a thicker outline or a drop shadow
>to allow for better contrast with the background.
>- satisfied (possibly only in CSS UAs?)
>
>[CC-13] Where a background is used, it should be possible to keep the
>caption background visible even in times where no text is displayed, such
>that it minimizes distraction. However, where captions are infrequent the
>background should be allowed to disappear to enable the user to see as
>much of the underlying video as possible.
>- satisfied, under author control
>
>[CC-14] Allow the use of mixed display styles - e.g., mixing paint-on
>captions with pop-on captions - within a single caption cue or in the
>caption stream as a whole. Pop-on captions are usually one or two lines
>of captions that appear on screen and remain visible for one to several
>seconds before they disappear. Paint-on captions are individual
>characters that are "painted on" from left to right, not popped onto the
>screen all at once, and usually are verbatim. Another often-used caption
>style in live captioning is roll-up - here, cue text follows double
>chevrons ("greater than" symbols), and is used to identify different
>speakers. Each sentence "rolls up" to about three lines. The top line of
>the three disappears as a new bottom line is added, allowing the
>continuous rolling up of new lines of captions. When displaying captions
>using the paint-on style, it is important to ensure that the final words
>that are displayed are visible for enough time for them to be read.
>- paint-on is an artefact of old caption-creation and delivery systems.
>VTT and modern systems can emulate paint-on, but cues are delivered as a
>unit, not character-by-character
>
>[CC-15] Support positioning such that the edge of the captions is a
>sufficient distance from the nearest screen edge to permit readability
>(e.g., at least 1/12 of the total screen height above the bottom of the
>screen, when rendered as text in a right-to-left or left-to-right
>language). 
>- satisfied
>
>[CC-16] Use conventions that include inserting left-to-right and
>right-to-left segments within a vertical run (e.g. Tate-chu-yoko in
>Japanese), when rendered as text in a top-to-bottom oriented language.
>- satisfied
>
>[CC-17] Represent content of different natural languages. In some cases
>the inclusion of a few foreign words forms part of the original
>soundtrack, and thus needs to be so indicated in the caption. Also allow
>for separate caption files for different languages and on-the-fly
>switching between them. This is also a requirement for subtitles. See
>also [CC-20]
>- satisfied
>
>[CC-18] Represent content of at least those specific natural languages
>that may be represented with [Unicode 3.2], including common
>typographical conventions of that language (e.g., through the use of
>furigana and other forms of ruby text).
>- satisfied
>
>[CC-19] Present the full range of typographical glyphs, layout and
>punctuation marks normally associated with the natural language's
>print-writing system.
>- satisfied to the extent Unicode does this.
>
>[CC-20] Permit in-line mark-up for foreign words or phrases.
>- satisfied
>
>[CC-21] Permit the distinction between different speakers.
>- satisfied
>
>[CC-22] Support captions that are provided inside media resources as
>tracks, or in external files.
>- satisfied; WebVTT can be delivered as a text file, or as a track in an
>MP4 file
>
>[CC-23] Ascertain that captions are displayed in sync with the media
>resource. 
>- satisfied
>
>[CC-24] Support user activation/deactivation of caption tracks.
>- this is a feature of the system that displays
>
>[CC-25] Support both edited and verbatim captions when available.
>- this is a question of labelling caption streams in the encapsulating
>environment
>
>[CC-26] Support multiple tracks of foreign-language subtitles including
>multiple subtitle tracks in the same foreign language.
>- this is a feature of the environment
>
>[CC-27] Support live-captioning functionality.
>- satisfied; VTT files can be delivered line at a time, if needed, as
>there are no 'bracketing' constructs
>
>[CC-28] Enable the bounding box of the background area to be extended by
>a preset distance relative to the foreground text contained with that
>background area.
>- satisfied
>
>
>[ECC-1] Support metadata markup for (sections of) timed text cues.
>- satisfied
>
>[ECC-2] Support hyperlinks and other activation mechanisms for
>supplementary data for (sections of) caption text.
>- satisfied
>
>[ECC-3] Support text cues that may be longer than the time available until
>the next text cue, thus providing overlapping text cues. In such
>instances, users should be enabled to decide whether they prefer to see
>overlapping text, or automatically shorten display time, or to have the
>media resource paused while the caption is displayed. Timing should be
>provided by the author, but the user should always be able to override
>the author's timings.
>- satisfied, but there is no practical way for users to override timings
>in any caption system known
>
>[ECC-4] Support timed text cues that are allowed to overlap with each
>other in time and be present on screen at the same time (e.g., those that
>come from the speech of different individuals). Also support timed text
>cues that are not allowed to overlap, so that playback will be paused in
>order to allow users to catch up with their reading.
>- satisfied
>
>[ECC-5] Allow users to define the reading speed and thus define how long
>each text cue requires, and whether media playback needs to pause
>sometimes to let them catch up on their reading.
>- this is a feature of the player rather than the caption expression
>language
>
>
>
>4. Evidence Dependencies With Other Groups Met:
>
>This specification has been extensively sent for review to external
>groups, most notably MPEG and FOMS, and they have not expressed any
>comment on dependencies. There are no changes in dependencies. The FOMS
>group and 3GPP SA4 are not listed in the charter, but have been kept
>informed.
>
>
>5. Evidence that the document has received Wide Review:
>
>Two extensive rounds of wide review were conducted, as documented in
>https://www.w3.org/wiki/TimedText/WebVTT_Wide_Review
>
>The implementations page (9, below) also gives evidence of review (by the
>implementers at least). The FOMS community
><http://www.foms-workshop.org/foms2017/> also has discussed and reviewed
>VTT though they are not a formal organization or in liaison.
>
>6. Evidence that Issues Have Been Formally Addressed:
>
>The tables in the wide review page, and the linked GitHub issues and
>BugZilla bugs, show the dispositions. As this is an active specification,
>questions and issues continue to be filed, but we believe all wide review
>and important issues have been considered.
>
>
>7. Objections:
>
>There have been no Formal Objections from a TTWG Member or other parties,
>during the preparation of this specification.
>
>There are [[at least]] two issues raised where the commenter does not
>agree:
>
>https://github.com/w3c/webvtt/issues/370 - the commenter would like
>timestamps not to insist on three digits after the decimal point
>https://github.com/w3c/webvtt/issues/372 - the commenter wishes the
>default background to be 100% black, not 80%
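
As an aside on the first of these issues, the three-digit rule can be sketched as a small Python check. This is only an illustration under my reading of the timestamp syntax (optional hours of two or more digits, minutes and seconds from 00 to 59, then exactly three fractional digits), not text from the specification; the names `TIMESTAMP` and `is_valid_timestamp` are hypothetical.

```python
import re

# Assumed syntax: optional hours (2+ digits), minutes and seconds (00-59),
# a full stop, then exactly three fractional digits.
TIMESTAMP = re.compile(r"(?:[0-9]{2,}:)?[0-5][0-9]:[0-5][0-9]\.[0-9]{3}")


def is_valid_timestamp(text: str) -> bool:
    """Return True if text matches the assumed WebVTT timestamp syntax."""
    return TIMESTAMP.fullmatch(text) is not None


# The last two samples are the kind of shortened form the commenter would
# like to be accepted; under the three-digit rule they are rejected.
for sample in ["00:01.000", "01:02:03.450", "00:01.5", "00:01"]:
    print(sample, is_valid_timestamp(sample))
```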
>
>
>8. Features marked as "at risk":
>
>The Working Group has not identified features "at risk" for this
>specification.
>
>
>9. Implementation Information:
>
>There is existing information on implementation (here) which will be
>updated in the CR period.
>
>Please see the Working Group's implementation report at
><https://www.w3.org/wiki/TimedText/EffortsAndSpecifications#WebVTT-based_efforts_and_specifications>
>
>which includes the references to the web platform tests
>https://github.com/w3c/web-platform-tests/tree/master/webvtt and the
>results, and the canIUse information at <https://caniuse.com/#feat=webvtt>
>
>Only some implementations are browser-hosted; some are polyfills and some
>are independent, standalone, implementations in other players. Reporting
>of feature coverage by the non-browser implementations will be done
>manually during the CR period.
>
>
>10. Patent Disclosures: none
>
>https://www.w3.org/2004/01/pp-impl/34314/status ("No patent disclosures
>have been made for any specifications of this group.")
>
>
>This document is governed by the 1 March 2017 W3C Process Document.
>
>
>Regards,
>
>David Singer, Chair of the Timed Text Working Group.
>Thierry Michel, Team Contact for the Timed Text Working Group.
>
>
Received on Friday, 12 January 2018 19:05:20 UTC