- From: Pierre-Anthony Lemieux <pal@sandflow.com>
- Date: Wed, 24 Jun 2015 20:33:59 -0700
- To: "public-tt@w3.org" <public-tt@w3.org>
Hi all,

In preparation for our call and as discussed previously, below is a draft for a potential liaison to Unicode suggesting the addition of subtitle/captioning character sets to CLDR.

Looking forward to the discussion.

Best,

-- Pierre

"""
The W3C Timed Text Working Group (TTWG) [1] develops specifications for subtitle and caption delivery worldwide, including dialog language translation, content description, captions for deaf and hard of hearing, etc. It has, in the process, collected sets of characters (for selected locales) that have proven useful in practice for subtitling and captioning. These sets, documented at [2], are derived in part from the analysis of home video content.

[1] http://www.w3.org/AudioVideo/TT/
[2] https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml-ww-profiles/ttml-ww-profiles.html#recommended-unicode-code-points-per-language

The CLDR Core Data specifies sets of commonly used letters and punctuation (main, punctuation, numbering system...) for individual locales. The TTWG notes that these sets do not include all characters used in practice for subtitling/captioning, e.g. the QUARTER NOTE (U+2669) character.

TTWG suggests that Unicode consider adding to CLDR sets of characters useful for subtitling and captioning applications. These sets would evolve as new locales are added and existing locales are refined, and could be referenced by TTWG and other organizations, enhancing the chances that subtitles/captions are presented correctly across systems.

The page at [3] details the suggested subtitle/captioning character sets for a number of selected locales. Each set is a superset of the CLDR main, punctuation and numbers sets for the given locale. For reference, blue-shaded cells indicate characters that are already included in the latter. While it is possible to produce sets that exclude the CLDR main, punctuation and numbers sets, such sets are probably more difficult to review.

[3] http://sandflow.com/public/cldr/imsc-codepoint-table.htm

TTWG is available to provide additional information and looks forward to hearing from, and working with, the Unicode Consortium.
"""
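
The superset relationship described in the draft can be sketched in a few lines of Python. This is only an illustration: the sample sets below are made-up placeholders, not actual CLDR or TTWG data, and the helper names (`is_superset_of_cldr`, `extra_characters`) are hypothetical.

```python
import unicodedata

# Placeholder data for a single locale (NOT real CLDR exemplar sets).
cldr_main = set("abcdefghijklmnopqrstuvwxyz")
cldr_punctuation = set(".,!?-")
cldr_numbers = set("0123456789")

# Hypothetical subtitle/captioning set: everything above plus QUARTER NOTE.
subtitle_set = cldr_main | cldr_punctuation | cldr_numbers | {"\u2669"}


def is_superset_of_cldr(candidate, *cldr_sets):
    """Return True if candidate contains every character of every CLDR set."""
    return all(candidate >= s for s in cldr_sets)


def extra_characters(candidate, *cldr_sets):
    """Return the characters in candidate that no CLDR set already covers."""
    covered = set().union(*cldr_sets)
    return candidate - covered


if __name__ == "__main__":
    sets = (cldr_main, cldr_punctuation, cldr_numbers)
    print(is_superset_of_cldr(subtitle_set, *sets))
    for ch in sorted(extra_characters(subtitle_set, *sets)):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The characters returned by `extra_characters` correspond to the cells that would not be blue-shaded in a table such as the one at [3].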
Received on Thursday, 25 June 2015 03:34:49 UTC