- From: Christian Vogler <christian.vogler@gallaudet.edu>
- Date: Tue, 8 Apr 2014 12:55:38 -0400
- To: Philip Jägenstedt <philipj@opera.com>
- Cc: "public-texttracks@w3.org" <public-texttracks@w3.org>, Ken Harrenstien <klh@google.com>, Jean-Baptiste Kempf <jb@videolan.org>
- Message-ID: <CAHVQVp2m6Pqi4hsYV+71DCuQZ3mZNGD_djT9SeaSEgPNHqxsjA@mail.gmail.com>
In CEA-608 they are defined in tables 5-10 (at least in the older revision 608a, which I got for cheap). Hi == 0x12 and low between 20-3f are Spanish, misc, and French special characters, Hi == 0x13 are Portuguese, German, and Danish characters. A full list also seems to be here: http://en.wikipedia.org/wiki/EIA-608#Characters

In practice - for the US CEA 608 to WebVTT conversion - only 0x12 20 to 0x12 2f are relevant. The mapping for these to Unicode code points is encapsulated in this table that I use, indexed starting from 0x20 through 0x2f:

    /* CC non-Latin1 code mappings */
    static const uint16_t specialchar[] = {
        174 /* ® */, 176 /* ° */, 189 /* ½ */, 191 /* ¿ */,
        0x2122 /* ™ */, 162 /* ¢ */, 163 /* £ */, 0x266A /* ♪ */,
        224 /* à */, TRANSP_SPACE, 232 /* è */, 226 /* â */,
        234 /* ê */, 238 /* î */, 244 /* ô */, 251 /* û */
    };

TRANSP_SPACE is a non-breaking space, and doesn't correspond to any Unicode code point.

For muxing, my understanding is that the CPC authoring tools do that kind of thing.

Christian

On Apr 8, 2014 7:57 AM, "Philip Jägenstedt" <philipj@opera.com> wrote:

> Thanks jb, that looks useful as a reference. Would you happen to know
> where the extended characters "THAT COME FROM HI BYTE=0x12 AND LOW
> BETWEEN 0x20 AND 0x3F" and "THAT COME FROM HI BYTE=0x13 AND LOW
> BETWEEN 0x20 AND 0x3F" are actually defined? Also, do you know of a
> tool to mux arbitrary 608 data with an existing caption-less file for
> testing?
>
> Philip
>
> On Tue, Apr 8, 2014 at 12:27 PM, Jean-Baptiste Kempf <jb@videolan.org>
> wrote:
> > VLC has a 608 decoder, that should support all 608, including roll-up
> > captions (a contrario from Xine) and tested with actual NTSC streams.
> > http://git.videolan.org/?p=vlc.git;a=blob;f=modules/codec/cc.c;hb=HEAD
> >
> > On 01 Apr, Christian Vogler wrote :
> >> There are also at least two open-source projects that have CEA-608 caption
> >> decoders - Xine (for the subset that is used on DVDs in the libspucc/
> >> source folder) and CCExtractor. Xine doesn't support roll-up captions,
> >> since they never appear on DVDs, but it handles pretty much everything
> >> else, including a file that Giovanni Galvez threw at me for testing a
> >> couple years ago.
> >>
> >> Christian
> >>
> >> On Mon, Mar 31, 2014 at 2:39 PM, Philip Jägenstedt <philipj@opera.com> wrote:
> >>
> >> > Thank you Ken!
> >> >
> >> > I remember now that SCC was one of the standalone 608 formats you
> >> > mentioned at FOMS. The raw essence is exactly what I'm interested in,
> >> > so that sounds very promising.
> >> >
> >> > I've asked to order a copy of "The Closed Captioning Handbook" for my
> >> > office, it looks very relevant to what I do.
> >> >
> >> > Philip
> >> >
> >> > On Tue, Apr 1, 2014 at 1:10 AM, Ken Harrenstien <klh@google.com> wrote:
> >> > > Giovanni Galvez is still there and still super helpful.
> >> > > Their software is widely used in the industry for format
> >> > > conversion.
> >> > >
> >> > > We host one of their demo videos at
> >> > > http://www.youtube.com/watch?v=BbqPe-IceP4
> >> > >
> >> > > and I'm sure Giovanni can send you the corresponding SCC
> >> > > files for that or any other demo video they have. The reason
> >> > > I suggest SCC is that this format contains the raw 608 essence
> >> > > that we care about; in fact, this is YouTube's preferred upload
> >> > > format for movie/TV content. If you want to know how to extract those
> >> > > bytes from a video file, then you have a much harder task given
> >> > > the multitude of video containers and formats.
> >> > >
> >> > > And yes, the CFR link, while terse, does contain pretty much all
> >> > > of the important bits. The CEA documents are mostly about
> >> > > XDS data, which has nothing to do with captions. For
> >> > > purposes of WebVTT conversion a much better place to start
> >> > > learning about 608 is the "Closed Captioning Handbook" by Gary Robson,
> >> > > which should still be available on Amazon. I like it because it's
> >> > > very readable and has so much other interesting context.
> >> > >
> >> > > On the other hand, if you plan to implement some kind of
> >> > > cable set-top box, then yes, you'll need the CEA documents
> >> > > plus several other specs.
> >> > >
> >> > > --Ken
> >> > >
> >> > > On Mon, Mar 31, 2014 at 8:58 AM, Philip Jägenstedt <philipj@opera.com>
> >> > > wrote:
> >> > >>
> >> > >> Do you mean http://www.cpcweb.com/webcasts/webcast_samples.htm ?
> >> > >>
> >> > >> What I'm looking for is the actual video file that contains the 608
> >> > >> data, preferably with some clue about how to extract it as well :)
> >> > >>
> >> > >> Philip
> >> > >>
> >> > >> On Mon, Mar 31, 2014 at 10:16 PM, Christian Vogler
> >> > >> <christian.vogler@gallaudet.edu> wrote:
> >> > >> > Gio Galvez at CPC did a video like that. His company was bought out,
> >> > >> > but it might still be possible to get access. Should I ask?
> >> > >> >
> >> > >> > Sent from my mobile phone. Please excuse any touchscreen-induced
> >> > >> > weirdness.
> >> > >> >
> >> > >> > On Mar 31, 2014 9:53 AM, "Philip Jägenstedt" <philipj@opera.com> wrote:
> >> > >> >>
> >> > >> >> Hi all,
> >> > >> >>
> >> > >> >> Does anyone have access to 608 caption data and recommendations for
> >> > >> >> software that is known to render it correctly? I'd like to understand
> >> > >> >> the 608 model at the lowest level, but it's hard without examples. I'm
> >> > >> >> guessing that people who have worked on 608 to WebVTT already have
> >> > >> >> sample files and scripts to process them, so anything like that would
> >> > >> >> be appreciated.
> >> > >> >>
> >> > >> >> Also, the spec is incredibly brief, is there really nothing better than
> >> > >> >> this?
> >> > >> >> http://edocket.access.gpo.gov/cfr_2007/octqtr/pdf/47cfr15.119.pdf
> >> > >> >>
> >> > >> >> Philip
> >>
> >> --
> >> Christian Vogler, PhD
> >> Director, Technology Access Program
> >> Department of Communication Studies
> >> SLCC 1116
> >> Gallaudet University
> >> http://tap.gallaudet.edu/
> >> VP: 202-250-2795
> >
> > --
> > With my kindest regards,
> >
> > --
> > Jean-Baptiste Kempf
> > http://www.jbkempf.com/ - +33 672 704 734
> > Sent from my Electronic Device
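
For readers who want to try the specialchar table from the top of this message, here is a minimal sketch of how it might be used when converting to WebVTT, which is UTF-8 text. It takes the indexing described in the message (low byte 0x20 through 0x2f) at face value. The value of TRANSP_SPACE, the substitution of U+00A0 for it, and the helper names special_to_codepoint and utf8_encode_bmp are all assumptions made for illustration; none of them come from the message or from any existing decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the message does not say what value TRANSP_SPACE has. */
#define TRANSP_SPACE 0xFFFF

/* CC non-Latin1 code mappings, copied from the message. */
static const uint16_t specialchar[] = {
    174 /* ® */, 176 /* ° */, 189 /* ½ */, 191 /* ¿ */,
    0x2122 /* ™ */, 162 /* ¢ */, 163 /* £ */, 0x266A /* ♪ */,
    224 /* à */, TRANSP_SPACE, 232 /* è */, 226 /* â */,
    234 /* ê */, 238 /* î */, 244 /* ô */, 251 /* û */
};

/* Map a low byte in 0x20..0x2f to a Unicode code point.
 * Returns 0 for bytes outside the table.
 * Assumption: the transparent space is emitted as U+00A0 (no-break space). */
static uint32_t special_to_codepoint(uint8_t lo)
{
    if (lo < 0x20 || lo > 0x2f)
        return 0;
    uint16_t cp = specialchar[lo - 0x20];
    return cp == TRANSP_SPACE ? 0x00A0u : cp;
}

/* Encode a Basic Multilingual Plane code point as UTF-8.
 * Writes 1 to 3 bytes into out and returns the number written.
 * Surrogates are not checked, since none occur in the table. */
static size_t utf8_encode_bmp(uint32_t cp, unsigned char out[3])
{
    assert(cp <= 0xFFFF);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

A decoder would call special_to_codepoint with the second byte of the pair and append the bytes from utf8_encode_bmp to the cue text. A real converter may prefer to treat the transparent space as a positioning hint rather than as a character.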
Received on Tuesday, 8 April 2014 16:56:03 UTC