- From: John Foliot <jfoliot@stanford.edu>
- Date: Mon, 22 Feb 2010 00:18:00 -0800 (PST)
- To: 'Philip Jägenstedt' <philipj@opera.com>, "'Dick Bulterman'" <Dick.Bulterman@cwi.nl>
- Cc: "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>, "'Geoff Freed'" <geoff_freed@wgbh.org>, <public-html-a11y@w3.org>, <markku.hakkinen@gmail.com>, <symm@w3.org>
Hi All,

Earlier Silvia asked if others would be responding. I've been following this thread closely and have some thoughts and opinions to add to the discussion.

Philip Jägenstedt wrote:
> I think this is actually a fair summary and applies to both smilText and
> DFXP with the modification "don't think we'll implement them in the
> immediate future". Other implementors may feel differently.

Is this to indicate that Opera will likely not support DFXP (or some variant thereof) in the immediate future? I think that support of some form of marked up transcript beyond SRT will be a requirement in the final recommendation, but that is but my opinion at this time.

> I think we should have one extremely simple format like SRT right now and
> eventually one on the far other end of the scale that can handle all
> current use cases and is extensible for the future in some fashion.

I think that "eventually" will be a lot closer than you might be giving it credit for. I am already producing a rudimentary DFXP file from the Stanford Captioning system - the file being auto-generated from accurate transcription text files. Since machines can generate XML marked files as easily as SRT, the question of 'ease of authoring' will likely become moot sooner rather than later. Given the obvious benefits of XML marked files w.r.t. the styling, accessibility, etc., there will likely be a move towards a richer markup of the transcripts early on, especially for large commercial producers. (Impending legislation such as H.R. 3101 will have a huge impact in the US on advancing this if/when passed into law.)

> However, there must be some order of priorities and I think reaching a
> consensus on a complex format and having inter-operable implementations
> shipped in several browsers is still years away.

...and so I would suggest that ensuring an XML marked transcript format, as a required format that browsers MUST support, is a high priority - no, IMHO it's critical. SmilText, full blown DFXP, or something different again; what is important is that *one* such format is supported out of the gate.

With regard to SRT: We simply cannot deny that there is already an ample body of evidence that shows that SRT has gained a foothold 'in the wild'. I wrote earlier of 'the kids' (not to imply any disparaging observations) who grabbed SRT and ran with it. These content creators are generating text from speech voluntarily, and 'time stamping' them by hand, or with very simple authoring tools that still involve a fair bit of hands-on manipulation. This cannot be ignored.

Right now, getting transcripts of any flavor is *WAY* more difficult than getting them in a specific format. Some days I'm almost ready to take the transcript scribbled on the back of a napkin, they are so rare. Thus any group who are already generating time-stamped files of any stripe have my support. For this reason alone, I must agree that support of SRT is important as we roll this out. Will it become obsolete fairly quickly? I suspect so, but by the time we have a web that offers captioned videos as a matter of course, if 20% of those timed transcripts are in SRT I won't complain - as SRT might just have had a hand in advancing the other 80%.

So, the real question as I see it is: Is SmilText going to be robust enough to last us/the web for at least 5 or 6 years (perhaps longer)? Is it going to be rich enough to support a higher end delivery requirement, knowing full well that the browser will always be the limitation here? While I am not fully versed in the full richness of DFXP, I do know that it is seen as overly heavy for what we need in this case. Silvia mentioned that perhaps a profile of DFXP tailored to how/what is being implemented in browsers (as opposed to other media players) would likely be of benefit here.

I pose this question to those closer to DFXP/SmilText - if we can only get one XML marked file format into the spec for now, what is our best bet? Is it SmilText, or do we need to wood-shed a bit more and come back with something else again, based upon DFXP, but 'skinnier'?

(If anyone is interested in seeing the auto-generated DFXP file I am generating, please contact me off list and I will forward a copy directly.)

JF
Received on Monday, 22 February 2010 08:18:36 UTC