- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Fri, 22 Oct 2010 21:48:02 +1100
On Fri, Oct 22, 2010 at 8:45 PM, Simon Pieters <simonp at opera.com> wrote:
> On Fri, 22 Oct 2010 11:21:44 +0200, Silvia Pfeiffer
> <silviapfeiffer1 at gmail.com> wrote:
>
>> Since the attributes in <track> are a hint, probably what is available
>> in the file should overrule what is in the <track> attributes. It is
>> the same for the @charset attribute, which is overruled to utf-8 for
>> WebSRT IIRC.
>
> No, charset="" overrules the encoding for WebSRT per spec.

Hmm... that makes sense for legacy SRT files, but not for modern WebSRT files.

>>> Anyway, I agree that at least a magic header like "WebSRT" is needed
>>> because of the horrors of legacy SRT parsing.
>
> I don't see why we can't just consume the legacy and support it in WebSRT.
> Part of the point with WebSRT is to support the legacy. If we don't want to
> support the legacy, then the format can be made a lot cleaner.

I'd actually much prefer we make a clean new format that doesn't start with
having to deal with all the legacy of SRT. It can still be inspired by it
though so we don't have to change much. I'd be curious to hear what other
things you'd clean up given the chance.

>>> Breaking SRT compat means that we can
>>> go back to requiring UTF-8 as the encoding. However, UTF-8 does
>>> complicate the magic header a bit due to the possibility of a BOM [1].
>>> While it would be nice to forbid the use of a BOM, I expect we'd then
>>> see lots of frustration from authors who's editors automatically
>>> insert it...
>>>
>>> [1] http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
>>
>> I'm happy to enforce UTF-8 on WebSRT. The @charset can work for other
>> formats. I didn't know about the BOM problem - but having read it, I
>> would think it makes sense to forbid it. What tools do and how they
>> deal with erroneous files is a different matter.
>
> Forbidding it would be the frustration. Consider editing a WebSRT file in
> Notepad, and then suddenly it doesn't work anymore. Instead we should allow
> the BOM. (WebSRT already allows the BOM.)

OK, I missed that.

Cheers,
Silvia.
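
For illustration only, here is a minimal sketch of how a parser could accept an optional UTF-8 BOM in front of a magic header, which is the complication discussed above. The "WebSRT" magic string is just the header proposed in this thread, and the function name `has_magic_header` is hypothetical; neither is taken from any spec.

```python
import codecs

# Hypothetical magic header, as proposed in this thread (not from a spec).
MAGIC = b"WebSRT"


def has_magic_header(data: bytes) -> bool:
    """Return True if data starts with MAGIC, ignoring one leading UTF-8 BOM."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF that editors such
    # as Notepad may prepend when saving a file as UTF-8.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.startswith(MAGIC)
```

With this approach a file saved with a BOM and one saved without it are treated the same, while a legacy SRT file that starts with a cue number is rejected.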
Received on Friday, 22 October 2010 03:48:02 UTC