- From: Glenn Maynard <glenn@zewt.org>
- Date: Thu, 27 Sep 2012 20:42:36 -0500
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Cc: David Singer <singer@apple.com>, Simon Pieters <simonp@opera.com>, public-texttracks <public-texttracks@w3.org>
- Message-ID: <CABirCh8_cHywhCmW_erbuMZS8o+9DYCshebC2kbsqe7Ci+Du1g@mail.gmail.com>
On Thu, Sep 27, 2012 at 7:06 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> If we finish the header area of a WebVTT file not on a blank line, but
> on the first valid cue, then we don't need to escape anything really,
> because it is quite unlikely to have a "time --> time" pattern in
> anything but a cue. We might want to escape "-->" if necessary, but
> that's all.

FWIW, it would be nicer to instead change the "-->" error recovery rule for the header loop to something more specific (a "-->" that isn't within a multiline header or on a "Key: .*" single-line header), and to use a unique line (eg. ".") to end the header. That avoids needing mid-line escapes (eg. "--&gt;"), so only a single escape mechanism is needed.

This can work if the parser knows about the format of headers, so the following (and variations) is parsable:

> Font: http://fonts.com/my-->font.ttf
> Style:
> .foo { bar: "a --> b"; };
>
> .foo2 { bar2 };
> .
> 00:01.000 --> 00:02.000
> text

If the parser understands the format of headers, it can figure out that we're not, in fact, breaking out of the header region and into cues on that blank line. It can understand that the first two "-->" are probably not mis-authored cues, since the first is on a single-line header and the second is in the middle of a header block. It can also detect that the last "-->" at the bottom *is* a mis-authored cue (that is, the blank line before the first cue is missing), since it's not within a header block.

This maintains the error recovery for the most common error (forgetting the blank line), and doesn't require escaping anything except a lone "." (and the quote itself).

However (as we've talked about before), this would require backwards-incompatible changes to the parser. Current parsers would drop out of the header loop at the first "-->", and if that wasn't there they'd drop out at the blank line. That's going to apply to anything that doesn't require escaping blank lines and/or "-->". (That's the reason we went down the other escaping path in the first place.)

--
Glenn Maynard
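For illustration, the header-aware loop described in this message could be sketched roughly as follows. This is only a sketch under assumptions, not the WebVTT parsing algorithm: the function name split_header, the HEADER_KEY pattern, and the rule that a multiline block runs until the lone "." line are all hypothetical choices made for the example.

```python
import re

# Hypothetical header syntax: "Key: value" on one line, or "Key:" alone
# to open a multiline block. Keys are assumed to start with a letter, so
# a cue timing line such as "00:01.000 --> 00:02.000" never matches.
HEADER_KEY = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.*)$")
TERMINATOR = "."  # the unique line proposed to end the header region


def split_header(lines):
    """Return (headers, index of the first line after the header region)."""
    headers = []
    block_key = None   # key of the multiline header being read, if any
    block = []         # lines collected for that multiline header

    for i, line in enumerate(lines):
        if line.rstrip() == TERMINATOR:
            if block_key is not None:
                headers.append((block_key, "\n".join(block)))
            return headers, i + 1

        if block_key is not None:
            # Inside a multiline header: blank lines and "-->" are content.
            block.append(line)
            continue

        match = HEADER_KEY.match(line)
        if match:
            key, value = match.groups()
            if value:
                # Single-line header: a "-->" in the value is harmless.
                headers.append((key, value))
            else:
                block_key, block = key, []
            continue

        if "-->" in line:
            # Error recovery: a "-->" outside any header is treated as a
            # cue whose preceding separator is missing.
            return headers, i

    # Input ended without a terminator (assumed: keep what was collected).
    if block_key is not None:
        headers.append((block_key, "\n".join(block)))
    return headers, len(lines)
```

Run on the example from the message, this keeps both header "-->" occurrences as header content and hands the line "00:01.000 --> 00:02.000" to the cue parser. A lone "." inside a multiline block would still need escaping, as noted above.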
Received on Friday, 28 September 2012 01:43:04 UTC