- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Wed, 27 Oct 2010 22:54:59 +1100
On Wed, Oct 27, 2010 at 8:53 PM, Philip Jägenstedt <philipj at opera.com> wrote:
> On Fri, 22 Oct 2010 13:49:00 +0200, Silvia Pfeiffer
> <silviapfeiffer1 at gmail.com> wrote:
>
>> On Fri, Oct 22, 2010 at 10:18 PM, Simon Pieters <simonp at opera.com> wrote:
>>>
>>> On Fri, 22 Oct 2010 13:09:24 +0200, Philip Jägenstedt <philipj at opera.com>
>>> wrote:
>>>
>>>>> Using <!-- --> is a bad idea since the WebSRT syntax already uses -->.
>>>>> I don't see the need for multiline comments.
>>>>
>>>> Right. If we must have comments I think I'd prefer /* ... */ since both
>>>> CSS and JavaScript have it, and I can't see that single-line comments
>>>> will be easier from a parser perspective.
>>>
>>> Line comments seem better from a compat perspective (you wouldn't get
>>> commented out stuff appear as cues in legacy parsers).
>>
>> Philip's research earlier from this thread was as follows:
>>
>> ; appears at the beginning of lines in 15/10000 files and most don't look
>> like they're intended as comments.
>>
>> # appears at the beginning of lines in 244/10000 files and most don't look
>> like they're intended as comments.
>>
>> /* only appears in 3/10000 files, so CSS-style comments might work, but
>> does add some complexity
>>
>> // appears at the beginning of lines in 5/10000 files and most look like
>> that *are* intended as comments or are garbage, so it should work.
>>
>> (data from OpenSubtitles sample)
>>
>> which seems to support the choice of //.
>
> Note that this was assuming that WebSRT should be an extension of SRT. If
> that's not true, we can choose more freely.
>
>> I do wonder what the lines that start with ; or # contained though.
>
> ; look mostly like typos, sometimes where " was intended.
>
> # seems to have been mostly used as some kind of emphasis, with # sentences
> like this #
>
> Note, that lots of the files are in languages and encodings unknown to me,
> so my guesses shouldn't be taken too seriously. It's obvious that if WebSRT
> is an extension of SRT (which I no longer think is a good idea), then *some*
> content will break.

I recently came across the mpsub format, see
http://www.mplayerhq.hu/DOCS/tech/mpsub.sub . It has name-value pairs
at the start for file-wide metadata and uses # for comments. (It also
has a weird time stamp format, which I would ignore.)

Actually, the name-value pairs make sense to me, and we could use # for
comments by analogy with scripting languages, where # is often the
comment marker.

OTOH we could use // and /* */ by analogy with C/C++, which would cover
both single-line and multi-line comments and thus be more flexible.

Silvia.
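
To make the // and /* */ option concrete, here is a minimal Python sketch of a pre-processing pass that drops such comments before cue parsing. It is an illustration only, not part of any WebSRT proposal: the function name `strip_comments` and the rule that a comment is recognised only at the start of a line are assumptions, chosen so that cue text containing "//" (a URL, say) and the "-->" timing arrow are left untouched.

```python
def strip_comments(text: str) -> str:
    """Drop // line comments and /* ... */ block comments.

    Assumption (not from any spec): a comment is only recognised when its
    marker is the first non-whitespace text on a line.
    """
    kept = []
    in_block = False
    for line in text.splitlines():
        if in_block:
            # Inside a block comment: discard up to and including "*/".
            end = line.find("*/")
            if end != -1:
                in_block = False
                rest = line[end + 2:]
                if rest.strip():
                    kept.append(rest)
            continue
        stripped = line.lstrip()
        if stripped.startswith("//"):
            continue  # single-line comment
        if stripped.startswith("/*"):
            end = stripped.find("*/", 2)
            if end == -1:
                in_block = True  # comment continues on later lines
            else:
                rest = stripped[end + 2:]
                if rest.strip():
                    kept.append(rest)
            continue
        kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "// translator's note",
    "1",
    "00:00:01,000 --> 00:00:02,000",
    "See http://example.com",
    "",
    "/* disabled cue",
    "2",
    "00:00:03,000 --> 00:00:04,000",
    "Hidden */",
    "3",
    "00:00:05,000 --> 00:00:06,000",
    "Visible",
])
cleaned = strip_comments(sample)
```

In this sketch the second cue disappears entirely because it sits inside the block comment. A legacy SRT parser without such a pass would still see it as a cue, which is the compatibility concern raised above about multi-line comments.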
Received on Wednesday, 27 October 2010 04:54:59 UTC