[whatwg] Timed tracks: feedback compendium

Another note on WebSRT:

Seeing the addition of the <bdi> element to HTML, we probably also
need to add it to WebSRT cue-level markup to allow bidirectional
text formatting.
http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-bdi-element

Cheers,
Silvia.


On Wed, Oct 27, 2010 at 10:54 PM, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:
> On Wed, Oct 27, 2010 at 8:53 PM, Philip Jägenstedt <philipj at opera.com> wrote:
>> On Fri, 22 Oct 2010 13:49:00 +0200, Silvia Pfeiffer
>> <silviapfeiffer1 at gmail.com> wrote:
>>
>>> On Fri, Oct 22, 2010 at 10:18 PM, Simon Pieters <simonp at opera.com> wrote:
>>>>
>>>> On Fri, 22 Oct 2010 13:09:24 +0200, Philip Jägenstedt <philipj at opera.com>
>>>> wrote:
>>>>
>>>>>> Using <!-- --> is a bad idea since the WebSRT syntax already uses -->. I
>>>>>> don't see the need for multiline comments.
>>>>>
>>>>> Right. If we must have comments I think I'd prefer /* ... */ since both
>>>>> CSS and JavaScript have it, and I can't see that single-line comments
>>>>> will be easier from a parser perspective.
>>>>
>>>> Line comments seem better from a compat perspective (commented-out
>>>> stuff wouldn't appear as cues in legacy parsers).
>>>
>>> Philip's research earlier from this thread was as follows:
>>>
>>> ; appears at the beginning of lines in 15/10000 files and most don't look
>>> like they're intended as comments.
>>>
>>> # appears at the beginning of lines in 244/10000 files and most don't look
>>> like they're intended as comments.
>>>
>>> /* only appears in 3/10000 files, so CSS-style comments might work, but
>>> it does add some complexity.
>>>
>>> // appears at the beginning of lines in 5/10000 files and most look like
>>> they *are* intended as comments or are garbage, so it should work.
>>>
>>> (data from OpenSubtitles sample)
>>>
>>> which seems to support the choice of //.
>>
>> Note that this was assuming that WebSRT should be an extension of SRT. If
>> that's not true, we can choose more freely.
>>
>>> I do wonder what the lines that start with ; or # contained though.
>>
>> ; look mostly like typos, sometimes where " was intended.
>>
>> # seems to have been mostly used as some kind of emphasis, with # sentences
>> like this #
>>
>> Note that lots of the files are in languages and encodings unknown to me,
>> so my guesses shouldn't be taken too seriously. It's obvious that if WebSRT
>> is an extension of SRT (which I no longer think is a good idea), then *some*
>> content will break.
>
>
> I recently came across the mpsub format, see
> http://www.mplayerhq.hu/DOCS/tech/mpsub.sub . It has name-value pairs
> at the start for file-wide metadata and uses # for comments. (It also
> has a weird time stamp format which I would ignore.) Actually, the
> name-value pairs make sense to me, and we could use the # for comments
> as an analogy to scripting languages, where # is often the sign for
> comments. OTOH we could use // and /* */ in analogy with C/C++ for
> comments, which would cover both single-line and multi-line comments
> and thus be more flexible.
>
> Silvia.
>
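
For anyone who wants to repeat the kind of tally quoted above on their
own sample of subtitle files, something along these lines might do. It
is only a rough sketch in Python and not the script used for the
OpenSubtitles numbers: the function name, the marker list and the
in-memory input are all assumptions made for illustration. It also only
looks at the start of each line, whereas the quoted /* figure appears to
count any occurrence in a file.

```python
# Candidate comment markers discussed in the thread (assumed list).
MARKERS = (";", "#", "/*", "//")


def count_files_with_marker(files):
    """Count, per marker, how many files have a line starting with it.

    `files` is an iterable of strings, each holding the already decoded
    text of one subtitle file (hypothetical input format).
    """
    counts = {marker: 0 for marker in MARKERS}
    for text in files:
        seen = set()
        for line in text.splitlines():
            stripped = line.lstrip()
            for marker in MARKERS:
                if stripped.startswith(marker):
                    seen.add(marker)
        # Each file is counted at most once per marker.
        for marker in seen:
            counts[marker] += 1
    return counts
```

Whether a matching line was meant as a comment still has to be judged by
eye, as in the quoted research; the count only says how much existing
content a given marker could affect.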

Received on Friday, 5 November 2010 05:32:26 UTC