
[whatwg] Timed tracks: feedback compendium

From: Simon Pieters <simonp@opera.com>
Date: Fri, 22 Oct 2010 11:45:24 +0200
Message-ID: <op.vky1dyccidj3kv@simon-pieterss-macbook.local>

On Fri, 22 Oct 2010 11:21:44 +0200, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:

> Since the attributes in <track> are a hint, probably what is available
> in the file should overrule what is in the <track> attributes. It is
> the same for the @charset attribute, which is overruled to utf-8 for
> WebSRT IIRC.

No, charset="" overrules the encoding for WebSRT per spec.


>>> * add a means to add comments
>>>
>>> e.g.
>>> // Lines starting with // are comments
>>
>> So far the web has two comment syntaxes: <!-- SGML style --> and
>> /* CSS style */, so if we need comments I think we should pick one of
>> these.

Actually there are three more in JavaScript:

// line comment
<!-- line comment
--> line comment

http://wiki.whatwg.org/wiki/Web_ECMAScript#HTML_comments


> I'm not fussed. I thought your analysis pointed to //, which is also
> nicer because it takes the full line into account without a need for
> end tags. Also, it is common from C++ and other programming languages.
> But I don't really mind - we just need a decision and reasons for why.

Using <!-- --> is a bad idea since the WebSRT syntax already uses -->. I  
don't see the need for multiline comments.
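
A minimal sketch of the conflict, assuming a lenient parser that recognizes
timing lines simply by looking for the "-->" arrow (the function names here
are hypothetical and not taken from the spec):

```python
def looks_like_timing(line: str) -> bool:
    """Lenient check (an assumption): any line containing the arrow is a timing line."""
    return "-->" in line


def is_line_comment(line: str) -> bool:
    """Proposed line-comment syntax: the line starts with //."""
    return line.lstrip().startswith("//")


# A real timing line is detected by the arrow.
print(looks_like_timing("00:00:01,000 --> 00:00:02,000"))
# An SGML-style comment also contains the arrow, so it collides.
print(looks_like_timing("<!-- translator note -->"))
# A // comment does not contain the arrow and needs no end tag.
print(is_line_comment("// translator note"), looks_like_timing("// translator note"))
```

A stricter timing-line check would avoid the false match, but a // comment
sidesteps the question entirely.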



>> Anyway, I agree that at least a magic header like "WebSRT" is needed
>> because of the horrors of legacy SRT parsing.

I don't see why we can't just consume legacy SRT and support it in WebSRT.
Part of the point of WebSRT is to support the legacy. If we don't want
to support the legacy, then the format can be made a lot cleaner.


>> Breaking SRT compat means that we can go back to requiring UTF-8 as
>> the encoding. However, UTF-8 does complicate the magic header a bit
>> due to the possibility of a BOM [1]. While it would be nice to forbid
>> the use of a BOM, I expect we'd then see lots of frustration from
>> authors whose editors automatically insert it...
>>
>> [1] http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
>
> I'm happy to enforce UTF-8 on WebSRT. The @charset can work for other
> formats. I didn't know about the BOM problem - but having read it, I
> would think it makes sense to forbid it. What tools do and how they
> deal with erroneous files is a different matter.

Forbidding it is what would cause the frustration. Consider editing a
WebSRT file in Notepad, and then suddenly it doesn't work anymore. Instead
we should allow the BOM. (WebSRT already allows the BOM.)
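
As a rough sketch of what allowing the BOM could look like if a magic header
were adopted: skip an optional UTF-8 BOM, then compare the first line. The
header string "WebSRT" is only the proposal quoted above, and
has_magic_header is a hypothetical helper, not anything from the spec.

```python
import codecs

MAGIC = "WebSRT"  # hypothetical magic header, as proposed in the thread


def has_magic_header(data: bytes) -> bool:
    """Return True if the first line is the magic header, ignoring a UTF-8 BOM."""
    # Editors such as Notepad may prepend the three BOM bytes EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Take the first line only, tolerating both LF and CRLF line endings.
    first_line = data.split(b"\n", 1)[0].rstrip(b"\r")
    return first_line.decode("utf-8", errors="replace").strip() == MAGIC


print(has_magic_header(b"\xef\xbb\xbfWebSRT\r\n\r\n00:00:01.000 --> 00:00:02.000\r\nHi\r\n"))
print(has_magic_header(b"1\n00:00:01,000 --> 00:00:02,000\nHi\n"))
```

With this approach a file saved with a BOM and one saved without are treated
the same, so the author's editor choice does not matter.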

-- 
Simon Pieters
Opera Software
Received on Friday, 22 October 2010 02:45:24 UTC
