[whatwg] Fwd: Discussing WebSRT and alternatives/improvements from Silvia Pfeiffer on 2010-08-11 (public-whatwg-archive@w3.org from August 2010)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 11 Aug 2010 23:09:34 +1000
Message-ID: <AANLkTimmNM=9AmAyn4Y0w2qw4L0VH1W4xfX9ArDw4WUr@mail.gmail.com>
On Wed, Aug 11, 2010 at 9:49 PM, Anne van Kesteren <annevk at opera.com> wrote:

> On Wed, 11 Aug 2010 13:35:30 +0200, Silvia Pfeiffer <
> silviapfeiffer1 at gmail.com> wrote:
>
>> On Wed, Aug 11, 2010 at 7:31 PM, Anne van Kesteren <annevk at opera.com>
>> wrote:
>>
>>> While players are transitioning to WebSRT they will ensure that they do
>>> not break with future versions of the format.
>>>
>>
>> That's impossible, since we do not know what future versions will look
>> like and what features we may need.
>>
>
> If that is impossible it would be impossible for HTML and CSS too. And
> clearly it is not.


HTML and CSS have predefined structures within which their languages grow
and are able to grow. WebSRT has newlines to structure the format, which is
clearly not very useful for extensibility. No matter how we turn this, the
XML background of HTML and the name-value background of CSS provide them
with in-built extensibility, which WebSRT does not have.



>
>> I'm pretty sure that several will break. We cannot just test a handful of
>> available applications and if they don't break assume none will. In fact,
>> all existing applications that get loaded with a WebSRT file with extended
>> features will display text with stuff that is not expected - in particular
>> if the "metadata" case is used. And wrong rendering is bad, e.g. if it's
>> part of a production process, burnt onto the video, and shipped to
>> hearing-impaired customers. Or stored in an archive.
>>
>
> Sure, that's why the tools should be updated to support the standard format
> instead rather than each having their own variant of SRT.
>

They don't have their own variant of SRT - they only have their own parsers.
Some will tolerate crap at the end of the "-->" line. Others won't. That's
no break of "conformance" to the basic "spec" as given in
http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all
interoperate on the basic SRT format. But they don't interoperate on the
WebSRT format. That's why WebSRT has to be a new format.
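
As a rough illustration of that parser difference (a hypothetical sketch,
not code from any actual player): both functions below accept the basic
timing line, but only the tolerant one accepts extra text after the end
timestamp. The names STRICT, TOLERANT and parse_timing are made up for this
example, and the trailing tokens in the sample line are only illustrative of
WebSRT-style cue settings.

```python
import re

# Basic SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
_TS = r"\d{2}:\d{2}:\d{2},\d{3}"

# Strict: nothing may follow the end timestamp.
STRICT = re.compile(rf"^({_TS}) --> ({_TS})$")
# Tolerant: anything after the end timestamp is ignored.
TOLERANT = re.compile(rf"^({_TS}) --> ({_TS})(?:\s+.*)?$")


def parse_timing(line, tolerant):
    """Return (start, end) as strings, or None if the line is rejected."""
    pattern = TOLERANT if tolerant else STRICT
    match = pattern.match(line.strip())
    return match.groups() if match else None


plain = "00:00:01,000 --> 00:00:04,000"
extended = "00:00:01,000 --> 00:00:04,000 A:start L:10%"  # illustrative settings

print(parse_timing(plain, tolerant=False))
print(parse_timing(extended, tolerant=False))
print(parse_timing(extended, tolerant=True))
```

Both parsers agree on the plain line. On the extended line the strict one
returns None (the cue is dropped or the file rejected), while the tolerant
one returns the two timestamps and silently ignores the rest.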



>
> (And if they really just take in text like that they should at least run
> some kind of validation so not all kinds of garbage can get in.)


That's not a requirement of the "spec". Its requirement is to render
whatever characters are given in cues. That's why it is so simple.



>
>>> I don't think so. It just makes things more complex for authors (learn two
>>> formats,
>>>
>>
>> I see that as an advantage: I can learn the simple format and be off to a
>> running start immediately. Then, when I find out that I need more
>> features, I can build on top of already existing knowledge for the richer
>> format and can convert my old files through a simple renaming of the
>> resources.
>>
>
> Or could you learn the simple format from a tutorial that only teaches that
> and when you see someone else using more complex features you can just copy
> and paste them and use them directly. This is pretty much how the web works.


Sure. All I need to do is rename the file. Not much trouble at all. That is
better than believing I can just copy stuff from others because it is
apparently the same format, only to have it break the working SRT
environment that I already have.



>
>>> have to convert formats (i.e. change mime) in order to use new features
>>> (which could be as simple as a <ruby> fragment for some Japanese track)
>>>
>>
>> If I know from the start that I need these features, I will immediately
>> learn WebSRT.
>>
>
> But you don't.


Why? If I write Japanese subtitles and my tutorial tells me they are not
supported in SRT, but only in WebSRT, then I go for WebSRT. Done.



>
>>> , more complex for implementors (need two separate implementations as to
>>> not encourage authors to use features of the more complex one in the less
>>> complex one), more complex for conformance checkers (need more code),
>>> etc.
>>> Seems highly suboptimal to me.
>>>
>>
>> That's already part of Ian's proposal: it already supports multiple
>> different approaches of parsing cues. No extra complexity here.
>>
>
> Actually that is not true. There is only one approach to parsing in Ian's
> proposal.



At the moment, cues can have one of two different types of content
(see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0
):

"6. The cue payload: either WebSRT cue text
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-text>
or WebSRT metadata text
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-metadata-text>."

So that means in essence two different parsers.
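
A minimal sketch of what "two parsers" means here, under stated assumptions:
the function names are hypothetical, the tag handling is a crude stand-in for
real cue text parsing, and how a consumer learns which kind of payload it has
is assumed to be passed in by the caller.

```python
import re

# Crude stand-in for cue text markup such as <i>, <b> or <ruby>.
_TAG = re.compile(r"</?[A-Za-z][^>]*>")


def parse_cue_text(payload):
    """Treat the payload as marked-up text; return the text to render."""
    return _TAG.sub("", payload)


def parse_metadata_text(payload):
    """Treat the payload as opaque data; hand it back unchanged."""
    return payload


def parse_payload(payload, kind):
    """Dispatch on the payload kind (assumed to be supplied by the caller)."""
    if kind == "metadata":
        return parse_metadata_text(payload)
    return parse_cue_text(payload)


print(parse_payload("<i>Hello</i> world", "subtitles"))
print(parse_payload('{"slide": 3}', "metadata"))
```

An existing SRT player knows neither path: it renders the payload characters
as given, so both the markup and the metadata would end up on screen.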



>
>> My theory is: we only implement support for WebSRT in the browser - that
>> it happens to also support SRT is a positive side effect. It works for the
>> Web - and it works for the existing SRT communities and platforms. They know
>> they have to move to WebSRT in the long run, but right now they can get
>> away with simple SRT support and still deliver for the Web. And they have a
>> growth path into a new file format that provides richer features.
>>
>
> This is the proposal. That they are the same format should not matter.


It matters to other applications, see
http://forum.doom9.org/showthread.php?p=1396576 . We should tolerate that.

Cheers,
Silvia.
Received on Wednesday, 11 August 2010 06:09:34 UTC