
[Bug 12076] Wishlist: line-based parser

From: <bugzilla@jessica.w3.org>
Date: Tue, 15 Feb 2011 09:44:57 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PpHSr-0007ye-HD@jessica.w3.org>

Philip Jägenstedt <philipj@opera.com> changed:

           What    |Removed                     |Added
                 CC|                            |philipj@opera.com

--- Comment #1 from Philip Jägenstedt <philipj@opera.com> 2011-02-15 09:44:56 UTC ---
I'm calling this a wishlist item because it is editorial. Still, here's my

When I wrote a JavaScript implementation of the earlier WebSRT parser, I found
it quite hard to follow the steps because CRLF handling is sprinkled
throughout, and I even found a spec bug related to it (already fixed). Of course
the spec should be precise, down to every single byte, about what should happen,
but I'm hoping that can be achieved with a line-based parser as well.

If it's not obvious, by a line-based parser I mean one which first splits the
input into lines and then hands those lines to a second step. This wouldn't
harm streaming, because AFAICT no cues will be output from the parser before
CRLF or EOF is encountered anyway.
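
To make the two-step idea concrete, here is a minimal sketch in Python. It is
not the spec's algorithm: the names split_lines and group_blocks are
hypothetical, and treating CRLF, LF and a lone CR all as line terminators is an
assumption made for illustration. The first step turns incoming text chunks
into lines without waiting for the whole input; the second step only ever sees
lines and never has to think about CR or LF.

```python
from typing import Iterable, Iterator, List


def split_lines(chunks: Iterable[str]) -> Iterator[str]:
    """Step 1 (hypothetical): yield lines from a stream of text chunks.

    Assumption: CRLF, LF and a lone CR each terminate a line.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        start = 0
        i = 0
        while i < len(buffer):
            ch = buffer[i]
            if ch == "\n":
                yield buffer[start:i]
                i += 1
                start = i
            elif ch == "\r":
                if i + 1 == len(buffer):
                    # A trailing CR may be the first half of a CRLF that is
                    # split across chunks, so wait for more input.
                    break
                yield buffer[start:i]
                i += 2 if buffer[i + 1] == "\n" else 1
                start = i
            else:
                i += 1
        buffer = buffer[start:]
    # EOF: flush whatever is left, dropping a pending lone CR.
    if buffer.endswith("\r"):
        yield buffer[:-1]
    elif buffer:
        yield buffer


def group_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Step 2 (hypothetical): group consecutive non-empty lines into blocks."""
    block: List[str] = []
    for line in lines:
        if line:
            block.append(line)
        elif block:
            yield block
            block = []
    if block:
        yield block


chunks = ["WEBVTT\r", "\n\r\n00:00", ":01.000 --> 00:00:02.000\nHi\rBye"]
for block in group_blocks(split_lines(chunks)):
    print(block)
```

Because both steps are generators, a block is produced as soon as the blank
line after it (or EOF) has been seen, so splitting out the line handling does
not delay output.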

I dare say this makes it more likely that implementations of WebVTT in
high-level languages like JavaScript and Python will actually follow the spec,
since operating on lines is considerably easier to understand for a format like
WebVTT. If you go and look for random SRT parsers, I think you'll find that
most work like this. (The ones I've written do, anyway.)

The spec is already mostly line-based; I'm just suggesting that the
line-splitting be separated out from the rest to improve readability. Do as you

Received on Tuesday, 15 February 2011 09:44:59 UTC
