Re: meta-data in the VTT file header, a strawman proposal

On Thu, Apr 19, 2012 at 8:51 PM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

>  The difference between JSON and the RFC822 *header* specification is
> that JSON requires quotation marks around strings and does not allow
> newlines in values.
>

There's a good bit more to it than that.  JSON is a much simpler
specification, despite having a broader vocabulary (though RFC-822-style
headers could certainly be specified much more simply than RFC-822 itself
specifies them, of course).

> Thus, IMO, RFC822 headers are actually an improvement over JSON.
> Similarly to JSON, everyone has a RFC822 header parser available. All
> values are inherently strings, but converted to their proper type by
> interpretation of the name.
>

(I'm not at all wedded to JSON, and I'm not even convinced it's the right
idea myself, but I think it's worthwhile to play out the idea.)

JSON parsers are far more ubiquitous than RFC-822 parsers.  You don't have
an RFC-822 parser in JavaScript, but you really do always have a JSON
parser.  All you'd need to do is read a line, split on the colon, and feed
the right-hand side to the JSON parser.
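
A minimal sketch of that, in Python for brevity.  The "name: value" line
shape and the function name parse_metadata_line are assumptions made for
illustration; nothing here is specified by WebVTT.

```python
import json

def parse_metadata_line(line):
    """Split one 'name: value' header line and decode the value as JSON."""
    # Split on the first colon only, so colons inside the value survive.
    name, sep, raw_value = line.partition(":")
    if not sep:
        raise ValueError("missing ':' in header line: %r" % line)
    # json.loads tolerates surrounding whitespace and raises ValueError
    # (json.JSONDecodeError) on anything that is not valid JSON.
    return name.strip(), json.loads(raw_value)
```

For example, parse_metadata_line('title: "Intro: part 1"') gives the pair
('title', 'Intro: part 1'); the colon inside the quoted string is left
alone because only the first colon is used as the separator.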

Existing RFC-822 parsers may implement unwanted features, like RFC-2047.
Python's "email" module supports things like "message/delivery-status",
which may expose behavior not wanted by WebVTT.  You'll never want to use a
stock parser; you'll need to implement your own, and WebVTT will want to
specify its own minimal subset of RFC-822 (it definitely couldn't just
reference RFC-822 and say "do what that says").  JSON doesn't have this
problem: you just read a line, split on the colon and feed the right-hand
side to a standard JSON parser.  Standard parsers and the existing specs
are all that's needed.
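
To illustrate the RFC-2047 point with Python's standard library: the
"email.header" helpers will decode encoded-words in a header value.  This
is only a sketch of the kind of behavior a stock parser can bring along;
the wrapper name decode_like_stock_parser is made up for the example.

```python
from email.header import decode_header, make_header

def decode_like_stock_parser(raw_value):
    """Decode RFC-2047 encoded-words the way Python's email package does."""
    # decode_header splits the value into (bytes-or-str, charset) parts;
    # make_header reassembles them into a Header that str() renders as text.
    return str(make_header(decode_header(raw_value)))
```

A value such as "=?utf-8?q?h=C3=A9llo?=" does not come back verbatim: it
is interpreted as an encoded-word and decoded, which a WebVTT header
probably should not do implicitly.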

JSON allows editors to edit and import strings directly, with no changes to
the data.  The text you import is the text you save; everything (at least,
all valid Unicode text) round-trips.  With RFC-822, you need to insert
leading whitespace before continuation lines, so it has trouble maintaining
this property.
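
A rough sketch of the difference, in Python.  The fold and unfold helpers
are a deliberately naive, hypothetical take on RFC-822 continuation lines
(unfolding drops a line break that is followed by whitespace); a real
implementation may differ in detail, but the loss is of the same kind.

```python
import json
import re

def json_round_trip(text):
    """Encode text as a one-line JSON string, then decode it again."""
    encoded = json.dumps(text)       # newlines become the escape \n
    assert "\n" not in encoded       # so the value fits on a single line
    return json.loads(encoded)

def fold(text):
    """Naive RFC-822-style folding: indent every continuation line."""
    return text.replace("\n", "\n ")

def unfold(folded):
    """Naive unfolding: drop a line break that precedes whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", folded)

STYLESHEET = "::cue {\n  color: red;\n}"
```

Here json_round_trip(STYLESHEET) returns the original string unchanged,
while unfold(fold(STYLESHEET)) comes back flattened onto one line with
altered whitespace, so the text you save is not the text you imported.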

(Still, it's ugly that embedding a stylesheet would end up looking messy in
a plain text editor.  You don't really want to flatten a whole stylesheet
into one line.  We should be able to find an approach with none of the
negatives: clean in plain text, round-trips all text, while remaining
simple...)

-- 
Glenn Maynard

Received on Friday, 20 April 2012 04:09:59 UTC