Re: meta-data in the VTT file header, a strawman proposal

On Fri, Apr 20, 2012 at 10:59 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> > On Fri, Apr 20, 2012 at 12:01 AM, David Singer <singer@apple.com> wrote:
> >> Yes, I think we want some readability.  And maybe assuming all values
> >> could be on one line is bad.
> >
> > (I agree with the first, but the second isn't a problem: any JSON document
> > can be stored on one line.)
>
> I believe David wanted to make sure values don't *have* to be on one
> line, which your proposal seemed to suggest?
>

I mean that the second assumption is safe: with JSON, every value *can* be
stored on a single line.  Newlines within a string are encoded as \n (per
JSON), and there's no inherent limit to how long a line can be.
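
As a minimal sketch of that point, the snippet below uses Python's standard
json module to show that a multi-line string serializes without any literal
newline.  The sample value is a made-up CSS fragment, not something from the
proposal.

```python
import json

# Hypothetical multi-line header value (a small block of CSS).
value = "p {\n  color: red;\n}"

# json.dumps escapes the newlines as \n, so the result is one physical line.
encoded = json.dumps(value)
print(encoded)

# Decoding restores the original text, newlines included.
restored = json.loads(encoded)
```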

But it's the first (readability) that's the problem--single-line JSON is
ugly and hard to read with a text editor, even though it would work.

> > Key: value
> >  value
> >   value2
> >
> > would decode to "value\nvalue\n value2".
>
> Incidentally, multi-line values in HTTP headers also require white space at the
> start.
>

Right, but there it's one or more of any whitespace, so you can't tell how
much whitespace was there to begin with.  If the marker is exactly one space
(as in patches), it doesn't have that problem.
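
A rough sketch of the "exactly one space" rule, for illustration only.  The
function name decode_folded and the list-of-lines input are assumptions, and
this is not how WebVTT currently parses headers.

```python
def decode_folded(lines):
    """Decode 'Key: value' followed by continuation lines.

    Assumption: every continuation line starts with exactly one marker
    space, which is stripped; any further whitespace is part of the value.
    """
    key, _, first = lines[0].partition(": ")
    parts = [first]
    for line in lines[1:]:
        if not line.startswith(" "):
            raise ValueError("continuation lines must start with a space")
        parts.append(line[1:])  # drop only the single marker space
    return key, "\n".join(parts)


key, value = decode_folded(["Key: value", " value", "  value2"])
print(repr(value))  # 'value\nvalue\n value2'
```

Because only one space is consumed, the leading space in front of "value2"
survives decoding, which a "one or more whitespace" rule could not guarantee.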

> I am not sure we can, though, without changing the parsing of WebVTT.
>

We could use a simple escape mechanism: if a header value line begins with
a backslash, remove it.  So, you get the following:

Key: |
line 1
\.
\
\\
.

In effect, "\." at the start of a line represents a period, "\" on a line
by itself represents a blank line, and "\\" at the start of a line is a
single backslash (escaping the escape character), all defined by a single
trivial rule instead of a list of escape sequences.
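
The rule can be sketched as a tiny decoder.  This is only an illustration:
decode_block is a hypothetical name, and it assumes the value lines follow a
"Key: |" line and end at a line containing only ".", as in the example above.

```python
def decode_block(lines):
    """Decode the value lines that follow a 'Key: |' header line.

    Assumptions: a line consisting of just '.' terminates the block, and a
    single leading backslash on any other line is removed.
    """
    out = []
    for line in lines:
        if line == ".":
            return "\n".join(out)  # no newline is added after the last line
        if line.startswith("\\"):
            line = line[1:]  # the one trivial rule: strip one backslash
        out.append(line)
    raise ValueError("missing terminating '.' line")


# The example block from this message, written as Python string literals.
decoded = decode_block(["line 1", "\\.", "\\", "\\\\", "."])
print(repr(decoded))  # 'line 1\n.\n\n\\'
```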

This means--unless I'm missing a case--that *any* valid block of UTF-8 text
will round-trip.  We don't strictly need that now (the only use case so far
for multiline comments is CSS), but it seems like a useful property to have
going forward.

Also, these are infrequent enough that they wouldn't uglify source very
much.

(One tangential detail: the final newline before the terminating "." line
should not be included in the resulting header data, or else it would be
impossible to encode a string that doesn't end with a newline.)
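
To check the round-trip property informally, here is a sketch of an encoder
paired with a matching decoder.  Both function names are hypothetical, and
the choice of which lines to escape (blank lines, a lone ".", and lines
starting with a backslash) is an assumption read off the example, with "\n"
taken as the only line separator.

```python
def encode_block(text):
    """Encode text as value lines ending with a '.' terminator line."""
    lines = []
    for line in text.split("\n"):
        # Escape anything the decoder would otherwise misread.
        if line == "" or line == "." or line.startswith("\\"):
            line = "\\" + line
        lines.append(line)
    lines.append(".")
    return lines


def decode_block(lines):
    """Inverse of encode_block: strip one leading backslash per line."""
    out = []
    for line in lines:
        if line == ".":
            # Joining with "\n" leaves out the final newline before ".".
            return "\n".join(out)
        out.append(line[1:] if line.startswith("\\") else line)
    raise ValueError("missing terminating '.' line")


samples = ["no trailing newline", "trailing newline\n", "line 1\n.\n\n\\", ""]
results = [decode_block(encode_block(s)) for s in samples]
print(results == samples)  # True
```

Since the newline before the terminator is not part of the data, a string
that ends with a newline is encoded with an extra "\" line, and one that does
not is encoded without it.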

-- 
Glenn Maynard

Received on Saturday, 21 April 2012 04:40:07 UTC