- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Tue, 24 Apr 2012 10:46:45 +1000
- To: David Singer <singer@apple.com>
- Cc: Glenn Maynard <glenn@zewt.org>, public-texttracks@w3.org
On Tue, Apr 24, 2012 at 10:10 AM, David Singer <singer@apple.com> wrote:
> On Apr 23, 2012, at 16:00 , Glenn Maynard wrote:
>
> On Sat, Apr 21, 2012 at 1:10 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
> wrote:
>>
>> Something like:
>>
>> 1.
>> Name-value pairs of header metadata are given with a name-string
>> separated from the value by a colon.
>> No control characters or separators are allowed in the name value.
>> No white space is allowed between the name and the colon (?).
>>
>> 2.
>> If the value is a single "|" character, the value is multi-line,
>> starting on the next line and ends with a line that only contains a
>> single dot.
>> The newline just before the dot-line is also not part of the value.
>>
>>
>> A quick-and-dirty ABNF could be:
>>
>> metadata-header = field-name ":" field-value
>> field-name = token
>> field-value = ("|" *TEXT CRLF "." CRLF) | (*TEXT without CRLF)
>
>
> (I'm not sure; it looks more or less right, but reading ABNFs has always
> given me a headache.)
>
> Some other details:
>
> Presumably, whitespace between the colon and a single-line value would be
> stripped, eg.
>
> Key: Value
>
> would result in "Key" = "Value". If you have significant leading whitespace
> in the value you want to preserve, or if you need to encode the string "|"
> itself, then switch to the block format:
>
> Key: |
> Value
> .
>
> Key: |
> |
> .
>
>
> Yes. Then all we need to add is
> "In multi-line values, a line that either (a) starts with the escape
> character

There is no escape character. I don't think we need one.

> or (b) is blank (safer, visually blank)

We can't do blank lines or we break the WEBVTT parsing algorithm. I
think we will have to just accept that WebVTT headers can't have blank
lines.

> or (c) consists of the
> termination sequence (a single period) must be escaped by having a "\"
> pre-pended. On receipt, gather the lines up to the final terminator (".")
> and remove all leading "\" characters.

We haven't introduced an escape character at this stage. The only
place where we'd need one is if we really needed a multi-line value
with a "." on a single line. Is this case sufficiently likely to have
to deal with it? Is there a way around it with a UTF-8 char?

> If we want total flexibility, remove the line-break before the "." line, so
> you can end without the line-end character if you want to (you can always
> put it back explicitly with an escaped blank line).

I think that's not so easy to parse and visually see as just "." on a
line by itself. And it's easy to forget it at the end of a line, so
I'd rather just have it there on a line by itself.

Regards,
Silvia.
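
For illustration only, here is a rough Python sketch of how the rules
discussed in this thread could be parsed: "name: value" pairs, with a
value of "|" switching to a block that runs until a line containing
only ".". The function name parse_header_metadata is hypothetical, the
name check is a simplification of "token" (it only rejects empty names
and whitespace), and nothing here is taken from the WebVTT
specification. It assumes the input has already been split into lines
and, as argued above, has no escape character.

```python
def parse_header_metadata(lines):
    """Parse header metadata lines into a dict (illustrative sketch only)."""
    result = {}
    it = iter(lines)
    for line in it:
        name, sep, rest = line.partition(":")
        if not sep:
            raise ValueError(f"missing colon in header line: {line!r}")
        # Simplified stand-in for "token": non-empty and free of whitespace,
        # which also rules out white space between the name and the colon.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid field name: {name!r}")
        # Whitespace between the colon and a single-line value is stripped.
        value = rest.lstrip()
        if value == "|":
            # Block format: collect lines up to a line that is only ".".
            block = []
            for body in it:
                if body == ".":
                    break
                block.append(body)
            else:
                raise ValueError(f"unterminated multi-line value for {name!r}")
            # Joining with "\n" leaves out the newline before the dot-line.
            value = "\n".join(block)
        result[name] = value
    return result


if __name__ == "__main__":
    sample = ["Key: Value", "Note: |", "  indented text", "|", "."]
    print(parse_header_metadata(sample))
```

Without an escape mechanism, a block value cannot contain a line that
is exactly ".", which is the one case questioned above.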
Received on Tuesday, 24 April 2012 00:47:33 UTC