- From: Simon Pieters <simonp@opera.com>
- Date: Thu, 06 Oct 2011 10:58:29 +0200
On Thu, 06 Oct 2011 01:45:13 +0200, Ralph Giles <giles at mozilla.com> wrote:

> On 05/10/11 10:22 AM, Simon Pieters wrote:
>
>> I did some research on authoring errors in SRT timestamps to inform
>> whether WebVTT parsing of timestamps should be changed.
>
> This is completely awesome, thanks for doing it.
>
>> hours too many '(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+'
>> 834
>
> As Silvia mentioned, the WebVTT spec currently leaves the number of
> digits in the hour field as implementation defined, so long as it's at
> least two.
>
> I asked previously[1] if we could agree on and specify a limit. Would
> you mind checking what the histogram of digit numbers is in the hours
> field? Especially if you can separate cases like
>
>> 34500:24:01,000 --> 00:24:03,000
>
> either because the index is missing, or because the interval is
> negative (for which the WebVTT spec would reject the entire cue).

I don't know how many have a negative interval; I'd need to run a new
script over the 52,000,000 lines to find out. (If you want me to check
this, please contact me with details about what you want to count as
"negative interval".)

The cases where there were 3 or more digits in the hours field are
distributed as follows:

leading id
e.g. 10300:11:53,891 --> 00:11:56,155
33

hours set to 255 (these seem to all come from the same file and the
minutes are evenly distributed between 0 and 46; maybe the hours were
actually intended to be 00)
e.g. 255:46:18,058 --> 255:46:25,191
671

hours in the first timestamp much greater than the second timestamp
e.g. 244:00:13,320 --> 00:00:13,320
10

hours in the second timestamp much greater than the first timestamp
e.g. 00:00:33,010 --> 415:54:55,400
3

leading zero (in first and/or second timestamp)
e.g. 000:09:40,300 --> 00:09:45,519
150

other (garbage)
e.g. 8247,711,7nsuacer :56:20,0071:15 -->ddar vid18
9

> Cheers,
> -r
>
> [1]
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-September/033271.html

-- 
Simon Pieters
Opera Software
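For anyone who wants to reproduce a breakdown like the one above, here is a rough Python sketch. It is not the script that produced these counts. The first regex is the one quoted in the thread; `CUE_TIMING`, `classify`, `hours_digit_histogram` and the rules for telling the categories apart are assumptions made for illustration, so the results may not match the manual sorting.

```python
import re
from collections import Counter

# Pattern quoted in the thread: three or more digits in the hours field.
TOO_MANY_HOURS = re.compile(r'(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+')

# Hypothetical helper: a loosely parsed "h:m:s,f --> h:m:s,f" timing line.
CUE_TIMING = re.compile(
    r'^\s*(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)'
)

def classify(line):
    """Return a category name, or None if the hours field is not too long."""
    if not TOO_MANY_HOURS.search(line):
        return None
    m = CUE_TIMING.match(line)
    if m is None:
        return "other (garbage)"
    h1, h2 = m.group(1), m.group(5)
    # Assumed heuristics, checked in this order.
    if any(len(h) >= 3 and h.startswith("0") for h in (h1, h2)):
        return "leading zero"
    if h1 == "255" and h2 == "255":
        return "hours set to 255"
    if len(h1) > len(h2) and h1.endswith(h2):
        # e.g. "103" (cue id) fused with "00" (hours) gives "10300"
        return "leading id"
    if int(h1) > int(h2):
        return "first hours much greater"
    if int(h2) > int(h1):
        return "second hours much greater"
    return "other (garbage)"

def hours_digit_histogram(lines):
    """Count how many digits each hours field has in parseable timing lines."""
    hist = Counter()
    for line in lines:
        m = CUE_TIMING.match(line)
        if m:
            hist[len(m.group(1))] += 1
            hist[len(m.group(5))] += 1
    return hist

if __name__ == "__main__":
    samples = [
        "10300:11:53,891 --> 00:11:56,155",
        "255:46:18,058 --> 255:46:25,191",
        "000:09:40,300 --> 00:09:45,519",
    ]
    print(Counter(classify(s) for s in samples))
    print(hours_digit_histogram(samples))
```

The order of the checks matters: "000" also ends in "00", so the leading-zero test has to run before the leading-id test.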
Received on Thursday, 6 October 2011 01:58:29 UTC