[whatwg] SRT research: timestamps

From: Simon Pieters <simonp@opera.com>
Date: Thu, 06 Oct 2011 10:58:29 +0200
Message-ID: <op.v2w9vrq3idj3kv@simon-pieterss-macbook.local>
On Thu, 06 Oct 2011 01:45:13 +0200, Ralph Giles <giles at mozilla.com> wrote:

> On 05/10/11 10:22 AM, Simon Pieters wrote:
>
>> I did some research on authoring errors in SRT timestamps to inform
>> whether WebVTT parsing of timestamps should be changed.
>
> This is completely awesome, thanks for doing it.
>
>> hours too many '(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+'
>> 834
>
> As Silvia mentioned, the WebVTT spec currently leaves the number of
> digits in the hour field as implementation defined, so long as it's at
> least two.
>
> I asked previously[1] if we could agree on and specify a limit. Would
> you mind checking what the histogram of digit numbers is in the hours
> field? Especially if you can separate cases like
>
>> 34500:24:01,000 --> 00:24:03,000
>
> either because the index is missing, or because the interval is
> negative (for which the WebVTT spec would reject the entire cue).

I don't know how many have a negative interval; I'd need to run a new
script over the 52,000,000 lines to find out. (If you want me to check
this, please contact me with details about what you want to count as a
"negative interval".)
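
One possible starting point for such a definition, as a minimal sketch only: treat the interval as negative when the end timestamp, converted to milliseconds, is earlier than the start timestamp. The names TIMING, to_ms and is_negative_interval and the pattern itself are assumptions made for illustration; they are not taken from the research script or from the WebVTT spec.

```python
import re

# Assumed shape of an SRT timing line: an hours field of any length,
# two-digit minutes and seconds, and three-digit milliseconds after a
# comma or period. This pattern is illustrative, not the one used in
# the research.
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured fields of one timestamp to milliseconds."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def is_negative_interval(line):
    """Return True if the end time is before the start time, False if not,
    and None if the line does not look like a timing line at all."""
    match = TIMING.search(line)
    if match is None:
        return None
    fields = match.groups()
    start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
    return end < start
```

Under this definition a zero-length interval (end equal to start) is not counted as negative; whether it should be is a separate choice.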

The cases where there were 3 or more digits in the hours field are  
distributed as follows:

leading id e.g.
10300:11:53,891 --> 00:11:56,155

33

hours set to 255 (these seem to all come from the same file and the  
minutes are evenly distributed between 0 and 46; maybe the hours were  
actually intended to be 00) e.g.
255:46:18,058 --> 255:46:25,191

671

hours in the first timestamp much greater than the second timestamp e.g.
244:00:13,320 --> 00:00:13,320

10

hours in the second timestamp much greater than the first timestamp e.g.
00:00:33,010 --> 415:54:55,400

3

leading zero (in first and/or second timestamp) e.g.
000:09:40,300 --> 00:09:45,519

150

other (garbage) e.g.
8247,711,7nsuacer :56:20,0071:15 -->ddar vid18

9
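
The categories above could be approximated automatically along the following lines. This is a rough heuristic sketch, not the method used to produce the counts above; the function name classify, the pattern, the category labels and the order of the checks are all assumptions chosen so that the example lines in this message land in the categories they are listed under.

```python
import re

# Assumed shape of a complete timing line; the hours field may have any
# number of digits. Anything that does not match is treated as garbage.
TIMING = re.compile(
    r"^\s*(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*"
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*$"
)


def classify(line):
    """Assign a timing line to one of the categories listed above
    (heuristic; the check order is an assumption)."""
    match = TIMING.match(line)
    if match is None:
        return "other"
    h1, h2 = match.group(1), match.group(5)
    if len(h1) < 3 and len(h2) < 3:
        return "ok"  # neither hours field has 3 or more digits
    if "255" in (h1, h2):
        return "hours 255"
    if any(len(h) >= 3 and h.startswith("0") for h in (h1, h2)):
        return "leading zero"
    # Guess that a cue id was fused onto the first timestamp when its
    # last two digits equal the two-digit hours of the second timestamp.
    if len(h1) >= 3 and len(h2) == 2 and h1[-2:] == h2:
        return "leading id"
    if int(h1) > int(h2):
        return "first much greater"
    if int(h2) > int(h1):
        return "second much greater"
    return "other"
```

For example, classify("10300:11:53,891 --> 00:11:56,155") returns "leading id", while the garbage example above returns "other". Real data would likely need manual review of the borderline cases.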

> Cheers,
>  -r
>
> [1]
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-September/033271.html


-- 
Simon Pieters
Opera Software
Received on Thursday, 6 October 2011 01:58:29 UTC