Removing cues with duplicate ids as a way to allow cue rewriting

I've proposed this in the past, but I think I've narrowed it down to 
specific changes, and I'm wondering if other people think this would be 
a useful way to allow rewriting of cues.

*Why*

In streaming text tracks, we need a way to fix incorrect cues. Some 
examples:

  * In live TV, people type the captions in by hand
    <http://en.wikipedia.org/wiki/Closed_captioning#Television_and_video> shortly
    before you see them. If they make a mistake, we need a way to fix it.
  * CEA-608 and CEA-708 captions don't start with a convenient startTime
    --> endTime block like WebVTT does. A caption ends when we get a
    command that makes it stop displaying. If we want to transcode to
    WebVTT in real-time, we have to either wait until the caption is
    over to translate it (delaying the stream by some arbitrary time in
    the hope that it will be long enough), or we need to start a caption
    immediately with a guess of the end time and then rewrite it once we
    know the correct end time (or rewrite it to extend the end time
    until we find the correct one).
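
To make the second case concrete, here is a rough sketch of what a live transcoder might emit. Everything in it is hypothetical (the LiveCaptionWriter class, the cue-N id scheme, the 5 second guess); it only illustrates sending a cue right away with a guessed end time and then re-sending it under the same id.

```javascript
// Format seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function formatTimestamp(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`;
}

// Serialize one cue block: id line, timing line, payload, blank line.
function formatCue(id, start, end, text) {
  return `${id}\n${formatTimestamp(start)} --> ${formatTimestamp(end)}\n${text}\n\n`;
}

// Hypothetical transcoder state: at most one caption is open at a time.
class LiveCaptionWriter {
  constructor(emit, guessSeconds = 5) {
    this.emit = emit;                 // callback that receives WebVTT text
    this.guessSeconds = guessSeconds; // assumed guess for the end time
    this.nextId = 0;
    this.open = null;                 // { id, start, text } while a caption shows
  }
  // A caption appears: send it now with a guessed end time.
  begin(now, text) {
    this.open = { id: `cue-${this.nextId++}`, start: now, text };
    this.emit(formatCue(this.open.id, now, now + this.guessSeconds, text));
  }
  // The guess is about to expire: re-send with the same id and a later end.
  extend(now) {
    const c = this.open;
    this.emit(formatCue(c.id, c.start, now + this.guessSeconds, c.text));
  }
  // The end-of-caption command arrived: re-send with the real end time.
  end(now) {
    const c = this.open;
    this.emit(formatCue(c.id, c.start, now, c.text));
    this.open = null;
  }
}
```

Each call emits a complete cue, so a receiver that drops the earlier cue with the same id always ends up with the latest timing.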

*How*

The solution I'm proposing is that if we see two cues with the same id, 
the earlier cue will be removed.

    some-id
    00:00:00.000 --> 00:00:30.000
    This is an xeample

    some-id
    00:00:00.000 --> 00:00:10.000
    This is an example

In this example, the text "This is an example" will be displayed for 10 
seconds starting at time 0.
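
The rule can be sketched in a few lines. This is not a real WebVTT parser: parseCues is a hypothetical helper that assumes every block is exactly "id, timing line, text" with no file header and no cue settings, and it exists only to show the later cue replacing the earlier one.

```javascript
// Parse "id / timing / text" blocks and keep only the latest cue per id.
// Simplifying assumption: every cue has an id and a well-formed timing line.
function parseCues(vttBody) {
  const cues = new Map(); // id -> cue, in order of most recent insertion
  for (const block of vttBody.trim().split(/\n\s*\n/)) {
    const [id, timing, ...textLines] = block.split("\n").map((l) => l.trim());
    const [start, end] = timing.split("-->").map((t) => t.trim());
    cues.delete(id); // drop the earlier cue with this id, if there is one
    cues.set(id, { id, start, end, text: textLines.join("\n") });
  }
  return [...cues.values()];
}

const cues = parseCues([
  "some-id",
  "00:00:00.000 --> 00:00:30.000",
  "This is an xeample",
  "",
  "some-id",
  "00:00:00.000 --> 00:00:10.000",
  "This is an example",
].join("\n"));
```

Here cues holds a single entry: the corrected text with the 10 second end time.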

*Why This Solution*

This solution is nice because the syntax is simple and easy to 
understand, and it's powerful enough to rewrite any cue in any way you 
could possibly want, because the new cue completely replaces the old one.

*Arguments against*

This isn't particularly efficient. If you just want to change the time, 
you need to send the entire updated cue, instead of just the change.

I don't think this is a big deal, because even the most heavily edited 
subtitle file will be orders of magnitude smaller than the accompanying 
video.

*Specifically...*

There are a couple of ways of doing this in HTML:

 1. Do this at the WebVTT-layer: If the WebVTT parser sees a cue with
    the same id as an older cue, it explicitly removes the older cue
    from the track and then adds the new cue.
 2. Do this at the HTML layer: If any TextTrack gets a cue with the same
    id as a cue it already has, it removes the old cue before adding the
    new one.

I think doing this at the HTML layer makes sense, because:

  * ids should be unique anyway. That's why they're ids.
  * TextTrack.getCueById() doesn't make sense if ids aren't unique.
  * The implementation is much easier if we can throw a hash table in
    TextTrack and use it for detecting duplicate ids, and for making
    getCueById() fast.
  * Handling weird edge cases is simpler:
      o If JavaScript adds a cue with the same id as an existing cue,
        the existing cue is removed.
      o If the UA adds a cue with the same id as a cue added by
        JavaScript, the cue added by JavaScript is removed (in this
        case, presumably whatever trigger caused JavaScript to add that
        cue will be triggered again).
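
As a rough model of the HTML-layer option, here is what the hash table approach could look like. SketchTextTrack is a hypothetical stand-in, not the real TextTrack interface: it keeps cues in insertion order rather than text track cue order, and I'm assuming cues with an empty id are never treated as duplicates.

```javascript
// Hypothetical stand-in for TextTrack; not the real DOM interface.
class SketchTextTrack {
  constructor() {
    this.cues = [];        // cues in insertion order (a simplification)
    this.byId = new Map(); // id -> cue, for duplicate detection and lookup
  }
  // Add a cue, first removing any existing cue with the same non-empty id.
  addCue(cue) {
    if (cue.id !== "") {
      const old = this.byId.get(cue.id);
      if (old !== undefined) this.removeCue(old);
      this.byId.set(cue.id, cue);
    }
    this.cues.push(cue);
  }
  // Remove a cue from both the list and the id table.
  removeCue(cue) {
    const i = this.cues.indexOf(cue);
    if (i === -1) return;
    this.cues.splice(i, 1);
    if (this.byId.get(cue.id) === cue) this.byId.delete(cue.id);
  }
  // Constant-time lookup, unambiguous because ids are unique in the table.
  getCueById(id) {
    return this.byId.get(id) || null;
  }
}
```

It doesn't matter whether the UA or JavaScript calls addCue(); both go through the same path, which is why the edge cases above come out the same way.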

I think if we do this, it should *also* be added to the WebVTT spec, so 
files that do this will render properly in non-HTML media players.

*More specifically...*

https://github.com/w3c/html/pull/20

*Conclusion*

Does this seem like a reasonable change to you?




Received on Thursday, 22 May 2014 23:18:43 UTC