Re: Internet caption/subtitles ecosystem (was: [tt] Minutes of Timed Text TF teleconference, 11 July 2013)

On Thu, Jul 25, 2013 at 11:19 PM, Pierre-Anthony Lemieux
<pal@sandflow.com> wrote:
> Hi Silvia,
>
>> My paths are agnostic to where the subtitle files come from.
>
> I would think that the more we understand where/how the
> captions/subtitles were originally authored, the better we can
> understand how to best prepare them for Internet delivery. So the more
> information, the better, I would hazard.

I'm not sure it makes sense to go into the details of how captions are
authored - there are too many ways. The main point is that the gate
is the site that publishes the caption file with the video file
on a Web page. At that point, QA needs to be done, and that QA is
independent of where the file came from and how it was authored. The
file needs to be in sync with the video and it must not be faulty.
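
As a rough sketch of what such a QA gate could check at publish time,
assuming the caption file has already been parsed into (start, end,
text) tuples in seconds. The function name and the specific checks are
illustrative assumptions, not something any particular CMS does:

```python
def qa_problems(cues, video_duration):
    """Return a list of problems found; an empty list means the checks passed.

    cues: iterable of (start, end, text) tuples, times in seconds.
    video_duration: length of the published video in seconds.
    Hypothetical helper; the checks are assumptions about what "in sync"
    and "not faulty" might mean at a minimum.
    """
    problems = []
    previous_start = 0.0
    for index, (start, end, text) in enumerate(cues):
        # A cue must have a positive duration and a non-negative start.
        if start < 0 or end <= start:
            problems.append(f"cue {index}: invalid time range")
        # A cue running past the end of the video suggests a sync problem.
        if end > video_duration:
            problems.append(f"cue {index}: ends after the video")
        # Cues are expected in order of start time.
        if start < previous_start:
            problems.append(f"cue {index}: starts before the previous cue")
        if not text.strip():
            problems.append(f"cue {index}: empty text")
        previous_start = start
    return problems
```

Note that nothing in it depends on who supplied the file or how it was
authored; it only looks at the cues and the video duration.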


>> uploads a subtitle file for a video-identifier to CMS
>
> What about the case where caption/subtitle essence is ingested
> simultaneously with the audio and video essence, perhaps as a
> multiplex?

Right. We would need to extend the first path to also analyse whether
there is a caption track included and add that information to the
video upload & transcoding pipeline. I know from sites I have seen
that oftentimes they don't actually leave the caption track in the
video file, but rather extract it into a separate file in a format in
which the CMS manages its captions, because it is easier to manage
text files than multi-track audio/video files. So, there is likely a
need to add both possibilities.
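
To make the two possibilities concrete, here is a minimal sketch in
which both sources end up as separate text entries keyed by the
video-identifier. It assumes any embedded caption track has already
been extracted to text by the transcoding pipeline; the dict standing
in for the CMS and all names are hypothetical:

```python
def register_captions(cms, video_id, sidecar_files=(), embedded_tracks=()):
    """Store captions from both ingest paths as text entries for one video.

    cms: a plain dict standing in for the CMS (assumption for illustration).
    sidecar_files / embedded_tracks: iterables of (language, text) pairs.
    Extraction of embedded tracks is assumed to have happened upstream.
    """
    sources = (("sidecar", sidecar_files), ("embedded", embedded_tracks))
    for source, items in sources:
        for language, text in items:
            # Record where the captions came from, but store them the same way.
            cms.setdefault(video_id, []).append(
                {"language": language, "source": source, "text": text}
            )
    return cms.get(video_id, [])
```

After this step the rest of the path (conversion to the formats used by
the CMS, QA, publishing) would not need to care which route was taken.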

It's starting to get a bit more messy. ;-)

Cheers,
Silvia.


> On Thu, Jul 25, 2013 at 2:15 PM, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com> wrote:
>> My paths are agnostic to where the subtitle files come from. You can replace
>> "user" with any of content owner, subtitling company, professional
>> captioner, crowd sourced caption supplier or whatever other caption source
>> you can think of.
>>
>> They all have an ingest step and that step has to make sure that the
>> supplied file meets quality requirements. So you probably want to add a QA
>> step with the upload.
>>
>> Thanks,
>> Silvia.
>>
>> On 25 Jul 2013 19:20, "Pierre-Anthony Lemieux" <pal@sandflow.com> wrote:
>>>
>>> Hi Silvia,
>>>
>>> Thanks for the input.
>>>
>>> It looks like the use cases cover the case of user-supplied
>>> subtitles/captions. True? If so, what about subtitles/captions
>>> supplied by the content owner and/or ingested as part of the initial
>>> content upload?
>>>
>>> Thanks,
>>>
>>> Best,
>>>
>>> -- Pierre
>>>
>>> On Sat, Jul 13, 2013 at 9:31 AM, Silvia Pfeiffer
>>> <silviapfeiffer1@gmail.com> wrote:
>>> > Hi Pierre,
>>> >
>>> > I don't have time for the pretty drawings, but from what I can tell
>>> > e.g. from seeing YouTube and other online video publishers use
>>> > subtitles, it's not a difficult picture anyway.
>>> >
>>> > user
>>> >  |
>>> >     uploads video
>>> >  |
>>> >  v
>>> > video-identifier added to CMS
>>> >  |
>>> >    convert video file to all formats used by CMS
>>> >  |
>>> >  v
>>> > all [video-file, video-identifier] added to CMS
>>> >
>>> >
>>> > user (potentially the same or other)
>>> >  |
>>> >     uploads a subtitle file for a video-identifier to CMS
>>> >  |
>>> >  v
>>> > [subtitle-identifier, video-identifier] added to CMS
>>> >  |
>>> >     convert subtitle file to all formats used by CMS
>>> >  |
>>> >  v
>>> > all [subtitle-file, subtitle-identifier, video-identifier] added to CMS
>>> >
>>> >
>>> > user (video viewer)
>>> >  |
>>> >     views page with video & subtitle formats as supported by their
>>> >     player/browser
>>> >  |
>>> >     chooses particular subtitle track
>>> >  |
>>> >  v
>>> > watches video with subtitles
>>> >
>>> >
>>> > HTH.
>>> >
>>> > Cheers,
>>> > Silvia.
>>> >
>>> >
>>> > On Sat, Jul 13, 2013 at 1:38 PM, Pierre-Anthony Lemieux
>>> > <pal@sandflow.com> wrote:
>>> >> Hi Silvia,
>>> >>
>>> >> As suggested during the call, the objective is to document the
>>> >> caption/subtitling ecosystem, using these diagrams as well as others
>>> >> as starting point.
>>> >>
>>> >> Looking forward to your input and participation.
>>> >>
>>> >> Best,
>>> >>
>>> >> -- Pierre
>>> >>
>>> >> On Fri, Jul 12, 2013 at 7:47 PM, Silvia Pfeiffer
>>> >> <silviapfeiffer1@gmail.com> wrote:
>>> >>> Thanks. No surprises there (except that WebVTT is completely missing
>>> >>> from the pictures).
>>> >>> I guess somebody needs to create a picture for how it works with
>>> >>> native Web content through Web CMSes.
>>> >>>
>>> >>> Silvia.
>>> >>>
>>> >>> On Sat, Jul 13, 2013 at 6:13 AM, Pierre-Anthony Lemieux
>>> >>> <pal@sandflow.com> wrote:
>>> >>>> Hi Silvia,
>>> >>>>
>>> >>>> Below are two that come to mind:
>>> >>>>
>>> >>>> http://tech.ebu.ch/ebu-tt
>>> >>>> https://www.smpte.org/publications/wallcharts
>>> >>>>
>>> >>>> Best,
>>> >>>>
>>> >>>> -- Pierre
>>> >>>>
>>> >>>> On Thu, Jul 11, 2013 at 3:47 PM, Silvia Pfeiffer
>>> >>>> <silviapfeiffer1@gmail.com> wrote:
>>> >>>>> Sorry I wasn't able to join this morning. Family commitments.
>>> >>>>>
>>> >>>>> Someone mentioned a link to the process by which captions end up on
>>> >>>>> the web.
>>> >>>>> Could you share the link?
>>> >>>>>
>>> >>>>> Thanks!
>>> >>>>> Silvia.
>>> >>>>>
>>> >>>>> On 12 Jul 2013 07:57, "Daniel Davis" <ddavis@w3.org> wrote:
>>> >>>>>>
>>> >>>>>> Available at:
>>> >>>>>>  http://www.w3.org/2013/07/11-webtv-minutes.html
>>> >>>>>>
>>> >>>>>> Also as text below.
>>> >>>>>>
>>> >>>>>> Thank you to Kaz for the note-taking hand-holding and tidying up.
>>> >>>>>>
>>> >>>>>> Daniel
>>> >>>>>>
>>> >>>>>> ---
>>> >>>>>>
>>> >>>>>>    [1]W3C
>>> >>>>>>
>>> >>>>>>       [1] http://www.w3.org/
>>> >>>>>>
>>> >>>>>>                                - DRAFT -
>>> >>>>>>
>>> >>>>>>                    Web and TV IG - Timed Text TF Call
>>> >>>>>>
>>> >>>>>> 11 Jul 2013
>>> >>>>>>
>>> >>>>>>    [2]Agenda
>>> >>>>>>
>>> >>>>>>       [2] http://www.w3.org/2011/webtv/wiki/Tt/Timed_Text_Meeting_5
>>> >>>>>>
>>> >>>>>>    See also: [3]IRC log
>>> >>>>>>
>>> >>>>>>       [3] http://www.w3.org/2013/07/11-webtv-irc
>>> >>>>>>
>>> >>>>>> Attendees
>>> >>>>>>
>>> >>>>>>    Present
>>> >>>>>>           Kaz, Daniel, Graham, Pierre, Glenn, Mark, Mike
>>> >>>>>>           (Observer)
>>> >>>>>>
>>> >>>>>>    Regrets
>>> >>>>>>    Chair
>>> >>>>>>           Pierre
>>> >>>>>>
>>> >>>>>>    Scribe
>>> >>>>>>           ddavis
>>> >>>>>>
>>> >>>>>> Contents
>>> >>>>>>
>>> >>>>>>      * [4]Topics
>>> >>>>>>          1. [5]Daniel is joining
>>> >>>>>>          2. [6]Input to Timed Text WG charter revision process
>>> >>>>>>          3. [7]Improve the structure of Timed_Text_Efforts
>>> >>>>>>          4. [8]Ecosystem drawing
>>> >>>>>>          5. [9]Next call
>>> >>>>>>      * [10]Summary of Action Items
>>> >>>>>>      __________________________________________________________
>>> >>>>>>
>>> >>>>>>    <kaz> kaz: ok to invite Mike Dolan to this call?
>>> >>>>>>
>>> >>>>>>    <kaz> pal: ok
>>> >>>>>>
>>> >>>>>>    <kaz> (Mike's participation is approved)
>>> >>>>>>
>>> >>>>>> Daniel is joining
>>> >>>>>>
>>> >>>>>>    <kaz> ddavis: 50% Team Contact :)
>>> >>>>>>
>>> >>>>>>    <Mark_Vickers> Welcome, Daniel. Glad to have you with us!
>>> >>>>>>
>>> >>>>>>    Thank you!
>>> >>>>>>
>>> >>>>>> Input to Timed Text WG charter revision process
>>> >>>>>>
>>> >>>>>>    <pal>
>>> >>>>>>    [11]http://www.w3.org/2011/webtv/wiki/Tt/TTWG_Consensus_Input
>>> >>>>>>
>>> >>>>>>      [11] http://www.w3.org/2011/webtv/wiki/Tt/TTWG_Consensus_Input
>>> >>>>>>
>>> >>>>>>    <scribe> scribenick: ddavis
>>> >>>>>>
>>> >>>>>>    <pal> proposed additional text at
>>> >>>>>>    [12]http://www.w3.org/2011/webtv/wiki/Tt/Timed_Text_Meeting_5#Proposed_Agenda
>>> >>>>>>
>>> >>>>>>      [12] http://www.w3.org/2011/webtv/wiki/Tt/Timed_Text_Meeting_5#Proposed_Agenda
>>> >>>>>>
>>> >>>>>>    pal: I'm going to add this proposed text to the Consensus Input
>>> >>>>>>    now
>>> >>>>>>    ... The change is now added (as point 5)
>>> >>>>>>    ... Something that was not perhaps clear is that the Web & TV
>>> >>>>>>    IG was supporting the addition of WebVTT deliverables to the
>>> >>>>>>    TTWG deliverables
>>> >>>>>>    ... so maybe there should be a sentence added to that effect.
>>> >>>>>>    ... Showing that we support WebVTT as part of the deliverables
>>> >>>>>>    for the Timed Text WG.
>>> >>>>>>
>>> >>>>>>    <kaz> Mark_Vickers: good place to do that
>>> >>>>>>
>>> >>>>>>    pal: Again, I'm going to make this addition to the Consensus
>>> >>>>>>    Input now.
>>> >>>>>>    ... Has now been added to the paragraph before the numbered
>>> >>>>>>    items.
>>> >>>>>>    ... If there are no objections, I'll post this back to the
>>> >>>>>>    Interest Group reflector
>>> >>>>>>    ... and hopefully the IG can then forward this to W3C Team.
>>> >>>>>>    ... Thank you - I think this closes one of our work items.
>>> >>>>>>
>>> >>>>>>    <pal> next agenda topic:
>>> >>>>>>    [13]http://www.w3.org/2011/webtv/wiki/Tt/Timed_Text_Efforts
>>> >>>>>>
>>> >>>>>>      [13] http://www.w3.org/2011/webtv/wiki/Tt/Timed_Text_Efforts
>>> >>>>>>
>>> >>>>>>    <scribe> ACTION: pal to post the Consensus Input to the
>>> >>>>>>    Interest Group reflector [recorded in
>>> >>>>>>    [14]http://www.w3.org/2013/07/11-webtv-minutes.html#action01]
>>> >>>>>>
>>> >>>>>>    <trackbot> Created ACTION-121 - Post the Consensus Input to the
>>> >>>>>>    Interest Group reflector [on Pierre-Anthony Lemieux - due
>>> >>>>>>    2013-07-18].
>>> >>>>>>
>>> >>>>>> Improve the structure of Timed_Text_Efforts
>>> >>>>>>
>>> >>>>>>    pal: There has been the suggestion that this wiki page could be
>>> >>>>>>    better organised.
>>> >>>>>>
>>> >>>>>>    Mark_Vickers: I have some ideas and would be happy to work with
>>> >>>>>>    you offline
>>> >>>>>>    ... to update this page.
>>> >>>>>>
>>> >>>>>>    mike: It seems there's a tangle in this process.
>>> >>>>>>
>>> >>>>>>    pal: Only members can edit the wiki.
>>> >>>>>>
>>> >>>>>>    Mark_Vickers: I'll push to get that resolved so that I can edit
>>> >>>>>>    it.
>>> >>>>>>
>>> >>>>>>    mike: We have some private email with ideas that could really
>>> >>>>>>    improve this page.
>>> >>>>>>
>>> >>>>>>    pal: Thanks - I'm happy to help as well.
>>> >>>>>>
>>> >>>>>> Ecosystem drawing
>>> >>>>>>
>>> >>>>>>    pal: Next item is "ecosystem drawing"
>>> >>>>>>    ... How do all those captions out there end up on the web, etc?
>>> >>>>>>    ... I put up a link to an overview of the caption workflow that
>>> >>>>>>    EBU and others use.
>>> >>>>>>    ... It might be good for this group to create a similar
>>> >>>>>>    illustration of how timed text is created and ultimately put on
>>> >>>>>>    the web.
>>> >>>>>>    ... I'm spending time on it but if someone else has a strong
>>> >>>>>>    opinion on it, I'm happy.
>>> >>>>>>    ... I'd like to hear other opinions - is it needed? Is it
>>> >>>>>>    needed here? Is it needed urgently?
>>> >>>>>>
>>> >>>>>>    mark_vickers: I think it's useful to have to shed light on some
>>> >>>>>>    of the difficulties we're facing
>>> >>>>>>    ... but I think people in the TT WG specifically are probably
>>> >>>>>>    aware of this already.
>>> >>>>>>
>>> >>>>>>    pal: Any other comments about this?
>>> >>>>>>    ... No? OK. Any other new topics?
>>> >>>>>>
>>> >>>>>> Next call
>>> >>>>>>
>>> >>>>>>    pal: Hearing none and having gone through the agenda, let's
>>> >>>>>>    choose a date for our next meeting.
>>> >>>>>>    ... I think 2pm is the best compromise. How about August 1st?
>>> >>>>>>
>>> >>>>>>    <kaz> (that is August 2, 6am in Japan)
>>> >>>>>>
>>> >>>>>>    pal: This timeslot was made to try to accommodate everyone
>>> >>>>>>    including Silvia in Australia, so I think we should give this
>>> >>>>>>    timeslot another try.
>>> >>>>>>    ... so let's try August 1st, 2pm Los Angeles time.
>>> >>>>>>    ... Thank you everyone - look forward to speaking to you again in
>>> >>>>>>    2 weeks.
>>> >>>>>>
>>> >>>>>>    <kaz> [ adjourned ]
>>> >>>>>>
>>> >>>>>> Summary of Action Items
>>> >>>>>>
>>> >>>>>>    [NEW] ACTION: pal to post the Consensus Input to the Interest
>>> >>>>>>    Group reflector [recorded in
>>> >>>>>>    [15]http://www.w3.org/2013/07/11-webtv-minutes.html#action01]
>>> >>>>>>
>>> >>>>>>    [End of minutes]
>>> >>>>>>      __________________________________________________________
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>     Minutes formatted by David Booth's [16]scribe.perl version
>>> >>>>>>     1.138 ([17]CVS log)
>>> >>>>>>     $Date: 2013-07-11 21:46:40 $
>>> >>>>>>
>>> >>>>>>      [16] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
>>> >>>>>>      [17] http://dev.w3.org/cvsweb/2002/scribe/
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>

Received on Friday, 26 July 2013 02:08:23 UTC