
Re: caption/subtitle discussion on ogg accessibility list

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 10 Dec 2008 03:38:32 +1100
Message-ID: <2c0e02830812090838q707b33efp502e06c90ebf3046@mail.gmail.com>
To: "Sean Hayes" <Sean.Hayes@microsoft.com>
Cc: "Geoff Freed" <geoff_freed@wgbh.org>, "public-tt@w3.org" <public-tt@w3.org>

Let me clarify some misunderstandings.

On Wed, Dec 10, 2008 at 3:05 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> I think the idea of a single, or even a group of baseline codecs for HTML5 is not going to fly; and frankly myself I don't see what is wrong with the <object><param> model in HTML 4. More typically I think we are going to continue to be embedding a player codec in HTML using a plug in model, rather than a video object directly, and in that model the parameters might also include multiple video sources (including overlays for signing), playlists and multiple soundtracks.

HTML5 already has a <video> element.

> The params would be a private contract for each player type, but an example might be:
> <object type="application/x-ogg-player" lang="en" ... >
>    <param name="video-en" type="video/ogg" valuetype="ref" value="video.en.ogg" />
>    <param name="captions" type="application/ttaf+xml" valuetype="ref" value="caption.dfxp" />
>    <param name="subtitle-en" type="application/ttaf+xml" valuetype="ref" value="subtitle.en.dfxp" />
>    <param name="subtitle-jp" type="application/ttaf+xml" valuetype="ref" value="subtitle.jp.dfxp" />
>    <param name="subtitle-fr" type="application/ttaf+xml" valuetype="ref" value="subtitle.fr.dfxp" />
>    <param name="subtitle-de" type="application/ttaf+xml" valuetype="ref" value="subtitle.de.dfxp" />
> </object>

That was basically the principle that I proposed. It is now under
discussion at WHATWG, so we will see. :-)

> I've used dfxp exclusively here, but obviously ogg is free to substitute whatever you need, just as Silverlight, Flash, Quicktime, Realplayer etc do; and there is no need to build anything new into the HTML spec in order to do so.
> For reasons we have debated on this list, I think an implied switch using lang is a wrong model. Embedding text within a <video> tag
> also seems wrong as that would imply to me that the <text> is an alternate to, rather than an adjunct to the video.

I agree with an implied switch being the wrong model *inside* DFXP.
However, this is a switch model *between* different DFXP files, so
completely in agreement with our previous discussions.
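
To make the distinction concrete, here is a minimal sketch of a switch
*between* separate text resources, as opposed to a switch inside one DFXP
document. Everything in it is an assumption for illustration only: the
TextTrack structure, the select_track function and the category values
are hypothetical names, not part of any HTML5, DFXP or Ogg specification.
The file names are the ones from the <object> example above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TextTrack:
    """One external time-aligned text resource (hypothetical structure)."""
    category: str   # e.g. "CC" for captions, "SUB" for subtitles
    lang: str       # language tag as written in the markup, e.g. "en"
    mime_type: str  # e.g. "application/ttaf+xml"
    src: str        # location of the text resource


def select_track(tracks: Sequence[TextTrack], category: str,
                 preferred_langs: Sequence[str]) -> Optional[TextTrack]:
    """Return the first track of the given category that matches the
    user's language preferences, tried in preference order.
    Returns None when no track matches."""
    for lang in preferred_langs:
        for track in tracks:
            if track.category == category and track.lang.lower() == lang.lower():
                return track
    return None


# Each entry is a complete, separate file; the player picks one file.
TRACKS = [
    TextTrack("CC", "en", "application/ttaf+xml", "caption.dfxp"),
    TextTrack("SUB", "en", "application/ttaf+xml", "subtitle.en.dfxp"),
    TextTrack("SUB", "fr", "application/ttaf+xml", "subtitle.fr.dfxp"),
    TextTrack("SUB", "de", "application/ttaf+xml", "subtitle.de.dfxp"),
]

chosen = select_track(TRACKS, "SUB", ["de", "en"])
print(chosen.src if chosen else "no matching track")
```

The selection happens entirely outside the text documents, so each DFXP
file stays a plain single-language document.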

> As for an interaction model, I think that is leading you headlong into a clash with SMIL; which is not somewhere I think Ogg or HTML 5
> should go; and is a need certainly not served by DFXP -- by design. If you do decide to go that route, I would recommend a cleanup of
> the HTML+TIME spec based on SMIL3 might be a better starting point than DFXP.

In no way, shape or form did I mean to imply the creation of a new
interaction model. That would be just wrong, when SMIL already
satisfies that space perfectly. Sorry if that came across the wrong way.

Best Regards,

> Sean Hayes
> Media Accessibility Strategist
> Accessibility Business Unit
> Microsoft
> Office:  +44 118 909 5867,
> Mobile: +44 7875 091385
> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: 09 December 2008 00:13
> To: Sean Hayes; Geoff Freed
> Cc: public-tt@w3.org
> Subject: Re: caption/subtitle discussion on ogg accessibility list
> Let me clarify what is happening at Ogg in more detail.
> The discussions about Ogg and accessibility are motivated by the use
> of Ogg Theora/Vorbis as a baseline codec in Mozilla/Firefox for HTML5
> video tag support.
> Mozilla is investigating how to get support for subtitles and other
> types of time-aligned text (such as speech bubbles, karaoke,
> hyperlinked text annotations and the like) into the Web browser.
> It has been determined that there is a need for two approaches:
> 1) An out-of-band approach:
> In HTML5, the video resource and the text resource would be linked
> separately through the <video> tag. The links to an external text
> resource would need to be accepted by the Web browser as a
> time-aligned text format for a video and used on the fly. This can
> look something like this:
> <video src="http://example.com/video.ogv" controls>
>  <text category="CC" lang="en" type="text/x-srt" src="caption.srt"></text>
>  <text category="SUB" lang="de" type="application/ttaf+xml"
> src="german.dfxp"></text>
>  <text category="SUB" lang="jp" type="application/smil"
> src="japanese.smil"></text>
>  <text category="SUB" lang="fr" type="text/x-srt"
> src="translation_webservice/fr/caption.srt"></text>
> </video>
> NOTE that this is a proposal, unimplemented, and not yet discussed by
> HTML5. But it is an idea we are toying with at Ogg accessibility.
> 2) An in-band approach:
> The delivery of time-aligned text would be multiplexed together with
> the video file inside the Ogg stream. This will then allow the Web
> browser to extract the text upon decoding. It will not change anything
> in the current version of the HTML5 video tag:
> <video src="http://example.com/video.ogv" controls>
> </video>
> For this second case, we are discussing means of including
> time-aligned text (or what we call "text codecs") into the Ogg
> bitstream. Which is where Geoff's concerns come in.
> Currently, we have defined a generic mapping for any type of
> time-aligned text into Ogg by defining OggText.
> http://wiki.xiph.org/index.php/OggText
> This generic mapping can in principle take DFXP or srt or CMML or kate
> or SMIL or any other format. Mapping of a specific format requires
> some further small specification on top of OggText.
> Currently we have started with the simplest mapping, which is OggSRT.
> SRT and srt-like formats (like SUB) are simple in that each entry is
> just plain text plus a time segment, and most media players can deal
> with them. A large number of the subtitles and captions available online
> are provided in these formats, and YouTube supports them, which
> will further encourage people to provide more of these.
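
For readers unfamiliar with the format, here is a minimal sketch of
reading plain SubRip text into cues (start time, end time, text). It
assumes the common layout of an index line, a time line of the form
HH:MM:SS,mmm --> HH:MM:SS,mmm, one or more text lines, and a blank line
between entries. The names Cue and parse_srt are hypothetical, and this
illustrates standalone srt files only, not the OggSRT mapping.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start_ms: int  # start of the time segment, in milliseconds
    end_ms: int    # end of the time segment, in milliseconds
    text: str      # plain caption text, possibly spanning several lines


# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (a "." separator is tolerated).
TIME_LINE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert hour, minute, second and millisecond fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(data: str) -> List[Cue]:
    """Parse SubRip text into a list of cues, skipping malformed entries."""
    cues = []
    # Entries are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIME_LINE.search(line)
            if match:
                fields = match.groups()
                text = "\n".join(lines[i + 1:])
                cues.append(Cue(_to_ms(*fields[:4]), _to_ms(*fields[4:]), text))
                break  # everything after the time line is caption text
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Hello, world.

2
00:00:05,500 --> 00:00:07,250
Second cue,
on two lines.
"""

for cue in parse_srt(SAMPLE):
    print(cue.start_ms, cue.end_ms, repr(cue.text))
```

A cue carries nothing beyond a time segment and plain text, which is why
the format is easy to support but cannot express layout or styling.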
> To get a quick and effective result for Mozilla and their needs for
> subtitles, srt is the most sensible choice.
> This does not in any way, shape or form prevent DFXP from being supported
> inside Ogg. It is simply not the first implementation priority. Also,
> I am under the impression that through the public-tt work, DFXP may
> still see some changes in the near future and I am looking forward to
> the final format, which will provide more powerful time-aligned text
> capabilities to Web browsers. Most subtitle needs can be fulfilled
> with srt, but there are other needs, which DFXP will satisfy.
> Just to mention this, too: there are further needs that we have
> identified, that DFXP currently cannot satisfy IIUC - such as outgoing
> hyperlinks for a piece of text, or regions that when you mouse-over
> make another text region appear. I may be mistaken with these though
> and would be curious to find out how such requirements could be
> satisfied with DFXP.
> Best Regards,
> Silvia.
> On Tue, Dec 9, 2008 at 5:02 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
>> I suspect the Ogg group will go their own way, and while it is disappointing
>> they would not pick up dfxp directly I can understand their reasoning; and
>> this is really not much different to any other proprietary codec.
>> The primary point about DFXP for me is in its use as a clearing house
>> between production and delivery, and as long as there is a dfxp<-->ogg
>> translation I see no problem in them using whatever they want for end user
>> delivery (although the subrip text format does seem overly basic, it's
>> really not that different from 3gpp or 608.). I'd love the world to
>> standardise on a single delivery format, but I'm realistic that that is not
>> on the cards any time soon, it being too easy to just whip up another
>> time+string format without really considering generality, users' needs, IP
>> protection, internationalisation etc, etc.
>> The way to get to the ideal point is to start at the production and b2b end.
>> The key here is a common origination format; once that is established, and
>> once mainstream proprietary players consume and display it with full
>> fidelity, then we can start to think about a one size fits all solution
>> based on dfxp or some successor.
>> Sean Hayes
>> Media Accessibility Strategist
>> Accessibility Business Unit
>> Microsoft
>> Office:  +44 118 909 5867,
>> Mobile: +44 7875 091385
>> From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
>> Of Geoff Freed
>> Sent: 08 December 2008 14:09
>> To: public-tt@w3.org
>> Subject: caption/subtitle discussion on ogg accessibility list
>> there's a lengthy discussion about captions/subtitles going on at the ogg
>> accessibility list.  archives are available at
>> http://lists.xiph.org/pipermail/accessibility/, or you can sign up at
>> http://lists.xiph.org/mailman/listinfo/accessibility and join in.
>> there has been some debate over what text-display format to support
>> initially, and the group seems headed toward support of SubRip (srt).  i've
>> expressed concern that doing so might initially limit the usefulness of ogg
>> captions/subtitles, and have lobbied for the inclusion of dfxp from the
>> beginning, rather than waiting until after srt support has been established.
>>  you can see my comments in the archives.
>> g.
Received on Tuesday, 9 December 2008 16:39:16 UTC
