
Re: caption/subtitle discussion on ogg accessibility list

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 10 Dec 2008 03:52:52 +1100
Message-ID: <2c0e02830812090852h61aac096rad71331e426c490d@mail.gmail.com>
To: "Sean Hayes" <Sean.Hayes@microsoft.com>
Cc: "Geoff Freed" <geoff_freed@wgbh.org>, "public-tt@w3.org" <public-tt@w3.org>

On Wed, Dec 10, 2008 at 3:47 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
>> HTML5 already has a <video> element.
>
> HTML5 is far from finished drafting, let alone reaching market acceptance <grin/>

...which is being rolled out by Firefox and Opera at least. :-)


>>That was basically the principle that I proposed. Is now in discussion
>>at WHATWG, so we will see. :-)
>
> If it's a private contract approach rather than a baked in usage model, then it would have my endorsement; however, I do not speak for
> the DFXP group, or even Microsoft in that regard.

The decoding indeed depends on the capabilities of the media player,
so it is indeed a kind of "private contract".


>>However, this is a switch model *between* different DFXP files, so
>>completely in agreement with our previous discussions.
>
> It's still an *implied* switch, which I think is wrong as part of a language design regardless of level. As part of a private contract it would be fine.

The <video> element already has such an implied switch for the
different video codecs it can support. I don't see this as any
different.
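
As a rough sketch of that implied switch (an illustration only, not
anything specified by HTML5 or implemented in a browser): the player
walks the candidate resources in document order and takes the first one
whose MIME type it can decode, optionally filtered by language. The
names `Resource` and `pick_first_supported` are hypothetical, and
`video.mp4` is an invented second source; the text resources are the
ones from the proposal quoted further down.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Resource:
    """One candidate resource: a video source or a time-aligned text file."""
    src: str
    mime_type: str
    lang: Optional[str] = None


def pick_first_supported(
    candidates: Iterable[Resource],
    supported_types: Set[str],
    lang: Optional[str] = None,
) -> Optional[Resource]:
    """Return the first candidate the player can decode, or None.

    Hypothetical helper: candidates are checked in document order, and
    the choice depends only on what this particular player supports.
    """
    for candidate in candidates:
        if candidate.mime_type not in supported_types:
            continue  # this player has no decoder for the format
        if lang is not None and candidate.lang != lang:
            continue  # wrong language for the current user
        return candidate
    return None


# Video codec switch: a player that only decodes Ogg skips the first source.
videos = [
    Resource("video.mp4", "video/mp4"),  # invented for illustration
    Resource("video.ogv", "video/ogg"),
]
chosen_video = pick_first_supported(videos, {"video/ogg"})

# The same switch applied between separate text resources.
texts = [
    Resource("caption.srt", "text/x-srt", "en"),
    Resource("german.dfxp", "application/ttaf+xml", "de"),
    Resource("translation_webservice/fr/caption.srt", "text/x-srt", "fr"),
]
# A player that only understands srt finds French but no German text.
chosen_fr = pick_first_supported(texts, {"text/x-srt"}, lang="fr")
chosen_de = pick_first_supported(texts, {"text/x-srt"}, lang="de")
```

The switch happens between whole files and is decided by the player's
capabilities, which is why it behaves like a private contract rather
than a feature of any one text format.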

Cheers,
Silvia.



>>In no way shape or form did I mean to imply the creation of a new
>>interaction model. That would be just wrong, when SMIL already
>>satisfies that space perfectly. Sorry if that came across the wrong
>>way.
>
> Excellent to hear that, you did have me worried. I think it would be a mistake to try and serve too many purposes.
>
> Cheers,
>
> Sean Hayes
> Media Accessibility Strategist
> Accessibility Business Unit
> Microsoft
>
> Office:  +44 118 909 5867,
> Mobile: +44 7875 091385
>
>
> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: 09 December 2008 16:39
> To: Sean Hayes
> Cc: Geoff Freed; public-tt@w3.org
> Subject: Re: caption/subtitle discussion on ogg accessibility list
>
> Let me clarify some misunderstandings.
>
> On Wed, Dec 10, 2008 at 3:05 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
>> I think the idea of a single, or even a group of baseline codecs for HTML5 is not going to fly; and frankly myself I don't see what is wrong with the <object><param> model in HTML 4. More typically I think we are going to continue to be embedding a player codec in HTML using a plug in model, rather than a video object directly, and in that model the parameters might also include multiple video sources (including overlays for signing), playlists and multiple soundtracks.
>
> HTML5 already has a <video> element.
>
>
>> The params would be a private contract for each player type, but an example might be:
>>
>> <object type="application/x-ogg-player" lang="en" ... >
>>    <param name="video-en" type="video/ogg" valuetype="ref" value="video.en.ogg" />
>>    <param name="captions" type="application/ttaf+xml" valuetype="ref" value="caption.dfxp" />
>>    <param name="subtitle-en" type="application/ttaf+xml" valuetype="ref" value="subtitle.en.dfxp" />
>>    <param name="subtitle-jp" type="application/ttaf+xml" valuetype="ref" value="subtitle.jp.dfxp" />
>>    <param name="subtitle-fr" type="application/ttaf+xml" valuetype="ref" value="subtitle.fr.dfxp" />
>>    <param name="subtitle-de" type="application/ttaf+xml" valuetype="ref" value="subtitle.de.dfxp" />
>> </object>
>
> That was basically the principle that I proposed. Is now in discussion
> at WHATWG, so we will see. :-)
>
>
>> I've used dfxp exclusively here, but obviously ogg is free to substitute whatever you need, just as Silverlight, Flash, Quicktime, Realplayer etc do; and there is no need to build anything new into the HTML spec in order to do so.
>>
>> For reasons we have debated on this list, I think an implied switch using lang is a wrong model. Embedding text within a <video> tag
>> also seems wrong as that would imply to me that the <text> is an alternate to, rather than an adjunct to the video.
>
> I agree with an implied switch being the wrong model *inside* DFXP.
> However, this is a switch model *between* different DFXP files, so
> completely in agreement with our previous discussions.
>
>
>> As for an interaction model, I think that is leading you headlong into a clash with SMIL; which is not somewhere I think Ogg or HTML 5
>> should go; and is a need certainly not served by DFXP -- by design. If you do decide to go that route, I would recommend a cleanup of
>> the HTML+TIME spec based on SMIL3 might be a better starting point than DFXP.
>
> In no way shape or form did I mean to imply the creation of a new
> interaction model. That would be just wrong, when SMIL already
> satisfies that space perfectly. Sorry if that came across the wrong
> way.
>
>
> Best Regards,
> Silvia.
>
>>
>> Sean Hayes
>> Media Accessibility Strategist
>> Accessibility Business Unit
>> Microsoft
>>
>> Office:  +44 118 909 5867,
>> Mobile: +44 7875 091385
>>
>>
>> -----Original Message-----
>> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>> Sent: 09 December 2008 00:13
>> To: Sean Hayes; Geoff Freed
>> Cc: public-tt@w3.org
>> Subject: Re: caption/subtitle discussion on ogg accessibility list
>>
>> Let me clarify what is happening at Ogg in more detail.
>>
>> The discussions about Ogg and accessibility are motivated by the use
>> of Ogg Theora/Vorbis as a baseline codec in Mozilla/Firefox for HTML5
>> video tag support.
>>
>> Mozilla is investigating how to get support for subtitles and other
>> types of time-aligned text (such as speech bubbles, karaoke,
>> hyperlinked text annotations and the like) into the Web browser.
>>
>> It has been determined that there is a need for two approaches:
>>
>> 1) An out-of-band approach:
>> In HTML5, the video resource and the text resource would be linked
>> separately through the <video> tag. The links to an external text
>> resource would need to be accepted by the Web browser as a
>> time-aligned text format for a video and used on the fly. This can
>> look something like this:
>>
>> <video src="http://example.com/video.ogv" controls>
>>  <text category="CC" lang="en" type="text/x-srt" src="caption.srt"></text>
>>  <text category="SUB" lang="de" type="application/ttaf+xml" src="german.dfxp"></text>
>>  <text category="SUB" lang="ja" type="application/smil" src="japanese.smil"></text>
>>  <text category="SUB" lang="fr" type="text/x-srt" src="translation_webservice/fr/caption.srt"></text>
>> </video>
>>
>> NOTE that this is a proposal, unimplemented, and not yet discussed by
>> HTML5. But it is an idea we are toying with at Ogg accessibility.
>>
>> 2) An in-band approach:
>> The delivery of time-aligned text would be multiplexed together with
>> the video file inside the Ogg stream. This will then allow the Web
>> browser to extract the text upon decoding. It will not change anything
>> in the current version of the HTML5 video tag:
>>
>> <video src="http://example.com/video.ogv" controls>
>> </video>
>>
>> For this second case, we are discussing means of including
>> time-aligned text (or what we call "text codecs") into the Ogg
>> bitstream. Which is where Geoff's concerns come in.
>>
>> Currently, we have defined a generic mapping for any type of
>> time-aligned text into Ogg by defining OggText.
>> http://wiki.xiph.org/index.php/OggText
>> This generic mapping can in principle take DFXP or srt or CMML or kate
>> or SMIL or any other format. Mapping of a specific format requires
>> some further small specification on top of OggText.
>>
>> Currently we have started with the simplest mapping, which is OggSRT.
>> SRT and srt-like formats (like SUB) are simple in that they are plain
>> text and a time segment and most media players can deal with them.
>> Also, a large number of available subtitles and captions online are
>> being provided in these formats. Also, YouTube supports them, which
>> will further encourage people to provide more of these.
>>
>> To get a quick and effective result for Mozilla and their needs for
>> subtitles, srt is the most sensible choice.
>>
>> This does in no way shape or form inhibit DFXP from getting supported
>> inside Ogg. It's just simply not first implementation priority. Also,
>> I am under the impression that through the public-tt work, DFXP may
>> still see some changes in the near future and I am looking forward to
>> the final format, which will provide more powerful time-aligned text
>> capabilities to Web browsers. Most subtitle needs can be fulfilled
>> with srt, but there are other needs, which DFXP will satisfy.
>>
>> Just to mention this, too: there are further needs that we have
>> identified, that DFXP currently cannot satisfy IIUC - such as outgoing
>> hyperlinks for a piece of text, or regions that when you mouse-over
>> make another text region appear. I may be mistaken with these though
>> and would be curious to find out how such requirements could be
>> satisfied with DFXP.
>>
>> Best Regards,
>> Silvia.
>>
>>
>> On Tue, Dec 9, 2008 at 5:02 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
>>> I suspect the Ogg group will go their own way, and while it is disappointing
>>> they would not pick up dfxp directly I can understand their reasoning; and
>>> this is really not much different to any other proprietary codec.
>>>
>>>
>>>
>>> The primary point about DFXP for me is in its use as a clearing house
>>> between production and delivery, and as long as there is a dfxp<-->ogg
>>> translation I see no problem in them using whatever they want for end user
>>> delivery (although the subrip text format does seem overly basic, it's
>>> really not that different from 3gpp or 608.). I'd love the world to
>>> standardise on a single delivery format, but I'm realistic that that is not
>>> on the cards any time soon, it being too easy to just whip up another
>>> time+string format without really considering generality, users' needs, IP
>>> protection, internationalisation etc, etc.
>>>
>>>
>>>
>>> The way to get to the ideal point is to start at the production and b2b end.
>>> The key here is a common origination format, once that is established, and
>>> then when mainstream proprietary players consume and display it with full
>>> fidelity; then we can start to think about a one size fits all solution
>>> based on dfxp or some successor.
>>>
>>>
>>>
>>> Sean Hayes
>>> Media Accessibility Strategist
>>> Accessibility Business Unit
>>> Microsoft
>>>
>>>
>>>
>>> Office:  +44 118 909 5867,
>>>
>>> Mobile: +44 7875 091385
>>>
>>>
>>>
>>> From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
>>> Of Geoff Freed
>>> Sent: 08 December 2008 14:09
>>> To: public-tt@w3.org
>>> Subject: caption/subtitle discussion on ogg accessibility list
>>>
>>>
>>>
>>> there's a lengthy discussion about captions/subtitles going on at the ogg
>>> accessibility list.  archives are available at
>>> http://lists.xiph.org/pipermail/accessibility/, or you can sign up at
>>> http://lists.xiph.org/mailman/listinfo/accessibility and join in.
>>>
>>> there has been some debate over what text-display format to support
>>> initially, and the group seems headed toward support of SubRip (srt).  i've
>>> expressed concern that doing so might initially limit the usefulness of ogg
>>> captions/subtitles, and have lobbied for the inclusion of dfxp from the
>>> beginning, rather than waiting until after srt support has been established.
>>>  you can see my comments in the archives.
>>>
>>> g.
>>
>>
>
>
Received on Tuesday, 9 December 2008 16:53:32 GMT
