Re: new issue? dfxp and language selection

Hi all,

I'd like to give some input on this topic and make sure we consider
the whole picture that we are dealing with here.

I am actually against changing the way in which DFXP is currently
specified - I think the current specification is the best way to deal
with multi-lingual data and I'd like to offer some reasons from
different perspectives for this.


1. First, let me give my view from a Web browser and video container
format point of view.

Generally, a video consists of multiple tracks of content - that may
be multiple audio and video tracks, but also multiple annotation
tracks that may be alternatives based on language or for some other
reason.

The obvious way of dealing with video in a Web browser is that it
receives a video stream with multiple tracks and has the possibility
to turn off tracks dynamically (based on user settings or user
interaction).

If we want to make sure that TimedText can be used as annotation
tracks inside videos, we need to make sure that we do not mix multiple
alternative annotations inside one TimedText track.

The easiest way of supporting this is to author multiple DFXP files
and have each of them provide input to an annotation track. This is
the way in which DFXP is currently specified, IIUC.

An alternative option is to have all the annotation tracks inside one
DFXP file, but the "encoder" (the program that encapsulates the
annotations inside the media file) splits up the annotations and
creates different tracks for each alternative. This would accommodate
the way in which ccPlayer currently works.

This latter way is much more complicated and error-prone. Also, a
"decoder" would not know whether it should extract all annotation
tracks into a single file or each into a separate one, since they may
have been added consecutively to the file.
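
To make the second option concrete, here is a rough Python sketch of
the splitting step such an "encoder" would need. It assumes that every
alternative is a body-level div carrying xml:lang and that divs without
xml:lang belong to every track. The name split_by_language and the
namespace-free sample markup are made up for illustration; they are
not taken from DFXP or from ccPlayer.

```python
import copy
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, namespace-free sample; real DFXP elements are namespaced.
SAMPLE = """<tt>
 <body>
  <div xml:lang='en'><p>Hello</p></div>
  <div xml:lang='ja'><p>Konnichiwa</p></div>
  <div xml:lang='fr'><p>Bonjour</p></div>
 </body>
</tt>"""


def split_by_language(document):
    """Return {language: serialized document} for body-level divs."""
    root = ET.fromstring(document)
    body = root.find("body")

    # Collect the languages in document order, without duplicates.
    langs = []
    for div in body.findall("div"):
        lang = div.get(XML_LANG)
        if lang is not None and lang not in langs:
            langs.append(lang)

    tracks = {}
    for lang in langs:
        # Copy the whole document, then drop the divs of other languages.
        # Assumption: divs without xml:lang are shared by all tracks.
        track_root = copy.deepcopy(root)
        track_body = track_root.find("body")
        for div in list(track_body.findall("div")):
            if div.get(XML_LANG) not in (None, lang):
                track_body.remove(div)
        tracks[lang] = ET.tostring(track_root, encoding="unicode")
    return tracks


tracks = split_by_language(SAMPLE)
for lang, text in tracks.items():
    print(lang, len(text))
```

Even this small sketch has to guess what to do with divs that carry no
xml:lang, which is the kind of ambiguity I would like to avoid.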


2. My opinion from a text format point of view.

Assume instead that we use DFXP only as a text file, for authoring
annotation tracks and for transmitting them over networks.

It is most certainly easier to only have one file that has all the
annotation tracks.

Let's look at the given specification:

<body>
 <div xml:lang='en'>..</div>
 <div xml:lang='ja'>..</div>
 <div xml:lang='fr'>..</div>
 ...
</body>

Is the idea to create body-level div elements that are alternatives to
each other simply by omitting start and end time specifiers on them?
What if these div tags had start and end time specifiers?

<body>
 <div xml:lang='en' begin='0s' end='5s'>..</div>
 <div xml:lang='ja' begin='5s' end='9s'>..</div>
 <div xml:lang='fr' begin='9s' end='15s'>..</div>
 ...
</body>

This is a totally different semantic - it specifies consecutive
annotations that are in different languages.
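
The difference can be spelled out with a small Python sketch. It models
each body-level div as a (language, begin, end) tuple, with None
standing for a missing time specifier, and asks which languages are
active at a given instant. The model and the name active_languages are
my own simplification and not the full DFXP timing rules.

```python
def active_languages(divs, t):
    """Return the languages of all divs that are active at time t (seconds)."""
    active = []
    for lang, begin, end in divs:
        # Assumption: a div without begin/end is active for the whole duration.
        started = begin is None or begin <= t
        not_ended = end is None or t < end
        if started and not_ended:
            active.append(lang)
    return active


# Body-level divs without time specifiers.
untimed = [("en", None, None), ("ja", None, None), ("fr", None, None)]
# The same divs with consecutive time intervals.
timed = [("en", 0, 5), ("ja", 5, 9), ("fr", 9, 15)]

print(active_languages(untimed, 6))  # ['en', 'ja', 'fr']
print(active_languages(timed, 6))    # ['ja']
```

In the untimed case all three languages are active at once, so nothing
in the markup itself says that they are alternatives.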

If we really wanted to specify a "select" statement, we would need a
different approach (I think that is what Dave Singer also proposed).
Note that I am not proposing such - I much prefer the idea of having
different files for alternative content.


3. A content selection viewpoint.

Let me go back to the Web Browser / Network use case. What we really
want for a video player in a Web Browser is a way to find out which
content tracks a media file on the server has, without receiving all
the data itself. Once this is known, the Web Browser can request only
those tracks that the user agent really wants to receive and thus
transfer less data over the network. This kind of content selection is
actually also being investigated in the Media Fragments Working Group
right now, e.g.
http://www.w3.org/2008/WebVideo/Fragments/wiki/Types_of_Fragment_Addressing
.

My point is that, across all our Web video work at the W3C, we may be
missing a feature for providing content information about a media file
over the network. This could be done with an approach similar to the
.rm files from RealNetworks, which just describe the actual content
but don't contain it. In Xiph, the recently developed ROE
specification works toward a similar goal. I'm sure other frameworks
have similar approaches, but there's nothing standard.
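
As a sketch of what I mean, in Python: the client first fetches a small
description of the tracks, picks the ones it wants, and only then
requests those. The description format, the field names, the track ids
and the function select_tracks are entirely hypothetical; this is
neither the ROE format nor any existing standard.

```python
# Hypothetical track description a server could return for one media file.
TRACKS = [
    {"id": "v1", "kind": "video", "lang": None},
    {"id": "a1", "kind": "audio", "lang": "en"},
    {"id": "c-en", "kind": "caption", "lang": "en"},
    {"id": "c-ja", "kind": "caption", "lang": "ja"},
    {"id": "c-fr", "kind": "caption", "lang": "fr"},
]


def select_tracks(tracks, caption_lang):
    """Return the ids of the tracks the client should request.

    Assumption: only caption tracks are alternatives by language;
    every other track is always requested.
    """
    selected = []
    for track in tracks:
        if track["kind"] == "caption" and track["lang"] != caption_lang:
            continue  # skip caption tracks in other languages
        selected.append(track["id"])
    return selected


print(select_tracks(TRACKS, "ja"))  # ['v1', 'a1', 'c-ja']
```

With one caption file per language, each entry in such a description
maps directly onto one file and no splitting is needed.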

We should consider if content selection is really a problem of this
Working Group or whether it needs to be attacked at a different level.


Just my thoughts on the topic - I'd like us to avoid taking a simple
solution now, which may restrict us in the long run. This decision
needs to be thought through well.

Best Regards,
Silvia.


On Thu, Dec 4, 2008 at 3:57 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
>
> In earlier discussions I believe we came to the conclusion that for multi lingual scenarios, it would be better to have separate files for each language. The xml:lang usage on elements was to clarify the use where one was momentarily switching languages, e.g. in a quotation, but where it was part of the same discourse.
>
> I think in fact the ccPlayer behaviour fails to adhere to the processing specified by section 9.3, which does not specify tree pruning based on language, and thus is not acting in accordance with the spec which would require simultaneous presentation of all three languages.
>
> We can certainly clarify this in the definition of the xml:lang attribute, but I believe we should track this as an implementation error by ccPlayer.
>
> Sean Hayes
> Media Accessibility Strategist
> Accessibility Business Unit
> Microsoft
>
> Office:  +44 118 909 5867,
> Mobile: +44 7875 091385
>
>
> -----Original Message-----
> From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf Of Philippe Le Hegaret
> Sent: 03 December 2008 15:54
> To: public-tt@w3.org
> Subject: new issue? dfxp and language selection
>
>
> I noticed that the ccPlayer is able to handle multiple languages in the
> same document:
>
> <body>
>  <div xml:lang='en'>..</div>
>  <div xml:lang='ja'>..</div>
>  <div xml:lang='fr'>..</div>
>  ...
> </body>
>
> You can then select which language to display using the interface.
>
> It's allowed by the specification but nothing there says that you can
> display only one language.
>
> Do we need to say to say anything in the spec about such usage?
>
> Philippe
>
>
>
>
>
>

Received on Wednesday, 3 December 2008 23:32:37 UTC