RE: Timed Text Authoring Format - Distribution Format Exchange Profile (DFXP) Streaming

From: <Johnb@screen.subtitling.com>
Date: Tue, 29 Mar 2005 17:57:39 +0100
Message-ID: <11E58A66B922D511AFB600A0244A722EE57DB3@NTMAIL>
To: shayes@microsoft.com
Cc: gadams@xfsi.com, public-tt@w3.org
Sean,
 
It's not that I didn't read it.....I interpreted the spec incorrectly. When
I saw Meta.class in the XML representation I interpreted that as meaning
that the element could take attributes from the metadata attribute
vocabulary.
 
Having just re-read the spec, I am now even more unsure why you can both
include attributes from the TT:Metadata namespace within most content
elements and include multiple meta elements. Would it not be clearer to
allow metadata only in meta elements, or only as attributes within
elements, but NOT both?
 
Actually I'd suggest that the spec may be clear to the authors - but perhaps
not so clear to the rest of us mortals :-)
 
Sometimes you need to state things in 'real world' terms - and in the right
places.
 
Remember - most implementors will not be schema gurus - or even XML
lawyers.......
 
I have the benefit of having been involved in some of the discussions, and
of having an idea of some of the ambitions of the WG.
BUT most of the implementors will give the spec a cursory glance and then
implement based on any sample file using DFXP they can find....
You'll be lucky if they even look at the XSD IMO.
 
IMO If some part of the XSD is qualified by normative text outside of that
XSD - there should **at least** be a comment within the XSD to that effect.
 
John
 
 
 -----Original Message-----
From: Sean Hayes [mailto:shayes@microsoft.com]
Sent: 29 March 2005 17:03
To: Glenn A. Adams; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming



Yes I understand that the spec is clear if you read it. My fear is that in
the real world, people aren't going to read the small print (John has
already demonstrated this :-) and really understand TT.  They are going to
load the XSD into XML Spy or some such tool and go generating thousands of
document instances. If they are an influential player, and produce enough
content then they become the de-facto standard. And we end up with another
HTML situation.

 

'A stitch in time' as my Mother would say...

 


  _____  


From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
Of Glenn A. Adams
Sent: 29 March 2005 07:54
To: Sean Hayes; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

I don't see any reason to make the XSD (or RNC) schemas informative. Both
are normative in the sense that we consider them formally defined and
formally part of the specification. However, neither is normative in the
sense of being the benchmark for validation. I think we have clearly defined
validity in section 3, independently of any particular schema, which was our
intent.

 


  _____  


From: Sean Hayes [mailto:shayes@microsoft.com] 
Sent: Tuesday, March 29, 2005 10:51 AM
To: Glenn A. Adams; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Yes I read this. This text is insufficient IMO. If we don't remove the XSD,
then we should at least

 a) Make it informative

 b) Put in 48 point red bold text that the XSD schema is known not to adhere
to the normative requirements of DFXP (but we included it anyway) and that
content exchange mechanisms are required to do additional work over and
above just XSD processing if they choose to use schema for validation.

 

Sean.


  _____  


From: Glenn A. Adams [mailto:gadams@xfsi.com] 
Sent: 29 March 2005 07:43
To: Sean Hayes; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

I don't think it will be practical to remove the XSD schema from DFXP.
Rather, we simply need to qualify the differences regarding validation. Note
that the compliance clause of DFXP is not based upon using any form of
schema validation, so it does not affect compliance. Note also that we have
the following language under the header of Annex C:

 

"In any case where a schema specified by this appendix differs from the
normative definitions of document type, element type, or attribute type as
defined by the body of this specification, then the body of this
specification takes precedence."

 


  _____  


From: Sean Hayes [mailto:shayes@microsoft.com] 
Sent: Tuesday, March 29, 2005 10:39 AM
To: Sean Hayes; Glenn A. Adams; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Furthermore, since the XSD in the draft contains an incorrect model, it
should be removed. 

 

I did think about putting warning language in, but it would get ignored 'for
convenience', and since XSD processing is more prevalent than RNG processing
right now, I'm sure we would end up with incorrect content floating around.

 


  _____  


From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
Of Sean Hayes
Sent: 29 March 2005 07:28
To: Glenn A. Adams; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

OK, if it's legal XML, and we are going to keep it, and it can't be expressed
in W3C schema, then I propose we make a TTWG input to this effect to the
upcoming W3C Schema users group meeting in June.

 


  _____  


From: Glenn A. Adams [mailto:gadams@xfsi.com] 
Sent: 29 March 2005 07:25
To: Sean Hayes; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Yes it is legal in XML; however, neither DTD nor XML Schema supports
expression of this constraint. On the other hand, RNG does (and other schema
languages do). In our present case, the normative definition of content
models for compliance testing is based upon the XML Representation
specifications in the body of the specification, and not upon any specific
schema (or schema language).

 


  _____  


From: Sean Hayes [mailto:shayes@microsoft.com] 
Sent: Tuesday, March 29, 2005 10:16 AM
To: Johnb@screen.subtitling.com; Glenn A. Adams
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Actually you can have multiple meta child elements; however, this does bring
up an issue I have been meaning to raise. The content model for <p> is
Meta.class*, Animation.class*, (#PCDATA|span|br)*

 

However I'm not sure it is legal in XML to restrict PCDATA to occur only
after a certain list of elements. It is not possible to express this in W3C
schema in any case.

 

We might want to consider relaxing the <meta> comes first rule.
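
To make the "<meta> comes first" rule concrete, here is a minimal sketch of the extra check a tool would need on top of XSD validation. It uses Python's standard xml.etree.ElementTree. The function name meta_comes_first is hypothetical, "set" is assumed to be the only Animation.class element, and whitespace-only text between the leading elements is treated as insignificant; none of these choices comes from the spec.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/2004/11/ttaf1"  # namespace used elsewhere in this thread

# Assumption: local names that make up Meta.class and Animation.class.
META = {"meta"}
ANIMATION = {"set"}


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def meta_comes_first(p):
    """Check Meta.class*, Animation.class*, (#PCDATA|span|br)* on one <p>."""
    phase = 0  # 0 = meta, 1 = animation, 2 = mixed content
    text_seen = bool((p.text or "").strip())
    for child in p:
        name = local(child.tag)
        kind = 0 if name in META else 1 if name in ANIMATION else 2
        # A leading element may not follow text or return to an earlier phase.
        if kind < 2 and (text_seen or kind < phase):
            return False
        phase = max(phase, kind)
        if (child.tail or "").strip():
            text_seen = True
    return True


good = ET.fromstring(f'<p xmlns="{TT_NS}"><meta/>Hello <span>world</span></p>')
bad = ET.fromstring(f'<p xmlns="{TT_NS}">Hello <meta/></p>')
print(meta_comes_first(good), meta_comes_first(bad))  # True False
```

A check of this kind would have to run after schema validation, since the ordering of character data relative to the leading elements is exactly the part the XSD cannot state.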

 

Sean


  _____  


From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
Of Johnb@screen.subtitling.com
Sent: 29 March 2005 07:23
To: gadams@xfsi.com; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Glenn,

 

I hadn't caught that one :-) (Meta data at all levels). Does this mean that
you can put a meta child element under any other element? Presumably
restricted to a single child instance?

 

In which case, my remaining concern about creating multiple language DFXP
files is that there is insufficient headroom given the non nesting of div to
cater for the structure I anticipate needing. Why does div not nest?

 

regards John Birch.

-----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 29 March 2005 15:53
To: Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

Since we don't (and won't) define a DFXP UA, it is up to whoever defines a
UA to determine whether user specified style sheets or transforms may apply.
In general, I don't see why they should not.

 

I'm not sure what you mean by "separate metadata for each language". You can
express whatever metadata you want at whatever granularity you wish (since
every content element can take meta children which can contain arbitrary
metadata constructs).

 


  _____  


From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Tuesday, March 29, 2005 10:05 AM
To: shayes@microsoft.com; Glenn A. Adams
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Sean,

 

Hi...

 

This approach is one I am considering for conditional content ....
'watershed words'

 

I don't favour it for language selection because it doesn't address the
issue of having separate metadata for each language. I view that as
important since there may be rights issues (e.g. copyright and distribution)
that are on a **per language basis**.

 

For conditional content this works quite well, as it is trivial (in concept)
to modify the style definitions.... so taking your example and twisting
slightly gives.... (note: it's set for after 8:00pm :-)

 

<styling>

    <style id="before8pm" tts:display="none" />

    <style id="after8pm" tts:display="auto" />

</styling>

...

<div>

    <p>So I told him to <span style="before8pm">"Go away!"</span><span
style="after8pm">"Piss Off!"</span></p>

</div>

 

Note - I anticipate in **most** cases conditional content will be ...
inline...

 

Of course - this solution works if we anticipate an interpretation of the
DFXP pre-delivery. It does not work for DFXP as a delivery format UNLESS it
is assumed that a UA can apply a user defined stylesheet to a DFXP document
or otherwise modify the DFXP document prior to display (Comments Glenn?)

 

regards 

John Birch

 

 -----Original Message-----
From: Sean Hayes [mailto:shayes@microsoft.com]
Sent: 29 March 2005 15:25
To: Johnb@screen.subtitling.com; gadams@xfsi.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

How about an approach like the following:

 

<styling>

    <style id="lang1" tts:display="none" />

    <style id="lang2" tts:display="auto" />

    <style id="lang3" tts:display="none" />

</styling>

...

<div>

    <p style="lang1">Bonjour</p>

    <p style="lang2">Ola</p>

    <p style="lang3">Hello</p>

</div>

 

Here, you only have to change the display property in the selected language
and you get the bits you need. The same approach should work for watershed
words, etc.
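
As a rough illustration of "changing the display property", the sketch below flips tts:display on a set of switch styles using Python's standard xml.etree.ElementTree. The function name select_style and the switch_ids parameter are hypothetical, and the styling namespace URI is an assumption about the draft under discussion, not something stated in this thread.

```python
import xml.etree.ElementTree as ET

# Assumption: styling namespace URI of the draft under discussion.
TTS_NS = "http://www.w3.org/2004/11/ttaf1#style"
DISPLAY = f"{{{TTS_NS}}}display"

DOC = f"""<tt xmlns:tts="{TTS_NS}">
  <styling>
    <style id="lang1" tts:display="none"/>
    <style id="lang2" tts:display="auto"/>
    <style id="lang3" tts:display="none"/>
  </styling>
</tt>"""


def select_style(root, wanted_id, switch_ids):
    """Show the wanted switch style and hide the other switch styles."""
    for style in root.iter("style"):
        style_id = style.get("id")
        if style_id in switch_ids:
            style.set(DISPLAY, "auto" if style_id == wanted_id else "none")


root = ET.fromstring(DOC)
select_style(root, "lang3", {"lang1", "lang2", "lang3"})
print({s.get("id"): s.get(DISPLAY) for s in root.iter("style")})
```

Passing the set of switch ids explicitly keeps ordinary presentation styles untouched; only the styles that act as selectors are rewritten.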

 

Sean.

 


  _____  


From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On Behalf
Of Johnb@screen.subtitling.com
Sent: 29 March 2005 06:32
To: gadams@xfsi.com; Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Glenn,

 

The 'problem' with your suggested approach **for me** is that all of the
parallel languages would share a common head section (which contains the
layout and styling elements). This would make combining languages into a
composite multi-language document difficult - imagine if the language to be
appended contains style references that match existing ones. Further -
extraction also becomes more complex, as it would be necessary/desirable to
reduce the head element to only those element instances that are referenced
by a specific language. Secondly - since the div element cannot be nested,
use of the div element to separate parallel languages, as would be logical
for my anticipated use, would effectively remove the ability to use div for
any other structural purpose (such as separating program segments).

 

Using annotations for filtering content strikes me as a rather 'weak'
approach to solving my requirement.... it also conflicts with other
potential uses for the role element - e.g. identification of the 'type' of
text it annotates (dialogue, lyrics, description).... and any profile using
a 'ttm:role' based styling mechanism.

 

I think it more likely that it will be necessary to generate a profile for
DFXP, and probably a DFXP wrapper format to handle the multi-language issue
to satisfy my (and others') requirements.

 

regards

John Birch

 

 

-----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 29 March 2005 14:30
To: Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

In point 1, I mean DFXP. You could, e.g., place parallel languages in
separate div, p, span elts, etc., although this is not a recommended usage
for interchange. Then you could use XSLT/XQuery, etc., between your archive
and over-the-air inserter (where presumably it would be transformed into
some final distribution format, e.g., DVB Subtitles). Also, you could do
something similar for annotating content to be filtered in the transform
step, e.g., ttm:role="x-adult".
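
The filtering step could look roughly like the following. This is only a sketch in Python (standing in for XSLT/XQuery) with the standard xml.etree.ElementTree; the function name drop_role is hypothetical and the metadata namespace URI is an assumption about the draft under discussion. The role value "x-adult" is the one suggested above.

```python
import xml.etree.ElementTree as ET

# Assumption: metadata namespace URI of the draft under discussion.
TTM_NS = "http://www.w3.org/2004/11/ttaf1#metadata"
ROLE = f"{{{TTM_NS}}}role"


def drop_role(parent, role):
    """Remove every descendant whose ttm:role equals `role`, keeping trailing text."""
    previous = None
    for child in list(parent):
        if child.get(ROLE) == role:
            # Re-attach the text that followed the removed element.
            if previous is None:
                parent.text = (parent.text or "") + (child.tail or "")
            else:
                previous.tail = (previous.tail or "") + (child.tail or "")
            parent.remove(child)
        else:
            drop_role(child, role)
            previous = child


p = ET.fromstring(
    f'<p xmlns:ttm="{TTM_NS}">He said '
    '<span ttm:role="x-adult">something rude</span> and left.</p>'
)
drop_role(p, "x-adult")
print(ET.tostring(p, encoding="unicode"))
```

The tail handling matters for inline content: ElementTree stores the text after an element on that element, so removing a span without re-attaching its tail would also drop the words that follow it.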

 


  _____  


From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Tuesday, March 29, 2005 8:41 AM
To: Glenn A. Adams
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Hi Glenn,

 

I'm not sure I understand your response?

 

In point 1, do you mean AFXP? cf DFXP. Alternatively, how would you suggest
structuring a multi-language DFXP document?

 

w.r.t. point 2, I have perhaps created confusion by referring to a timedtext
stream. I did not intend to imply that the content of that element was
intended for streaming in the internet sense of the word..... rather I used
the term stream as analogous to 'thread'.

 

regards 

John Birch. 

-----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 29 March 2005 14:13
To: Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

1.	In the main archive, you could have a single DFXP document that
combines languages and usages (adult/child), and then use an XSLT transform
(or XQuery) to select the portions required for a "send to air" document. 

2.	While the TTWG does consider streamability to be a necessary
property of DFXP, it drew the line at actually defining a streaming form,
which was considered out of scope; however, there is nothing to prevent a
future specification (either in or out of W3C) from defining such a form. 
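
Point 1 can be sketched as follows, with Python standing in for the XSLT or XQuery step. This is a simplified illustration: the archive layout (one div per language, no namespaces) and the function name send_to_air are assumptions made for the example, and a div without xml:lang is treated as language-neutral and kept.

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical archive document holding two parallel languages.
ARCHIVE = """<tt>
  <body>
    <div xml:lang="fr"><p>Bonjour</p></div>
    <div xml:lang="en"><p>Hello</p></div>
  </body>
</tt>"""


def send_to_air(archive_root, lang):
    """Return a copy of the archive keeping only the <div> elements for `lang`."""
    air = copy.deepcopy(archive_root)
    body = air.find("body")
    for div in list(body):
        # A div with no xml:lang is kept as language-neutral content.
        if div.tag == "div" and div.get(XML_LANG, lang) != lang:
            body.remove(div)
    return air


air = send_to_air(ET.fromstring(ARCHIVE), "en")
print([p.text for p in air.iter("p")])  # ['Hello']
```

Working on a deep copy leaves the archive document untouched, so the same archive can be transformed once per language at playout time.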

 

G.

 


  _____  


From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Tuesday, March 29, 2005 7:22 AM
To: Glenn A. Adams
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Glenn,

 

Current practice for subtitling in broadcast TV is to hold an archive of all
subtitle files for all material that has been, will be, or may be broadcast.

This can amount to many tens of thousands of files. (David can probably give
you a number for the BBC's archive!)

 

Current practice (at least for us) is to combine all individual language
files into a single multi-language package for a given program.

 

So, subtitle files are originated by subtitlers in a single language - and
transferred, QA'd and then typically combined into a multi-language 'air'
file.

These 'air' files are then held in a 'subtitle archive' that can be accessed
by the insertion systems when station automation requests the playout of a
particular piece of material. Typically for a European operation there may
be on average 4-6 languages present in each multi-language file (although we
have systems with many more languages per channel than this).

 

There are many models being discussed within the ad-hoc committee; doubtless
there will be a transition interval where DFXP content is held externally to
the media content. Indeed it may be (for operational reasons) that the
combined MXF/AAF with subtitles incorporated internally is only used as a
'between broadcaster' format - not as a near to air format.

 

So, a nominally single language DFXP could result in a proliferation of
files (probably by a factor of 4 - 8) for broadcasters. Note - we are
assuming that insertion equipment will move across to using DFXP
**directly** here.

 

By 'onerous' I mean that there are implementation issues to consider. The
increase in the number of files creates a subtle problem. The files have to
be referred to by the automation equipment. Changing from a multi-lingual
system to a single-language-per-file concept means that either the automation
system has to send multiple demands to the insertion equipment (one for each
language) - changing the whole concept of the automation interface - or the
insertion equipment has to determine which individual DFXP files constitute
the fileset for a given material reference. It is unlikely that many
broadcasters will wish to make changes to station automation... this is VERY
much an area of "If It Ain't Broke Don't Fix It" - by which I mean there is a
strong resistance to messing with such a critical aspect of a broadcaster's
operation.

 

So we can fairly safely assume that the insertion system will need to expand
a single material reference into a fileset. This in itself doesn't sound too
difficult until you consider that the system will need to be created and
maintained by human operators! At present there is one point of potential
failure - the appending of a new subtitle language 'stream' into the
archive. With the multiple files approach dictated by DFXP's limitation to
a single language, more opportunities can arise for problems.

 

So - single language DFXP increases the number of files to handle (by
perhaps a factor of 4 - 8), and the omission of a conditional content
mechanism may multiply that again....

 

BTW 

Is there any practical reason why DFXP couldn't be multi-stream, or is it
simply a philosophical issue? What (apart from the schema) prevents a DFXP
document having effectively more than one instance of the tt element
structure?

 

e.g. (introduction of element tts "timedtext stream")

 

<tt xmlns="http://www.w3.org/2004/11/ttaf1">
<tts xml:lang="fr-fr">
  <head>
    <meta/>
    <styling/>
    <layout/>
  </head>
  <body/>
</tts>

<tts xml:lang="en-uk">
  <head>
    <meta/>
    <styling/>
    <layout/>
  </head>
  <body/>
</tts>

<tts xml:lang="en-uk-caption">
  <head>
    <meta/>
    <styling/>
    <layout/>
  </head>
  <body/>
</tts>

</tt>
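
For comparison with the one-file-per-language model, the sketch below shows how such a wrapper could be split back into single-language documents. It is illustrative only: the tts element is the hypothetical one proposed in this message, the function name split_streams is likewise hypothetical, and every tts is assumed to carry xml:lang.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/2004/11/ttaf1"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Cut-down version of the wrapper above; <tts> is the proposed, hypothetical element.
WRAPPER = f"""<tt xmlns="{TT_NS}">
  <tts xml:lang="fr-fr"><head/><body/></tts>
  <tts xml:lang="en-uk"><head/><body/></tts>
</tt>"""


def split_streams(wrapper):
    """Map each xml:lang to a standalone <tt> holding that stream's head and body."""
    documents = {}
    for stream in wrapper.findall(f"{{{TT_NS}}}tts"):
        lang = stream.get(XML_LANG)
        single = ET.Element(f"{{{TT_NS}}}tt", {XML_LANG: lang})
        single.extend(list(stream))  # reuse the stream's own <head> and <body>
        documents[lang] = single
    return documents


documents = split_streams(ET.fromstring(WRAPPER))
print(sorted(documents))  # ['en-uk', 'fr-fr']
```

Because each stream carries its own head, the split needs no pruning of shared styling or layout, which is the property the wrapper proposal is after.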

 

regards 

John Birch.

 

 -----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 29 March 2005 12:28
To: Johnb@screen.subtitling.com
Cc: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

Could you describe what you mean by "subtitle archive" and "onerous to
require ..."?

 


  _____  


From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Tuesday, March 29, 2005 3:47 AM
To: Glenn A. Adams; russ.wood@softel.co.uk; public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

Glenn,

 

An issue that was discussed recently at the AAF/MXF EBU ad-hoc subtitle
committee....

 

While the generation of multiple DFXP 'files' for individual languages is an
acceptable solution, I feel there may yet be a requirement for a
'lightweight' conditional content mechanism. The specific example I have in
mind is to support the concept of viewing 'watersheds' - i.e. content
unsuitable for minors.

In this case the majority of a subtitle file would be suitable for all
viewers - but the odd word or phrase may be 'sanitised' for pre watershed
(e.g. 8.00pm) airings of the programme. It would be onerous to require a
subtitle archive to retain multiple copies of content to cater for just the
alteration of one or two words in a 1300 line subtitle file. Is there any
possibility of introducing a conditional content facility to DFXP that
would support this kind of minor use?

 

A second use of this mechanism, which might be a stretch too far, is to
support subtitle files that can be used as captions (i.e. near verbatim +
sound cues) or as subtitles. In this case the conditional content may be the
'sound cues' and possibly the replacement of some of the subtitle lines with
less accurate (but more concise!) translations.

 

best regards 

John B.

-----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 26 March 2005 05:47
To: Russ Wood; public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

DFXP supports general use of the xml:lang attribute in order to (1) specify a
default language for a document instance and (2) to annotate the language of
nested content. It is up to the author to decide how to use this mechanism.
For example, an author could potentially specify different <div/> elements
using different languages, or different <p/> elements, etc. Nonetheless, the
intention is not to explicitly support in DFXP conditional content selection
based on preferred language. In contrast, conditional content selection will
be supported in AFXP. The intent with DFXP is to have already made all
conditional selections prior to transmitting/exchanging in DFXP format. This
means that if an AFXP document supports coarse-grained conditional
selection between parallel language representations, then one may produce
multiple DFXP document instances from a single AFXP document instance, by
enumerating over the conditional parameter space (of which each permutation
may produce a distinct DFXP document instance).
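
The enumeration described above can be sketched in a few lines of Python. The parameter names and values are hypothetical, since AFXP's conditional syntax is not described in this thread; the point is only that each combination of parameter values corresponds to one potential DFXP document instance.

```python
from itertools import product

# Hypothetical condition parameters; AFXP's actual conditional syntax is not
# described in this thread.
PARAMETERS = {
    "lang": ["fr", "en"],
    "audience": ["pre-watershed", "post-watershed"],
}


def enumerate_selections(parameters):
    """Yield one dict per combination of parameter values."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


selections = list(enumerate_selections(PARAMETERS))
print(len(selections))  # 4
for selection in selections:
    print(selection)
```

The count grows multiplicatively with each added parameter, which is why the number of DFXP instances derived from one AFXP document can rise quickly.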

 

Regards,

Glenn

 


  _____  


From: Russ Wood [mailto:russ.wood@softel.co.uk] 
Sent: Monday, March 21, 2005 5:36 AM
To: public-tt@w3.org
Subject: RE: Timed Text Authoring Format - Distribution Format Exchange
Profile (DFXP) Streaming

 

3) I don't see a problem with allowing different languages in the same
document but amalgamating different language files at run time is not
difficult.

 
Received on Tuesday, 29 March 2005 16:41:24 GMT
