RE: XSL and CSS Re: Coments - last call draft

Hi Glenn,

I think I will now focus my attention on how to extend DFXP to meet my
goals.

I have concerns that AFXP will not surface (which is part of the reason for
pushing these issues so hard).

I had therefore hoped that DFXP would include more of (what I think might be
in) AFXP.

 

Please be assured that I hold the work of the TTWG in the very highest
regard. The DFXP specification as it currently stands is a great
foundation.

 

I am unable to formally join the WG - it is too expensive as a personal
undertaking - and my company has considered and rejected membership.

I am most grateful for the invitations you have extended to be a guest at
your meetings, 

but this is the only forum I really have to influence AFXP - hence the
torrent unleashed at the official launch of the DFXP draft :-)

 

with best regards,

and the greatest respect,

 

John Birch.

 

 -----Original Message-----
From: Glenn A. Adams [mailto:gadams@xfsi.com]
Sent: 04 April 2005 21:46
To: Johnb@screen.subtitling.com
Cc: public-tt@w3.org; charles@sidar.org; Alfred.S.Gilman@IEEE.org
Subject: RE: XSL and CSS Re: Coments - last call draft



 

 




From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Monday, April 04, 2005 1:09 PM
To: Glenn A. Adams
Cc: public-tt@w3.org; charles@sidar.org; Alfred.S.Gilman@IEEE.org
Subject: RE: XSL and CSS Re: Coments - last call draft

 

Glenn

See inline below.

I wrote:

The DFXP style model is quite suitable for the carriage of styled text, BUT,
in the contexts of accessibility and transcoding, the DFXP style mechanism
IMO lacks an essential ingredient, that being the reason for (or context of)
the applied style.

As an example - an author may choose yellow text on a red background for a
warning message.

The carriage of that text as simply text characters and colour codes loses
one piece of information - the fact that it was intended as a warning.

[GA] It is trivial to include arbitrary user-defined metadata in DFXP. One
can also use user-defined values for the ttm:role attribute. In both cases,
you have a means to express and interchange additional intentionality. It is
far from clear what additional standardization in this area may be warranted
in DFXP at this time.

[JB] The key words here are 'arbitrary' and 'user-defined', rather than
standardised (as in CEA 708 - 8.5.9 Caption Text Function Tags). I had hoped
that DFXP would include a formalisation of a text context mechanism that
could be associated with presentation. Note I am not suggesting that it is
in scope to define a fully inclusive set of attribute values to define all
possible text contexts (although some could be included), but I do think it
in scope to define a mechanism for effectively associating arbitrarily
complex contexts with text - the namespace mechanism can handle the issue of
defining the context. Similarly a mechanism could also be formalised that
allows the association of context with style.

 

[GA] Well, we have defined ttm:role as just such a mechanism, and we have a
set of standard values (and plan to harmonize these with the CEA 708 set of
roles). Are you suggesting additional standardized values or a different
mechanism? You can always use RDF to annotate any DFXP construct by
embedding in a <meta/> child.

 

In general, I think it ill-advised for the TT WG to undertake an attempt to
create another metadata system for capturing semantics, particularly when
there are already a number of mechanisms supported by DFXP's current
formalism.

 

I would also note that you can use a naming convention in DFXP to express
context, e.g.,
<style id="warning" tts:color="red"/>
...
<span style="warning">Don't Panic!</span>

Here - the choice of id value is arbitrary, and could only be restricted by
convention. As such, this is contrary to my concept of a universal
interchange format.

 

[GA] If you are asking for "universal semantics" interchange, then I'm
afraid you are definitely not going to get it from DFXP without adding
additional layers or profiles that bring in other metadata systems. Please
lower your expectations about DFXP, and you will feel much better about it.

 

[GA] I'm not sure what you mean by "style tagging (context)", so I cannot
say if it would be a feature or not. What is planned for AFXP is
"applicative styling", which allows using a "select" attribute on a style
element or on a group of style elements, where the value of "select" is an
XPath expression that selects content elements to which the styling is to
be supplied. I'm not sure how this relates to your phrase "style tagging".
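
As a rough illustration of the "applicative styling" idea described above, the following Python sketch applies style rules to content selected by an XPath-like "select" attribute. It is only a sketch under stated assumptions: the element and attribute names (tt, head, style, select, role, color, backgroundColor) are invented for illustration and are not taken from any DFXP or AFXP draft, and ElementTree supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names are illustrative only, not real AFXP syntax.
DOC = """
<tt>
  <head>
    <style select=".//span[@role='warning']" color="yellow" backgroundColor="red"/>
    <style select=".//span[@role='music']" fontStyle="italic"/>
  </head>
  <body>
    <p><span role="warning">Don't Panic!</span></p>
    <p><span role="music">La la la</span></p>
    <p><span>Plain text</span></p>
  </body>
</tt>
"""

def apply_styles(root):
    """Return {element: merged style properties} for every selected element."""
    body = root.find("body")
    resolved = {}
    for rule in root.findall("head/style"):
        # Everything except the selector itself is treated as a style property.
        props = {k: v for k, v in rule.attrib.items() if k != "select"}
        for target in body.findall(rule.get("select")):
            resolved.setdefault(target, {}).update(props)
    return resolved

root = ET.fromstring(DOC)
styles = apply_styles(root)
for element, props in styles.items():
    print(element.text, props)
```

In this sketch the content carries only a context (the role), and the style rules in the head decide what that context looks like; the third span is left unstyled because no rule selects it.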

Not at all I fear :-( This sounds like just another mechanism for hiding
or reducing the number of style definitions......

 

Sorry - on reflection "style tagging (context)" is a rather woolly
phrase....

 

What I am suggesting is a means of associating a context with text content,
and also associating that context with styling - such that users of the
document can associate a style with the text content (be it for
transcoding/translation or for display), where the style is
defined/influenced by the context (hierarchy). So axes on the context
'graph' might include 'role' and 'emotion' and 'prosody'. Note I am not
specifically proposing a mechanism here - just trying to describe the
concept. You might suggest that these concepts have only a peripheral role
in timed text, I would suggest that they have an incredibly valuable role in
subtitling and accessibility. 

 

[GA] Your goal will be perfectly supported by AFXP by means of a combination
of metadata (for which we will depend on you to create an additional
standard or profile regarding usage thereof) and the AFXP applicative
styling mechanism, which, using XPath, will allow you to conditionalize
application of style based on complex predicates expressed in XPath that use
both content and metadata to select stylable content. 

 

The 'rules' for associating context and style need not be applicative in the
way that CSS implements selection based styling, they can create a
pre-determined hierarchy in the head of the document. The context of text
content cannot change after authoring - so it is similar to the DFXP
referential style concept. What might change is the user's requirement for
how text associated with that context is presented (or indeed if it is). By
including more support for context - you can achieve a more acceptable
presentation of the document to a wider audience, for example the inclusion
of 'prosody' information might allow better (re-)speaking of the content.
Inclusion of 'role' allows filtering... and so on.
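
A minimal Python sketch of role-based filtering, assuming each paragraph carries a ttm:role attribute. The namespace URI is a placeholder and the role values (dialog, sound, music) are illustrative; neither is quoted from the DFXP draft.

```python
import xml.etree.ElementTree as ET

TTM = "urn:example:ttm"  # placeholder namespace URI (assumption, not the real one)

DOC = f"""
<body xmlns:ttm="{TTM}">
  <p ttm:role="dialog">Where are you going?</p>
  <p ttm:role="sound">[door slams]</p>
  <p ttm:role="music">[tense music]</p>
</body>
"""

def filter_roles(body, wanted):
    """Return the text of every <p> whose role is in the wanted set."""
    role_attr = f"{{{TTM}}}role"  # ElementTree's {namespace}name form
    return [p.text for p in body.iter("p") if p.get(role_attr) in wanted]

body = ET.fromstring(DOC)
dialog_only = filter_roles(body, {"dialog"})
everything = filter_roles(body, {"dialog", "sound", "music"})
print(dialog_only)
print(len(everything))
```

The same stored document serves a viewer who wants dialogue only and one who wants full captions; the choice is made at presentation time rather than at authoring time.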

 

[GA] We expect AFXP to support multiple internal or external style sheets
that contain the mapping; as a result, the user's requirements to flexibly
associate or change between associations of content and its style can be
effectively implemented. 

 

I guess I am disappointed that this is seen as optional - rather than as
fundamental to Timed Text in general.

 

It has been stated that:

"The intent with DFXP is to have already made all conditional selections
prior to transmitting/exchanging in DFXP format."

 

This has important implications for TV subtitles. DFXP is currently under
consideration as a foundation for containing subtitles within MXF / AAF
media packages for use in TV and Digital Cinema. While making selections
prior to transmission or exchange is reasonable, it is not so reasonable to
make these selections prior to the storage of an asset.

 

[GA] The process of asset storage and the policies applied there is
effectively outside the scope of TT AF in general. Nevertheless, you may
wish to consider use of AFXP as a potential storage asset that can be
subjected to dynamic, even real-time, mappings to DFXP for either direct
delivery or subsequent transcoding to a legacy distribution format.

 

This is because the circumstances affecting the selection may change between
the storage of the asset and its subsequent transmission. In effect this
DFXP constraint implies that using 'pure' DFXP as the storage format would
require that all possible outcomes of the selection process be stored as
separate DFXP files within the asset package - e.g. a file for each language
- plus a file for each conditional content switch (e.g. caption/subtitle,
pre-watershed/post watershed). This is sub-optimal.
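
To put a number on this, the sketch below enumerates the files needed if every selection is resolved before storage. The language list and file naming are hypothetical, and it assumes the switches are independent, so the count is their product.

```python
from itertools import product

# Assumed, illustrative values; a real asset package would define its own.
languages = ["en", "fr", "de"]
variants = ["caption", "subtitle"]
watershed = ["pre", "post"]

# One fully resolved file per combination of independent selections.
files = [
    f"programme_{lang}_{variant}_{shed}.xml"
    for lang, variant, shed in product(languages, variants, watershed)
]

print(len(files))  # 3 * 2 * 2 = 12
```

Each additional independent switch multiplies the number of stored files again.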

 

[GA] It is not sub-optimal from the perspective of the goals of DFXP or the
simplicity of its format and processing models. You are asking to expand the
scope of DFXP. 

 

Conditional content could be implemented using text context and associated
styling.

 

It should be noted that CEA-608/708, and WST (and in fact TV subtitling
formats in general) are typically not stored in these wire formats by
broadcasters, rather these wire distribution formats are created in
real-time by insertion equipment working from proprietary file formats. A
single common file format already exists as a ratified interchange standard,
EBU 3264. DFXP could replace the use of EBU 3264 - it offers a few
advantages: a) it is Unicode, b) it is XML and c) it has a more
comprehensive language tagging mechanism. However, DFXP does not offer any
significant new features over EBU 3264, and indeed there are features in
EBU 3264 that are not present in DFXP (e.g. cumulative mode and boxing).

[GA] I'm not sure what you mean by "cumulative mode" or "boxing", so I can't
say whether these are supported in DFXP or not.

[JB] Cumulative mode rests upon the concept of a 'cursor position' - such
that subsequent text can be appended to text already in view. DFXP can
emulate the output of a cumulative subtitle file, but does not necessarily
capture the fact that fragments of text form a complete subtitle (except
indirectly by virtue of the fact that they share a common end time).
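
The indirect grouping mentioned above can be sketched in Python as follows. This is a heuristic under stated assumptions: fragments are modelled as hypothetical (begin, end, text) tuples in seconds, and fragments sharing an end time are taken to belong to one subtitle, which is a convention rather than anything the format guarantees.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical fragments: (begin seconds, end seconds, text).
fragments = [
    (10.0, 14.0, "Some Text"),
    (11.5, 14.0, "Some More Text"),
    (15.0, 18.0, "Next subtitle"),
]

def regroup(frags):
    """Rebuild whole subtitles by grouping fragments that share an end time."""
    ordered = sorted(frags, key=lambda f: (f[1], f[0]))
    subtitles = []
    for end, group in groupby(ordered, key=itemgetter(1)):
        members = list(group)
        subtitles.append({
            "begin": members[0][0],  # earliest fragment starts the subtitle
            "end": end,
            "text": " ".join(text for _, _, text in members),
        })
    return subtitles

result = regroup(fragments)
for subtitle in result:
    print(subtitle)
```

Two unrelated subtitles that happen to end together would be merged by this rule, which is why an explicit grouping construct is preferable to inference from timing.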

 

[GA] Based on your explanation, I believe that DFXP does support cumulative
mode, although the details of this support have yet to be written (they will
appear in Annex B). In particular, I expect that by means of temporally
activating content that is appended to the current content of a region
undergoing dynamic flow,
e.g., by appending a <span/>, a <par/>, a <div/> into a region, then the
newly activated content can participate in the content available for flowing
into the region.

 

For example, one might have the following scenario:

 

<region id="r1">
  <style tts:overflow="dynamic"/>
  <style tts:dynamicFlow="in(word) out(line)"/>
</region>
...

<p region="r1" begin="10s" dur="10s">
  <span>Some Text</span>
  <span>Some More Text</span>
</p>

 

If the entire content of this paragraph is available for dynamic flow, then
it is as if no cumulative mode applies. However, say that you have chosen a
fragment based streaming representation of this document's infoset, e.g., by
using MPEG-7 Part 1 BiM or equivalent. In this case, you might have three
fragments that you transmit:

 

F0 - contains <p> start tag, but no children

F1 - contains 1st <span/> and its character items

F2 - contains 2nd <span/> and its character items

 

In this scenario, a streaming decoder could start the dynamic flow on the
region based upon the arrival of <p>, and then append the <span/> content
to be dynamically flowed.
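
The F0/F1/F2 scenario can be sketched in Python as below. This is not a BiM decoder: the fragment representation (a kind plus a payload) is a hypothetical stand-in, used only to show content being appended to an already active paragraph.

```python
import xml.etree.ElementTree as ET

def decode(fragments):
    """Yield the text available for flowing after each fragment arrives."""
    paragraph = None
    for kind, payload in fragments:
        if kind == "p-start":
            # F0: the paragraph becomes active with no children yet.
            paragraph = ET.Element("p", payload)
        elif kind == "span" and paragraph is not None:
            # F1, F2, ...: append new content to the active paragraph.
            span = ET.SubElement(paragraph, "span")
            span.text = payload
        if paragraph is None:
            yield ""
        else:
            yield " ".join(s.text for s in paragraph.iter("span"))

stream = [
    ("p-start", {"region": "r1", "begin": "10s", "dur": "10s"}),  # F0
    ("span", "Some Text"),                                         # F1
    ("span", "Some More Text"),                                    # F2
]

states = list(decode(stream))
for state in states:
    print(repr(state))
```

Each yielded state grows by appending to what is already in view, which is the behaviour a cumulative subtitle relies on.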

 

Conversion between a cumulative mode subtitle file, and a non cumulative
mode file represented by DFXP is thus made more difficult - since the
grouping of fragments is lost. You could adopt a 'convention' where a <p>
element always contains a complete subtitle - but this is then mixing two
concepts together, reducing the usefulness of the <p> element. This is
because conversion between presentations that allow different numbers of
displayed lines and characters requires a distinction between logical text
boundaries (paragraphs) and the arbitrary boundaries imposed on the text by
the limitations of the subtitling mechanism. So conversion between 2 row line
21 captions and 3 row Teletext captions should use <p> as a logical division
in the text - when reformatting 2 row subtitles into a 3 row format.

 

Put another way - cumulative mode is a 'cooked' way of pacing the display of
text to the user.

 

Boxing is the issue of background colour only behind glyphs, not for the
whole region (see my earlier email (sent Wed 16/03/2005 17:36) regarding
extending the values for the show-background attribute).

 

[GA] DFXP already supports separation of background color for content
separately from background color for region. In fact, there are, at present,
five distinct background colors that may apply, which, when using alpha
components and opacity, may result in a total of five degrees of background
layering. Those five are: region, body, div, p, and span.
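
As an illustration of how five background layers could combine, the Python sketch below composites them bottom to top. It assumes the standard "over" alpha-compositing rule with non-premultiplied RGBA values in the range 0 to 1; the message does not say which compositing rule DFXP intends, and the colour values are invented.

```python
def over(top, bottom):
    """Composite one RGBA colour over another (non-premultiplied, 0..1)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    alpha = ta + ba * (1 - ta)
    if alpha == 0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1 - ta)) / alpha

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), alpha)

TRANSPARENT = (0.0, 0.0, 0.0, 0.0)

# Bottom-to-top order named in the text: region, body, div, p, span.
layers = [
    (0.0, 0.0, 0.0, 1.0),  # region: opaque black
    TRANSPARENT,           # body
    TRANSPARENT,           # div
    (1.0, 0.0, 0.0, 0.5),  # p: half-transparent red
    TRANSPARENT,           # span
]

result = layers[0]
for layer in layers[1:]:
    result = over(layer, result)

print(result)
```

Setting the region to transparent and giving only the span an opaque colour would paint a background behind the text alone, which is one way to approximate boxing under these assumptions.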

 

It is not yet clear that DFXP does not support the effects you have asked
for in your proposed extension for showBackground. The TTWG will be
discussing this matter more at our upcoming F2F to determine if there are
features we want to add in DFXP to support more complex background painting
scenarios. 


A combination of extension elements and attributes and constrained document
structuring (via a sub-profile) can probably be used with DFXP to fully
represent EBU 3264 document contents - and other general TV broadcast
related subtitling issues. Indeed, it is anticipated that the use of DFXP as
an interchange mechanism for TV broadcast subtitling will require the
development of guidelines for the interpretation of DFXP documents by
transcoders. In addition it will probably require the development of a
profile to add elements and attributes to DFXP to carry information and
features currently supported by existing formats, (e.g. conditional content,
cumulative modes, background styles, embedded glyphs, subtitles as images
(DVD, DVB, Imitext)).

The pressing need is not IMO for another interchange format per se, rather
it is for a format that preserves more of the authorial intent (inc.
understanding / meaning) such that implementing transcoding, translation and
accessibility are made easier tasks than they are currently. My main
concerns are that using DFXP will encourage the continuation of the existing
practice of 'cooked text content' - that is text that has lost contextual
meaning - and that AFXP will be too complex and too late for most
implementations.

Is there a middle path for DFXP that would encourage a more context
sensitive (and accessible) role for text style? DFXP already includes a
referenced style mechanism - could that mechanism be strengthened to provide
greater support for contextual styling of text?

[GA] You are asking to expand the scope of DFXP from its express role as a
useful subset for interchange among existing legacy formats to a role
approaching AFXP. In other words, you are effectively asking the TT WG to
drop its work on a subset that could serve an immediate purpose and be a
stepping stone to a more general solution. I can't imagine the TT WG
changing its course on this point, but we will discuss your comments and
respond formally with a consensus position.

I fully understand the position of the TTWG, however, I have strong
reservations as to how effective DFXP is as a stepping stone to AFXP - when
DFXP essentially bypasses most of the 'harder' problems, that I hope AFXP
will address, and leaves no obvious placeholders for them to fit into.

 

[GA] DFXP is a stepping stone in the sense that it is a proper subset of
AFXP (as intended) and that it entails compiling information that may be
richer in AFXP into a flatter structure in DFXP. 

 

I don't want TTWG to drop the work on DFXP - far from it - but I am
uncertain as to the larger role for a format that provides the same
level of functionality as the existing legacy formats - but includes few
features that support and extend the concept of universal content.

 

[GA] Think of it as an essential step in the W3C standardization process.
It enables the TT WG to show real progress that, at least according to the
sentiments of the members of the WG, has a real and concrete value. 

 

I would be delighted however if DFXP showed a turn away from the markedly
'cooked' approach it has (to style in particular). 

 

[GA] It will not. 

 

In short, IMO DFXP is currently far too presentation centric.

 

[GA] It was designed that way.

 

DFXP would IMO be considerably more useful if it explicitly provided more
support for 'soft' styling of the text content (and promoted the concept).

 

[GA] Use metadata in combination with a complex transform of your design.

 

I believe that DFXP will be adequate to interchange the current web based
formats, and with some tweaks (by profile or convention or both) will be
able to interchange TV broadcast subtitle files. In that respect DFXP has
met its goals.

 

[GA] DFXP was not designed for interchange amongst web based formats per se,
but among SAMI, QT, RT, 3GPP TT, 608/708, and to some extent, WST, none of
which are particularly designed as web based formats.  The additions you are
asking for, from my understanding, go considerably beyond "some tweaks".
Please feel free to develop profiles or standardized conventions on top of
DFXP.

 

Finally - most of these concepts that I am alluding to are not present in
any existing legacy formats, I wish that they were.

 

[GA] Which is precisely why they have been given less priority. While the
task of defining support for new features is very interesting, it takes
considerable time and effort, and requires direct and constant
representation by the parties that advocate the features. Ultimately, the
results of the TT WG are based upon what its members are willing to invest
in time and money. The membership is always open to new participants that
advocate their specific interests.

 

Subtitle file formats are typically cooked - the text smashed into
arbitrary units (subtitles) with hard styling applied. It is my frustration
at dealing with the conversion of these files between systems/formats that
has prompted my 'crusade' for more abstraction within DFXP/AFXP.

 

[GA] Please focus the attention of your crusade upon AFXP, and please join
the TT WG if you want to see your goals implemented.

 

I guess I am just disappointed that DFXP is unlikely to make these issues
any easier and concerned that standard (non-profiled) DFXP will perpetuate
the problem by becoming adopted as yet another cooked format.

 

[GA] DFXP was designed explicitly to be a cooked format to facilitate
interchange amongst other cooked formats. This is precisely what the TT WG
was chartered to accomplish. As the chairman of the WG, I have to insist
that the group focus its very limited resources on accomplishing only and no
more than our chartered work, and do so in a very expeditious process that
emphasizes results over the design of new features and new technology,
however nice it would be to accomplish.

 


best regards
John Birch.

Received on Tuesday, 5 April 2005 14:53:42 UTC