RE: XSL and CSS Re: Coments - last call draft

 

 

From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
Sent: Monday, April 04, 2005 1:09 PM
To: Glenn A. Adams
Cc: public-tt@w3.org; charles@sidar.org; Alfred.S.Gilman@IEEE.org
Subject: RE: XSL and CSS Re: Coments - last call draft

 

Glenn

See inline below.

	I wrote:

	The DFXP style model is quite suitable for the carriage of
styled text, BUT, in the contexts of accessibility and transcoding, the
DFXP style mechanism IMO lacks an essential ingredient, that being the
reason for (or context of) the applied style.
	
	As an example - an author may choose yellow text on a red
background for a warning message.
	
	The carriage of that text as simply text characters and colour
codes loses one piece of information - the fact that it was intended as
a warning.
	
	[GA] It is trivial to include arbitrary user-defined metadata in
DFXP. One can also use user-defined values for the ttm:role attribute.
In both cases, you have a means to express and interchange additional
intentionality. It is far from clear what additional standardization in
this area may be warranted in DFXP at this time.

	[JB] The key words here are 'arbitrary' and 'user-defined',
rather than standardised (as in CEA 708 - 8.5.9 Caption Text Function
Tags). I had hoped that DFXP would include a formalisation of a text
context mechanism that could be associated with presentation. Note I am
not suggesting that it is in scope to define a fully inclusive set of
attribute values to define all possible text contexts (although some
could be included), but I do think it in scope to define a mechanism for
effectively associating arbitrarily complex contexts with text - the
namespace mechanism can handle the issue of defining the context.
Similarly a mechanism could also be formalised that allows the
association of context with style.

	 

	[GA] Well, we have defined ttm:role as just such a mechanism,
and we have a set of standard values (and plan to harmonize these with
the CEA 708 set of roles). Are you suggesting additional standardized
values or a different mechanism? You can always use RDF to annotate any
DFXP construct by embedding in a <meta/> child.

	 

	In general, I think it ill-advised for the TT WG to undertake
an attempt to create another metadata system for capturing semantics,
particularly when there are already a number of mechanisms supported by
DFXP's current formalism.

	 

	I would also note that you can use a naming convention in DFXP
to express context, e.g.,
	<style id="warning" tts:color="red"/>
	...
	<span style="warning">Don't Panic!</span>

	Here - the choice of id value is arbitrary, and could only be
restricted by convention. As such, this is contrary to my concept of a
universal interchange format.

	 

	[GA] If you are asking for "universal semantics" interchange,
then I'm afraid you are definitely not going to get it from DFXP without
adding additional layers or profiles that bring in other metadata
systems. Please lower your expectations about DFXP, and you will feel
much better about it.

	 

	[GA] I'm not sure what you mean by "style tagging (context)", so
I cannot say if it would be a feature or not. What is planned for AFXP
is "applicative styling", which allows using a "select" attribute on a
style element or on a group of style elements, where the value of
"select" is an XPath expression that selects the content elements to
which the styling is to be applied. I'm not sure how this relates to your
phrase "style tagging".

	Not at all I fear :-( This sounds like just another mechanism
for hiding or reducing the number of style definitions......

	 

	Sorry - on reflection "style tagging (context)" is a rather
woolly phrase....

	 

	What I am suggesting is a means of associating a context with
text content, and also associating that context with styling - such that
users of the document can associate a style with the text content (be it
for transcoding/translation or for display), where the style is
defined/influenced by the context (hierarchy). So axes on the context
'graph' might include 'role' and 'emotion' and 'prosody'. Note I am not
specifically proposing a mechanism here - just trying to describe the
concept. You might suggest that these concepts have only a peripheral
role in timed text, I would suggest that they have an incredibly
valuable role in subtitling and accessibility. 

	 

	[GA] Your goal will be perfectly supported by AFXP by means of a
combination of metadata (for which we will depend on you to create an
additional standard or profile regarding usage thereof) and the AFXP
applicative styling mechanism, which, using XPath, will allow you to
conditionalize application of style based on complex predicates
expressed in XPath that use both content and metadata to select stylable
content. 
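
	As a rough illustration of the applicative idea described above,
the sketch below pairs XPath selectors with styles and applies them to
content carrying a role. It is not AFXP syntax (none is given in this
thread): the un-namespaced "role" attribute, the RULES table and the
apply_rules helper are all hypothetical, and only the limited XPath
subset of Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical content: a plain "role" attribute stands in
# for whatever metadata a real profile would define.
DOC = """
<body>
  <p><span role="warning">Don't Panic!</span></p>
  <p><span role="dialog">Mostly harmless.</span></p>
</body>
"""

# Hypothetical rule table: an XPath selector paired with the style to apply.
RULES = [
    (".//span[@role='warning']", {"color": "yellow", "backgroundColor": "red"}),
]


def apply_rules(root, rules):
    """Return (text, style) pairs, merging every rule whose selector matches."""
    styled = {}
    for select, style in rules:
        for elem in root.findall(select):
            styled.setdefault(elem, {}).update(style)
    return [(span.text, styled.get(span, {})) for span in root.iter("span")]


root = ET.fromstring(DOC)
result = apply_rules(root, RULES)
for text, style in result:
    print(text, style)
```

	Swapping in a different RULES table changes the presentation
without touching the content, which is the separation being asked for.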

	 

	The 'rules' for associating context and style need not be
applicative in the way that CSS implements selection based styling, they
can create a pre-determined hierarchy in the head of the document. The
context of text content cannot change after authoring - so it is similar
to the DFXP referential style concept. What might change is the user's
requirement for how text associated with that context is presented (or
indeed if it is). By including more support for context - you can
achieve a more acceptable presentation of the document to a wider
audience, for example the inclusion of 'prosody' information might allow
better (re-)speaking of the content. Inclusion of 'role' allows
filtering... and so on.

	 

	[GA] We expect AFXP to support multiple internal or external
style sheets that contain the mapping; as a result, the user's
requirements to flexibly associate or change between associations of
content and its style can be effectively implemented. 

	 

	I guess I am disappointed that this is seen as optional - rather
than as fundamental to Timed Text in general.

	 

	It has been stated that:

	"The intent with DFXP is to have already made all conditional
selections prior to transmitting/exchanging in DFXP format."

	 

	This has important implications for TV subtitles. DFXP is
currently under consideration as a foundation for containing subtitles
within MXF / AAF media packages for use in TV and Digital Cinema. While
making selections prior to transmission or exchange is reasonable, it is
not so reasonable to make these selections prior to the storage of an
asset.

	 

	[GA] The process of asset storage and the policies applied there
is effectively outside the scope of TT AF in general. Nevertheless, you
may wish to consider use of AFXP as a potential storage asset that can
be subjected to dynamic, even real-time, mappings to DFXP for either
direct delivery or subsequent transcoding to a legacy distribution
format.

	 

	This is because the circumstances affecting the selection may
change between the storage of the asset and its subsequent transmission.
In effect this DFXP constraint implies that using 'pure' DFXP as the
storage format would require that all possible outcomes of the selection
process be stored as separate DFXP files within the asset package - e.g.
a file for each language - plus a file for each conditional content
switch (e.g. caption/subtitle, pre-watershed/post watershed). This is
sub-optimal.
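
	To put a number on this, the sketch below counts the separate
files a package would need if every selection were resolved before
storage. The three languages are an assumption made for the example; the
two switches are the ones named above.

```python
from itertools import product

# Assumed for illustration: three languages in the asset package.
languages = ["en", "fr", "de"]
# The two conditional content switches mentioned above.
text_kind = ["caption", "subtitle"]
watershed = ["pre-watershed", "post-watershed"]

# One pre-resolved file is needed per combination of choices.
variants = list(product(languages, text_kind, watershed))
file_count = len(variants)

print(file_count, "files, for example", variants[0])
```

	Each further two-way switch doubles the count again.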

	 

	[GA] It is not sub-optimal from the perspective of the goals of
DFXP or the simplicity of its format and processing models. You are
asking to expand the scope of DFXP. 

	 

	Conditional content could be implemented using text context and
associated styling.

	 

	It should be noted that CEA-608/708 and WST (and in fact TV
subtitling formats in general) are typically not stored in these wire
formats by broadcasters; rather, these wire distribution formats are
created in real-time by insertion equipment working from proprietary
file formats. A single common file format already exists as a ratified
interchange standard, EBU 3264. DFXP could replace the use of EBU 3264 -
it offers a few advantages: a) it is Unicode, b) it is XML and c) it
has a more comprehensive language tagging mechanism. However, DFXP does
not offer any significant new features over EBU 3264, and indeed there
are features in EBU 3264 that are not present in DFXP (e.g. cumulative
mode and boxing).
	
	[GA] I'm not sure what you mean by "cumulative mode" or
"boxing", so I can't say whether these are supported in DFXP or not.

	[JB] Cumulative mode rests upon the concept of a 'cursor
position' - such that subsequent text can be appended to text already in
view. DFXP can emulate the output of a cumulative subtitle file, but
does not necessarily capture the fact that fragments of text form a
complete subtitle (except indirectly by virtue of the fact that they
share a common end time).
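
	The indirect recovery mentioned above can be sketched as
follows: fragments are regrouped into subtitles purely by their shared
end time. The (begin, end, text) tuples and the regroup helper are
hypothetical, and the approach is only a heuristic, since two unrelated
subtitles could happen to end at the same instant.

```python
from itertools import groupby

# Hypothetical flattened fragments: (begin seconds, end seconds, text).
fragments = [
    (10.0, 14.0, "Some Text"),
    (12.0, 14.0, "Some More Text"),
    (15.0, 18.0, "Next subtitle"),
]


def regroup(fragments):
    """Group fragments that share an end time, ordered by begin time."""
    ordered = sorted(fragments, key=lambda f: (f[1], f[0]))
    return [
        [text for _, _, text in group]
        for _, group in groupby(ordered, key=lambda f: f[1])
    ]


subtitles = regroup(fragments)
print(subtitles)
```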

	 

	[GA] Based on your explanation, I believe that DFXP does support
cumulative mode, although the details of this support have yet to be
written up (they will appear in Annex B). In particular, I expect that
by temporally activating content that is appended to the current content
of a region undergoing dynamic flow, e.g., by appending a <span/>, a
<p/>, or a <div/> into a region, the newly activated content can
participate in the content available for flowing into the region.

	 

	For example, one might have the following scenario:

	 

	<region id="r1">
	  <style tts:overflow="dynamic"/>
	  <style tts:dynamicFlow="in(word) out(line)"/>
	</region>
	...

	<p region="r1" begin="10s" dur="10s">
	  <span>Some Text</span>
	  <span>Some More Text</span>
	</p>

	 

	If the entire content of this paragraph is available for dynamic
flow, then it is as if no cumulative mode applies. However, say that you
have chosen a fragment based streaming representation of this document's
infoset, e.g., by using MPEG-7 Part 1 BiM or equivalent. In this case,
you might have three fragments that you transmit:

	 

	F0 - contains <p> start tag, but no children

	F1 - contains 1st <span/> and its character items

	F2 - contains 2nd <span/> and its character items

	 

	In this scenario, a streaming decoder could start the dynamic
flow on the region upon the arrival of <p>, and then append the <span/>
content as it arrives, so that it too is dynamically flowed.
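
	A toy version of such a decoder is sketched below. It is only an
illustration: real BiM fragments are binary, so plain XML strings stand
in for F0, F1 and F2, and the RegionDecoder class is hypothetical rather
than part of any DFXP processing model.

```python
import xml.etree.ElementTree as ET

# XML strings standing in for the three transmitted fragments.
F0 = '<p region="r1" begin="10s" dur="10s"/>'
F1 = "<span>Some Text</span>"
F2 = "<span>Some More Text</span>"


class RegionDecoder:
    """Accumulates fragments and reports the text available for flow."""

    def __init__(self):
        self.paragraph = None

    def receive(self, fragment):
        elem = ET.fromstring(fragment)
        if elem.tag == "p":
            # Arrival of <p> opens the paragraph and could start the flow.
            self.paragraph = elem
        elif elem.tag == "span" and self.paragraph is not None:
            # Later spans are appended to the content already in view.
            self.paragraph.append(elem)
        return self.visible_text()

    def visible_text(self):
        if self.paragraph is None:
            return ""
        return " ".join(span.text or "" for span in self.paragraph.iter("span"))


decoder = RegionDecoder()
states = [decoder.receive(fragment) for fragment in (F0, F1, F2)]
print(states)
```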

	 

	Conversion between a cumulative mode subtitle file and a
non-cumulative mode file represented by DFXP is thus made more difficult,
since the grouping of fragments is lost. You could adopt a 'convention'
where a <p> element always contains a complete subtitle - but this is
then mixing two concepts together, reducing the usefulness of the <p>
element. This is because conversion between presentations that allow
different numbers of displayed lines and characters requires a
distinction between logical text boundaries (paragraphs) and the
arbitrary boundaries imposed on the text by the limitations of the
subtitling mechanism. So conversion between 2 row line 21 captions and 3
row Teletext captions should use <p> as a logical division in the text
when reformatting 2 row subtitles into a 3 row format.

	 

	Put another way - cumulative mode is a 'cooked' way of pacing
the display of text to the user.

	 

	Boxing is the issue of background colour only behind glyphs, not
for the whole region (see my earlier email (sent Wed 16/03/2005 17:36)
regarding extending the values for the show-background attribute).

	 

	[GA] DFXP already supports a background color for content that
is separate from the background color for the region. In fact, there
are, at present, five distinct background colors that may apply, which,
when using alpha components and opacity, may result in a total of five
degrees of background layering. Those five are: region, body, div, p,
and span.
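
	How five such layers might combine can be sketched with a
standard "over" compositing step. This is an assumption made for
illustration: the thread does not say which blending model DFXP uses,
and the colour values and helper names below are hypothetical.

```python
def over(top, bottom):
    """Composite top over bottom; colours are non-premultiplied (r, g, b, a) in 0..1."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    alpha = ta + ba * (1 - ta)
    if alpha == 0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1 - ta)) / alpha

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), alpha)


def flatten(layers):
    """Paint layers in order, outermost first."""
    result = layers[0]
    for layer in layers[1:]:
        result = over(layer, result)
    return result


# Hypothetical backgrounds, outermost to innermost.
layers = [
    (1.0, 0.0, 0.0, 1.0),  # region: opaque red
    (0.0, 0.0, 0.0, 0.0),  # body: transparent
    (0.0, 0.0, 0.0, 0.0),  # div: transparent
    (0.0, 0.0, 0.0, 0.0),  # p: transparent
    (0.0, 0.0, 0.0, 0.5),  # span: half-transparent black
]

final = flatten(layers)
print(final)
```

	Under this assumption the span's box darkens the region colour
only where the span is painted, which is the effect boxing needs.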

	 

	It is not yet clear that DFXP does not support the effects you
have asked for in your proposed extension for showBackground. The TTWG
will be discussing this matter more at our upcoming F2F to determine if
there are features we want to add in DFXP to support more complex
background painting scenarios. 

	
	A combination of extension elements and attributes and
constrained document structuring (via a sub-profile) can probably be
used with DFXP to fully represent EBU 3264 document contents - and other
general TV broadcast related subtitling issues. Indeed, it is
anticipated that the use of DFXP as an interchange mechanism for TV
broadcast subtitling will require the development of guidelines for the
interpretation of DFXP documents by transcoders. In addition it will
probably require the development of a profile to add elements and
attributes to DFXP to carry information and features currently supported
by existing formats, (e.g. conditional content, cumulative modes,
background styles, embedded glyphs, subtitles as images (DVD, DVB,
Imitext)).
	
	The pressing need is not IMO for another interchange format per
se, rather it is for a format that preserves more of the authorial
intent (inc. understanding / meaning) such that implementing
transcoding, translation and accessibility are made easier tasks than
they are currently. My main concerns are that using DFXP will encourage
the continuation of the existing practice of 'cooked text content' -
that is text that has lost contextual meaning - and that AFXP will be
too complex and too late for most implementations.
	
	Is there a middle path for DFXP that would encourage a more
context sensitive (and accessible) role for text style? DFXP already
includes a referenced style mechanism - could that mechanism be
strengthened to provide greater support for contextual styling of text?
	
	[GA] You are asking to expand the scope of DFXP from its express
role as a useful subset for interchange among existing legacy formats to
a role approaching AFXP. In other words, you are effectively asking the
TT WG to drop its work on a subset that could serve an immediate purpose
and be a stepping stone to a more general solution. I can't imagine the
TT WG changing its course on this point, but we will discuss your
comments and respond formally with a consensus position.

	I fully understand the position of the TTWG, however, I have
strong reservations as to how effective DFXP is as a stepping stone to
AFXP - when DFXP essentially bypasses most of the 'harder' problems,
that I hope AFXP will address, and leaves no obvious placeholders for
them to fit into.

	 

	[GA] DFXP is a stepping stone in the sense that it is a proper
subset of AFXP (as intended) and that it entails compiling information
that may be richer in AFXP into a flatter structure in DFXP. 

	 

	I don't want TTWG to drop the work on DFXP - far from it - but I
am uncertain as to the larger role for a format that provides the
same level of functionality as the existing legacy formats - but
includes few features that support and extend the concept of universal
content.

	 

	[GA] Think of it as an essential step in the W3C
standardization process. It enables the TT WG to show real progress
that, at least according to the sentiments of the members of the WG, has
a real and concrete value. 

	 

	I would be delighted however if DFXP showed a turn away from the
markedly 'cooked' approach it has (to style in particular). 

	 

	[GA] It will not. 

	 

	In short, IMO DFXP is currently far too presentation-centric.

	 

	[GA] It was designed that way.

	 

	DFXP would IMO be considerably more useful if it explicitly
provided more support for 'soft' styling of the text content (and
promoted the concept).

	 

	[GA] Use metadata in combination with a complex transform of
your design.

	 

	I believe that DFXP will be adequate to interchange the current
web based formats, and with some tweaks (by profile or convention or
both) will be able to interchange TV broadcast subtitle files. In that
respect DFXP has met its goals.

	 

	[GA] DFXP was not designed for interchange amongst web based
formats per se, but among SAMI, QT, RT, 3GPP TT, 608/708, and to some
extent, WST, none of which are particularly designed as web based
formats.  The additions you are asking for, from my understanding, go
considerably beyond "some tweaks". Please feel free to develop profiles
or standardized conventions on top of DFXP.

	 

	Finally - most of these concepts that I am alluding to are not
present in any existing legacy formats, I wish that they were.

	 

	[GA] Which is precisely why they have been given less priority.
While the task of defining support for new features is very interesting,
it takes considerable time and effort, and requires direct and constant
representation by the parties that advocate the features. Ultimately,
the results of the TT WG are based upon what its members are willing to
invest in time and money. The membership is always open to new
participants that advocate their specific interests.

	 

	Subtitle file formats typically are cooked - the text smashed
into arbitrary units (subtitles) with hard styling applied. It is my
frustration at dealing with the conversion of these files between
systems/formats that has prompted my 'crusade' for more abstraction
within DFXP/AFXP.

	 

	[GA] Please focus the attention of your crusade upon AFXP, and
please join the TT WG if you want to see your goals implemented.

	 

	I guess I am just disappointed that DFXP is unlikely to make
these issues any easier and concerned that standard (non-profiled) DFXP
will perpetuate the problem by becoming adopted as yet another cooked
format.

	 

	[GA] DFXP was designed explicitly to be a cooked format to
facilitate interchange amongst other cooked formats. This is precisely
what the TT WG was chartered to accomplish. As the chairman of the WG, I
have to insist that the group focus its very limited resources on
accomplishing only and no more than our chartered work, and do so in a
very expeditious process that emphasizes results over the design of new
features and new technology, however nice those would be to accomplish.

	 

	
	best regards
	John Birch.

Received on Monday, 4 April 2005 20:46:24 UTC