RE: XSL and CSS Re: Coments - last call draft from Sean Hayes on 2005-04-05 (public-tt@w3.org from April 2005)

From: Sean Hayes <shayes@microsoft.com>
Date: Tue, 5 Apr 2005 11:11:29 +0100
To: <Johnb@screen.subtitling.com>, <gadams@xfsi.com>
Cc: <public-tt@w3.org>, <charles@sidar.org>, <Alfred.S.Gilman@IEEE.org>
Message-ID: <2E8E7EA6DA6DF24F853D296CAB3BB31B02063492@EUR-MSG-11.europe.corp.microsoft.com>
Annotating the document does not imply or require the use of style or
timing in AFXP; those are options after the fact. We looked at providing
a mechanism for describing the semantics of captions but, as you and I
know, documenting human communication is an open-ended and ultimately
subjective activity. If we attempted (as some have) to solve this, we
would be tied up for years, so it was ruled completely out of scope for
TT. What we decided to do instead was to allow a simple mechanism for
including whatever semantic notation you wish to adopt, and then to use
it in a practical fashion. I don't feel we can do much more. We added a
few simple tags to do simple things, but a comprehensive solution, if it
is even possible, is within the scope of another body.

 

The following document demonstrates how you might use TT to support an
in-house process:

 

<tt xmlns:semantic="www.screen.subtitling.com/idealSemanticNotation">

            <head>

                        <meta >

                                    <semantic:context id="JohnBTalkingToW3CTT"> blah blah blah </semantic:context>

                        </meta>

            </head>

            <body>

                        <div semantic:contextUse="JohnBTalkingToW3CTT">

                                    <p> IMO applicative style (be it CSS
or XSL-FO/XSLT) is flawed. It requires a priori knowledge of the
arbitrary values chosen for class names and metadata attributes in any
style sheet or processor. Further it mixes two domains together, that of
style and that of context, leading to a dilution of the context aspect -
typically only sufficient context (in the form of class tags or
metadata) is applied to a document to meet the requirements of a
specific instance of a style processor, and when it is desired to
re-purpose the content based upon the context, it is often discovered
that insufficient context information exists in the original document.

                                    </p>

                        </div>

            </body>

</tt>

 

At this point we have not stepped outside the purely semantic realm.

 

If we now want to apply timing or styling to this, we can use the
contextUse attribute to apply either a visual style or a temporal
behaviour based on the semantic context as I outlined before.
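
As a rough sketch of that step (this is not AFXP syntax, only an
illustration of the lookup), the Python below reads a cut-down version
of the document above and picks a style for each div from its
contextUse value. The STYLE_BY_CONTEXT table, the property names in it
and the resolve_styles helper are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace string copied from the example document; everything else
# below is hypothetical and exists only for illustration.
SEMANTIC_NS = "www.screen.subtitling.com/idealSemanticNotation"

DOC = f"""
<tt xmlns:semantic="{SEMANTIC_NS}">
  <head><meta><semantic:context id="JohnBTalkingToW3CTT">blah</semantic:context></meta></head>
  <body>
    <div semantic:contextUse="JohnBTalkingToW3CTT"><p>IMO applicative style is flawed.</p></div>
    <div><p>Untagged text.</p></div>
  </body>
</tt>
"""

# Hypothetical mapping from a semantic context id to presentation properties.
STYLE_BY_CONTEXT = {
    "JohnBTalkingToW3CTT": {"color": "yellow", "backgroundColor": "red"},
}


def resolve_styles(xml_text, style_by_context):
    """Return (text, style) pairs for each div, chosen by its contextUse value."""
    root = ET.fromstring(xml_text)
    attr = f"{{{SEMANTIC_NS}}}contextUse"  # Clark notation: {namespace}localname
    resolved = []
    for div in root.iter("div"):
        context_id = (div.get(attr) or "").strip()
        style = style_by_context.get(context_id, {})  # no context, no style
        text = "".join(div.itertext()).strip()
        resolved.append((text, style))
    return resolved


for text, style in resolve_styles(DOC, STYLE_BY_CONTEXT):
    print(text, style)
```

The document itself carries only the semantic annotation; the style
table lives outside it and could be swapped per market or per device.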

 

As to the last point, the simple answer is convenience and practicality.
With the same justification you might not define any standard XML syntax
for anything, since XSLT is a completely general mechanism. But what you
suggest would imply processing the XML through XSLT at every timecode
change: the amount of processing power required would be enormous, and
the XSLT horrendous to write.

 

________________________________

From: public-tt-request@w3.org [mailto:public-tt-request@w3.org] On
Behalf Of Johnb@screen.subtitling.com
Sent: 05 April 2005 02:22
To: Sean Hayes; Johnb@screen.subtitling.com; gadams@xfsi.com
Cc: public-tt@w3.org; charles@sidar.org; Alfred.S.Gilman@IEEE.org
Subject: RE: XSL and CSS Re: Coments - last call draft

 

Sean,

 

It is clear that DFXP will not satisfy my desires for a more semantic
text format.

 

However, using applicative style in AFXP still does not get at the 'nub'
of this issue.

Applicative style is still 'presentation' centric. You need to move away
from thinking about what the text will look like when displayed, and
think about what the text is. It is this semantic, and a mechanism for
storing it in a standard manner, that I wish to see in a timed text
standard. The manner in which the text is displayed is totally secondary
to the reason for the text. So I envisage a timed text document that
contained timing (which could crudely simulate prosody), text codepoints
and text context with no style whatsoever. Style could then be applied
by a UA or transcoding process (choose your favourite tool here) into
DFXP or any other format. Should there be a requirement within a format
to force style upon the user, or through any transcoding process (e.g.
for trademark or colour matching purposes), the style could be cooked
into the format using inline or referential style mechanisms (or some
XPath-based scheme).

 

IMO applicative style (be it CSS or XSL-FO/XSLT) is flawed. It requires
a priori knowledge of the arbitrary values chosen for class names and
metadata attributes in any style sheet or processor. Further it mixes
two domains together, that of style and that of context, leading to a
dilution of the context aspect - typically only sufficient context (in
the form of class tags or metadata) is applied to a document to meet the
requirements of a specific instance of a style processor, and when it is
desired to re-purpose the content based upon the context, it is often
discovered that insufficient context information exists in the original
document.

 

I appreciate that applicative styling can be used in the way you
describe, and if it is all that is available then it would be used. But
it is not targeting the issue precisely. Like the cumulative style
issue, your suggestion achieves the correct display result for a
specific example, but does not fully address my point. Perhaps it is
still not completely clear what I am getting at. Do you recall the
discussions we had about temporal flow (display flow)? Initially there
were many examples given by the TTWG of how similar results could be
achieved using elaborate timing hierarchies, but the use of a few
attributes as hints gets to the heart of the issue of how to flow text
through a small display region over a period of time. What I am talking
about is a similar thing for style. Instead of thinking in terms of how
an XML tree can be 'decorated' with tags which can then be used to force
the display desired by a user, think about the reasons the user wants to
decorate the text and provide a strong mechanism to capture those
reasons instead.

 

BTW, why bother to put applicative styling inside AFXP, when it could be
achieved by external XSLT processing?

 

regards 

John Birch.

	-----Original Message-----
	From: Sean Hayes [mailto:shayes@microsoft.com]
	Sent: 04 April 2005 18:46
	To: Johnb@screen.subtitling.com; gadams@xfsi.com
	Cc: public-tt@w3.org; charles@sidar.org;
Alfred.S.Gilman@IEEE.org
	Subject: RE: XSL and CSS Re: Coments - last call draft

	It seems fairly clear that DFXP is not going to be useful for
John. DFXP was designed as a very specific subset of TT for a very
specific purpose, and not necessarily as a 'stepping stone' to AFXP, so
I don't wish to labor that point here, although I also would like more
of the AFXP features in DFXP. However, if AFXP is not suitable for
John's requirements then I think we have more of a problem.

	 

	However, I believe the use of applicative style/timing in AFXP
does achieve exactly what you are asking for.

	 

	In the scenario that you want warnings in your titles, you would
annotate them either using ttm:role='x-warning' (or an attribute in some
other namespace or elements in a <meta> block). Since AFXP styling uses
fairly general XPaths, you can then apply a style like so:

	 

	<style select="//*[@ttm:role='x-warning']"
tts:background-color='red' tts:color='yellow'.../>
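
To show what such a select expression picks out, here is a minimal
Python sketch using the standard library's limited XPath support. The
namespace URI is a placeholder rather than the real TT metadata
namespace, and the sample captions and the select_warnings helper are
invented for the example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, NOT the real TT metadata namespace.
NS = {"ttm": "urn:example:tt-metadata"}

DOC = """
<tt xmlns:ttm="urn:example:tt-metadata">
  <body>
    <p ttm:role="x-warning">Mind the gap.</p>
    <p>Ordinary caption.</p>
    <p><span ttm:role="x-warning">Wet floor.</span></p>
  </body>
</tt>
"""

# Presentation for warnings, mirroring the style rule quoted above.
WARNING_STYLE = {"background-color": "red", "color": "yellow"}


def select_warnings(xml_text):
    """Return (tag, text) for every element whose ttm:role is 'x-warning'."""
    root = ET.fromstring(xml_text)
    # ElementTree accepts './/*[@prefix:attr="value"]' with a prefix map.
    matches = root.findall(".//*[@ttm:role='x-warning']", NS)
    return [(el.tag, el.text) for el in matches]


for tag, text in select_warnings(DOC):
    print(tag, repr(text), WARNING_STYLE)
```

The selection depends only on the role annotation, so the same content
can be given a different WARNING_STYLE without touching the captions.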

	 

	With conditional inclusion of stylesheets, or conditions within
stylesheets, you can then make warnings market-specific.

	 

	Similarly timing can be applied in exactly the same fashion, so
that content can appear (or not), move around etc. based on the
attributes you provide.

	 

	This mechanism is very flexible, general and fundamental to AFXP
TT, and not just a mechanism for hiding or reducing definitions.

	 

	
________________________________


	From: public-tt-request@w3.org [mailto:public-tt-request@w3.org]
On Behalf Of Johnb@screen.subtitling.com
	Sent: 04 April 2005 10:09
	To: gadams@xfsi.com
	Cc: public-tt@w3.org; charles@sidar.org;
Alfred.S.Gilman@IEEE.org
	Subject: RE: XSL and CSS Re: Coments - last call draft

	 

	Glenn
	
	See inline below.

		I wrote:

		The DFXP style model is quite suitable for the carriage
of styled text, BUT, in the contexts of accessibility and transcoding,
the DFXP style mechanism IMO lacks an essential ingredient, that being
the reason for (or context of) the applied style.
		
		As an example - an author may choose yellow text on a
red background for a warning message.
		
		The carriage of that text as simply text characters and
colour codes loses one piece of information - the fact that it was
intended as a warning.
		
		[GA] It is trivial to include arbitrary user-defined
metadata in DFXP. One can also use user-defined values for the ttm:role
attribute. In both cases, you have a means to express and interchange
additional intentionality. It is far from clear what additional
standardization in this area may be warranted in DFXP at this time.

		[JB]The key words here are 'arbitrary' and
'user-defined', rather than standardised (as in CEA 708 - 8.5.9 Caption
Text Function Tags). I had hoped that DFXP would include a formalisation
of a text context mechanism that could be associated with presentation.
Note I am not suggesting that it is in scope to define a fully inclusive
set of attribute values to define all possible text contexts (although
some could be included), but I do think it in scope to define a
mechanism for effectively associating arbitrarily complex contexts with
text - the namespace mechanism can handle the issue of defining the
context. Similarly a mechanism could also be formalised that allows the
association of context with style.

		 

		I would also note that you can use a naming convention
in DFXP to express context, e.g.,
		<style id="warning" tts:color="red"/>
		...
		<span style="warning">Don't Panic!</span>

		Here - the choice of id value is arbitrary, and could
only be restricted by convention. As such, this is contrary to my
concept of a universal interchange format.

		 

		[GA] I'm not sure what you mean by "style tagging
(context)", so I cannot say if it would be a feature or not. What is
planned for AFXP is "applicative styling", which allows using a "select"
attribute on a style element or on a group of style elements, where the
value of "select" is an XPath expression that selects content elements
to which the styling is to be applied. I'm not sure how this relates
to your phrase "style tagging".

		Not at all I fear :-( This sounds like just another
mechanism for hiding or reducing the number of style definitions......

		 

		Sorry - on reflection "style tagging (context)" is a
rather woolly phrase....

		 

		What I am suggesting is a means of associating a context
with text content, and also associating that context with styling - such
that users of the document can associate a style with the text content
(be it for transcoding/translation or for display), where the style is
defined/influenced by the context (hierarchy). So axes on the context
'graph' might include 'role' and 'emotion' and 'prosody'. Note I am not
specifically proposing a mechanism here - just trying to describe the
concept. You might suggest that these concepts have only a peripheral
role in timed text, I would suggest that they have an incredibly
valuable role in subtitling and accessibility. 

		 

		The 'rules' for associating context and style need not
be applicative in the way that CSS implements selection based styling,
they can create a pre-determined hierarchy in the head of the document.
The context of text content cannot change after authoring - so it is
similar to the DFXP referential style concept. What might change is the
user's requirement for how text associated with that context is presented
(or indeed if it is). By including more support for context - you can
achieve a more acceptable presentation of the document to a wider
audience, for example the inclusion of 'prosody' information might allow
better (re-)speaking of the content. Inclusion of 'role' allows
filtering... and so on.

		 

		I guess I am disappointed that this is seen as optional
- rather than as fundamental to Timed Text in general.

		 

		It has been stated that:

		"The intent with DFXP is to have already made all
conditional selections prior to transmitting/exchanging in DFXP format."

		 

		This has important implications for TV subtitles. DFXP
is currently under consideration as a foundation for containing
subtitles within MXF / AAF media packages for use in TV and Digital
Cinema. While making selections prior to transmission or exchange is
reasonable, it is not so reasonable to make these selections prior to
the storage of an asset. This is because the circumstances affecting the
selection may change between the storage of the asset and its subsequent
transmission. In effect this DFXP constraint implies that using 'pure'
DFXP as the storage format would require that all possible outcomes of
the selection process be stored as separate DFXP files within the asset
package - e.g. a file for each language - plus a file for each
conditional content switch (e.g. caption/subtitle,
pre-watershed/post-watershed). This is sub-optimal.
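
To put a rough number on that, the Python sketch below enumerates the
pre-resolved files an asset package would need. It assumes the switches
are independent and that every combination must be stored; the language
list and the switch values are hypothetical examples.

```python
from itertools import product

# Hypothetical example values; only the kinds of switch come from the text.
LANGUAGES = ["en", "fr", "de"]
SWITCHES = {
    "kind": ["caption", "subtitle"],
    "watershed": ["pre", "post"],
}


def precooked_variants(languages, switches):
    """List every fully resolved variant that would need its own file."""
    names = list(switches)
    return [
        {"lang": lang, **dict(zip(names, choice))}
        for lang in languages
        for choice in product(*(switches[name] for name in names))
    ]


variants = precooked_variants(LANGUAGES, SWITCHES)
print(len(variants))  # 3 languages x 2 kinds x 2 watershed values = 12
```

Under those assumptions each additional two-way switch doubles the
number of stored files, whereas a single document carrying the context
would grow only by the tagged content.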

		Conditional content could be implemented using text
context and associated styling.

		 

		It should be noted that CEA-608/708, and WST (and in
fact TV subtitling formats in general) are typically not stored in these
wire formats by broadcasters, rather these wire distribution formats are
created in real-time by insertion equipment working from proprietary
file formats. A single common file format already exists as a ratified
interchange standard, EBU 3264. DFXP could replace the use of EBU 3264 -
it offers a few advantages: a) it is Unicode, b) it is XML and c) it
has a more comprehensive language tagging mechanism. However, DFXP does
not offer any significant new features over EBU 3264, and indeed there
are features in EBU 3264 that are not present in DFXP (e.g. cumulative
mode and boxing).
		
		[GA] I'm not sure what you mean by "cumulative mode" or
"boxing", so I can't say whether these are supported in DFXP or not.

		[JB]Cumulative mode rests upon the concept of a 'cursor
position' - such that subsequent text can be appended to text already in
view. DFXP can emulate the output of a cumulative subtitle file, but
does not necessarily capture the fact that fragments of text form a
complete subtitle (except indirectly by virtue of the fact that they
share a common end time). Conversion between a cumulative mode subtitle
file, and a non cumulative mode file represented by DFXP is thus made
more difficult - since the grouping of fragments is lost. You could
adopt a 'convention' where a <p> element always contains a complete
subtitle - but this is then mixing two concepts together, reducing the
usefulness of the <p> element. This is because conversion between
presentations that allow different numbers of displayed lines and
characters requires a distinction between logical text boundaries
(paragraphs) and the arbitrary boundaries imposed on the text by the
limitations of the subtitles mechanism. So conversion between 2-row line
21 captions and 3-row Teletext captions should use <p> as a logical
division in the text when reformatting 2-row subtitles into a 3-row
format.
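
The indirect recovery mentioned above, regrouping fragments by their
common end time, can be sketched in Python as follows. The fragment
tuples and the regroup_by_end_time helper are hypothetical; a real
converter would read the timings from the file.

```python
from itertools import groupby

# Hypothetical flattened fragments: (begin, end, text), times in seconds.
FRAGMENTS = [
    (0.0, 4.0, "Hello"),
    (1.5, 4.0, "there,"),
    (3.0, 4.0, "world."),
    (5.0, 7.0, "Next subtitle."),
]


def regroup_by_end_time(fragments):
    """Join fragments that share an end time into one cumulative subtitle."""
    # Sort by end time, then begin time, so groupby sees each group contiguously.
    ordered = sorted(fragments, key=lambda f: (f[1], f[0]))
    return [
        " ".join(text for _, _, text in group)
        for _, group in groupby(ordered, key=lambda f: f[1])
    ]


print(regroup_by_end_time(FRAGMENTS))
```

This is only a heuristic: two unrelated subtitles that happen to end at
the same time would be merged, which is why the grouping is better
carried explicitly in the file.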

		 

		Put another way - cumulative mode is a 'cooked' way of
pacing the display of text to the user.

		 

		Boxing is the issue of background colour only behind
glyphs, not for the whole region (see my earlier email (sent Wed
16/03/2005 17:36) regarding extending the values for the show-background
attribute).

		
		A combination of extension elements and attributes and
constrained document structuring (via a sub-profile) can probably be
used with DFXP to fully represent EBU 3264 document contents - and other
general TV broadcast related subtitling issues. Indeed, it is
anticipated that the use of DFXP as an interchange mechanism for TV
broadcast subtitling will require the development of guidelines for the
interpretation of DFXP documents by transcoders. In addition it will
probably require the development of a profile to add elements and
attributes to DFXP to carry information and features currently supported
by existing formats, (e.g. conditional content, cumulative modes,
background styles, embedded glyphs, subtitles as images (DVD, DVB,
Imitext)).
		
		The pressing need is not IMO for another interchange
format per se, rather it is for a format that preserves more of the
authorial intent (inc. understanding / meaning) such that implementing
transcoding, translation and accessibility are made easier tasks than
they are currently. My main concerns are that using DFXP will encourage
the continuation of the existing practice of 'cooked text content' -
that is text that has lost contextual meaning - and that AFXP will be
too complex and too late for most implementations.
		
		Is there a middle path for DFXP that would encourage a
more context sensitive (and accessible) role for text style? DFXP
already includes a referenced style mechanism - could that mechanism be
strengthened to provide greater support for contextual styling of text?
		
		[GA] You are asking to expand the scope of DFXP from its
express role as a useful subset for interchange among existing legacy
formats to a role approaching AFXP. In other words, you are effectively
asking the TT WG to drop its work on a subset that could serve an
immediate purpose and be a stepping stone to a more general solution. I
can't imagine the TT WG changing its course on this point, but we will
discuss your comments and respond formally with a consensus position.

		I fully understand the position of the TTWG, however, I
have strong reservations as to how effective DFXP is as a stepping stone
to AFXP - when DFXP essentially bypasses most of the 'harder' problems,
that I hope AFXP will address, and leaves no obvious placeholders for
them to fit into.

		 

		I don't want TTWG to drop the work on DFXP - far from it
- but I am uncertain as to the larger role for a format that
provides the same level of functionality as the existing legacy formats
- but includes few features that support and extend the concept of
universal content.

		 

		I would be delighted however if DFXP showed a turn away
from the markedly 'cooked' approach it has (to style in particular). 

		In short, IMO DFXP is currently far too
presentation-centric.

		DFXP would IMO be considerably more useful if it
explicitly provided more support for 'soft' styling of the text content
(and promoted the concept).

		 

		I believe that DFXP will be adequate to interchange the
current web based formats, and with some tweaks (by profile or
convention or both) will be able to interchange TV broadcast subtitle
files. In that respect DFXP has met its goals.

		 

		Finally - most of these concepts that I am alluding to
are not present in any existing legacy formats, I wish that they were.
Subtitle file formats typically are cooked - the text smashed into
arbitrary units (subtitles) with hard styling applied. It is my
frustration at dealing with the conversion of these files between
systems/formats that has prompted my 'crusade' for more abstraction
within DFXP/AFXP. I guess I am just disappointed that DFXP is unlikely
to make these issues any easier and concerned that standard
(non-profiled) DFXP will perpetuate the problem by becoming adopted as
yet another cooked format.

		
		best regards
		John Birch.
Received on Tuesday, 5 April 2005 10:11:34 UTC