More comments on updated timed-text document from Chris Lilley on 2002-03-13 (www-tt-tf@w3.org from March 2002)

From: Chris Lilley <chris@w3.org>
Date: Wed, 13 Mar 2002 14:50:04 +0100
To: geoff freed <geoff_freed@wgbh.org>
CC: www-tt-tf@w3.org, w3c-wai-pf@w3.org
Message-ID: <11145542468.20020313145004@w3.org>

On Monday, 11 March, 2002, 16:45:23, geoff wrote:

gf> The timed-text requirements document has been updated; the new
gf> version is available at

gf> http://www.w3.org/AudioVideo/timetext.html

The grouping in terms of Display, Timing, and Architecture is a good
one but should Architecture not be presented first?

Also, points I.2 and I.3 (see my previous message) belong more clearly
under architecture than Display.

I.4 is open to misinterpretation as there are special Unicode
characters that function as bidi overrides. I suspect that this point
is not calling for their use (markup, instead, should be used) but
rather, saying that the display of bidirectional text must be
supported. For a single level of embedding, such as some rtl
characters in the middle of a string of ltr characters, that is
accomplished merely by the existence of such characters. Explicit
markup is only required to support multiple levels of embedding (ltr
string nested inside rtl string nested inside ltr string). Is that
correct?
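
To illustrate the distinction drawn above, here is a minimal Python
sketch using the standard unicodedata module. It shows that ordinary
characters carry their own bidirectional class, while the override
characters are separate control characters. The sample string is an
assumption chosen for illustration; it is not taken from the
requirements document.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Hypothetical sample: two Hebrew letters inside a run of Latin text.
sample = "abc \u05d0\u05d1 def"
classes = bidi_classes(sample)

# Latin letters are strongly left-to-right, Hebrew letters strongly
# right-to-left; no markup or control character is involved.
print(classes)

# The explicit override characters are distinct control characters.
print(unicodedata.bidirectional("\u202d"))  # LEFT-TO-RIGHT OVERRIDE
print(unicodedata.bidirectional("\u202e"))  # RIGHT-TO-LEFT OVERRIDE
```

This only inspects character properties; it does not reorder text for
display, which is the job of the rendering engine.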

I.6 expresses both a requirement and a solution, but presents it as an
example, implying that there are a multitude of possible ways to do the
same thing. That's good in a requirements document, but might be
misread as 'there will be multiple ways/no defined way in the ensuing
specification'.

Thus, if it is already agreed that the format will use XML, I suggest
rewording as "Allow the language of the text to be identified and text
in different languages to be appropriately styled, using xml:lang".
This is because xml:lang is already defined in the XML 1.0
specification and is well understood; and because having a single way
to identify language increases interoperability, and because CSS has
specific selectors for the language of text and thus, for example,
color coding or font style changes or whatever can easily be used to
denote language in the presentation.

This is another requirement that might be better moved to
Architecture. In that case, it could be split - this part in
Architecture:
"Allow the language of the text to be identified using xml:lang"
this in Display:
"Allow text in different languages to be appropriately styled"
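
As a rough sketch of what identifying language with xml:lang could
look like to a processing tool, the following Python fragment walks a
small document and reports the language in effect for each run of
text. The element names (tt, caption, span) and the sample content are
hypothetical, made up for this illustration; only xml:lang and its
inheritance by child elements come from XML 1.0.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical timed-text fragment; element names are assumptions.
DOC = """<tt xml:lang="en">
  <caption>He said <span xml:lang="fr">bonjour</span> and left.</caption>
</tt>"""

def text_runs(elem, inherited=None):
    """Yield (language, text) pairs, inheriting xml:lang from ancestors."""
    lang = elem.get(XML_LANG, inherited)
    if elem.text and elem.text.strip():
        yield lang, elem.text.strip()
    for child in elem:
        yield from text_runs(child, lang)
        # Text following a child belongs to the parent's language.
        if child.tail and child.tail.strip():
            yield lang, child.tail.strip()

runs = list(text_runs(ET.fromstring(DOC)))
for lang, text in runs:
    print(lang, text)
```

A styling layer could then key off the reported language, much as a
CSS language selector would.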

I.10 can be read in several ways.

a) small graphics can be mixed in with the text, the author having
complete control over the content and form of the graphic and the set
being open ended.

b) a defined set of graphical symbols can be mixed in with the text.
The appearance of these symbols is defined precisely in the
specification.

c) a defined set of graphical symbols can be mixed in with the text.
The general appearance of these symbols is defined in the
specification, the exact appearance being implementation dependent.

d) a small, defined set of characters not in Unicode will be
supported; their appearance depends on the fonts used to display them
which must be provided on a case by case basis.

e) a small, defined set of characters not in Unicode will be supported
by defining positions in the private use area; their appearance depends
on the fonts used to display them.

It was not clear to me what I.11 means - does this refer to user
preferences as expressed, for example, through a CSS user stylesheet?

I understand what point II.2 is getting at, but speaking of 'erasure'
is problematic unless it is assumed that all text will be displayed
against its own rectangular, opaque background. Rather than thinking
of text overwriting other text, I suggest thinking more along the SMIL
model that elements have a duration. Between their start and end
times, they are visible. Thus, having an old caption disappear and no
new caption displayed simply arises as a natural consequence of the
architecture. This would help simplify and define both II.1 and II.2.
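
The duration-based view can be sketched in a few lines of Python. This
is only an illustration of the idea, not SMIL itself: the Cue class,
the visible_at function, and the sample timings are all assumptions
made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Cue:
    begin: float  # seconds from the start of the timeline
    end: float    # exclusive end time, in seconds
    text: str

def visible_at(cues, t):
    """Return the text of every cue whose interval [begin, end) contains t."""
    return [c.text for c in cues if c.begin <= t < c.end]

# Hypothetical captions with a gap between 2.5 s and 4.0 s.
cues = [
    Cue(0.0, 2.5, "Hello."),
    Cue(4.0, 6.0, "Is anyone there?"),
]

print(visible_at(cues, 1.0))  # first caption is active
print(visible_at(cues, 3.0))  # nothing is active, so nothing is shown
```

No 'erase' operation is needed: in the gap, no element is active, so
nothing is displayed. Evaluating the same function at an arbitrary
time is also all that seeking into the timeline requires.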

II.3 and II.4 could be read as contradicting one another on first
reading. I suspect that II.3 means that in the markup, the text and
its associated timing should be closely associated so that content can
be easily re-purposed and easily understood. I suspect that II.4 means
that in the specification, the markup for text and the markup for
timing will be defined in separate modules, for example to allow a
non-timed text format or a timed-something-else format to be developed
by others, re-using these modules.

Point III.6 seems to call for indexability, in other words to be able
to start a timed presentation at any point in the timeline (seek to a
point in the timeline; start subtitles half way through a film). I
agree, and note that SMIL can do this.

III.7 seems to call for multiple language text only one branch of
which is displayed (such as the user's preferred language) and III.8 for
multiple language text all of which is presented, such as a quotation
in one language in the middle of text of a different language. I agree
with both requirements, but III.8 seems to follow as a natural
consequence of I.2, I.3 and I.6. This might be more apparent if, as
suggested, those three are moved to section III Architecture.

In II.9, assuming this will not be an SGML format, the HTML a element
will not be directly possible. I suggest reference be made to the XML
possibilities only - the XHTML a element, the SVG a element, generic
XLink attributes on a timed-text element, etc.

It's not clear to me what III.10 would mean in practice.

II.13 suggests two methods, one of which is purely presentational, the
other of which is more 'semantic' in that it creates named actors or
roles. The latter seems to me to be far preferable. It gives more
flexibility in styling and user choice (see I.11), enhances re-use, and
aids requirements such as II.14 (by having something clear in the
markup which tools such as XSL-T can use to create derived forms).
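
To make the benefit of named roles concrete, here is a small Python
sketch that derives a per-speaker transcript from role-tagged markup.
The caption element, the role attribute, and the sample dialogue are
hypothetical, invented for this example; an XSL-T stylesheet could do
the equivalent selection.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: each caption names the role that speaks it.
DOC = """<tt>
  <caption role="narrator">Long ago, in a small town...</caption>
  <caption role="anna">Who is there?</caption>
  <caption role="narrator">Nobody answered.</caption>
</tt>"""

def lines_for(root, role):
    """Collect the caption text spoken by the given role, in order."""
    return [c.text for c in root.iter("caption") if c.get("role") == role]

root = ET.fromstring(DOC)
narrator_lines = lines_for(root, "narrator")
print(narrator_lines)
```

With purely presentational markup (for example, a color per speaker),
the same derivation would have to guess roles from styling.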

In I.16, of the three W3C Recommendations listed as possibilities for
"complex font displays", only SVG provides this capability. MathML
describes equations, but does not provide any font display mechanism
and XHTML does not provide any display mechanism at all. This
requirement could do with clarification as to what it actually means:

a) Display of equations must be supported

b) There is a need for fonts with complex glyphs such as mathematical
symbols, multilingual characters, closed-captioning specific symbols,
etc.

c) something else that I didn't understand ....

II.19 is a good requirement. I suggest adding other W3C
Recommendations that will be used as a basis, such as XML 1.0, CSS2,
SVG 1.0 and so on (although, depending on the timescale envisaged, SVG
1.1 might be more appropriate particularly given the recent 3GPP
decisions to make SVG Tiny a mandatory part of MMS, SVG Basic an
optional part, to use SMIL Basic, etc).

As already noted by Dave Singer, IV.1 contradicts I.9 and perhaps I.12
and I.14. In addition, given III.19 it seems like an unnecessary
restriction, as SMIL already gives a means to give motion to text, and
SVG uses such facility extensively.

A very good requirements document, Geoff; it was a pleasure to review it.

Question - what is the visibility of this document, is it member-only
or public?

--
Chris mailto:chris@w3.org

Received on Wednesday, 13 March 2002 08:51:51 UTC