Re: HTML 5, SMIL, Video from Silvia Pfeiffer on 2010-02-21 (public-html-a11y@w3.org from February 2010)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Sun, 21 Feb 2010 20:53:23 +1100
To: Dick Bulterman <Dick.Bulterman@cwi.nl>
Cc: John Foliot <jfoliot@stanford.edu>, Geoff Freed <geoff_freed@wgbh.org>, public-html-a11y@w3.org, markku.hakkinen@gmail.com, symm@w3.org
Message-ID: <2c0e02831002210153t23e6fc00j3a6e1a075ea6a640@mail.gmail.com>

Hi Dick,

Seems like I owe you a technical reply on your list of objections.

On Sat, Feb 20, 2010 at 7:27 PM, Dick Bulterman <Dick.Bulterman@cwi.nl> wrote:
> Here is my summary of the advantages in this example for smilText
> a) it is an XML format

Any external file format would need to be parsed and mapped into HTML
presentation. Being in XML is neither an advantage nor a disadvantage.

> b) it can be plugged directly into HTML with no new syntax

Support for the new tags still needs implementation in browsers. Also,
many of the elements would be better represented by existing HTML
markup. This is what cue ranges attempted to do and they will need to
be revived in one form or another. But they are more generic than just
text, so no advantage here either.

> c) it has an architecture that allows structuring, styling, motion and other
> forms of manipulation (roles, etc) to be added easily

So does DFXP. It is good that SRT doesn't have any of it, so existing
HTML markup can be used to do such manipulation. Such structure,
styling, motion and other forms of manipulation are necessary for
stand-alone video players, but not for Web browsers, where such
functionality is already provided. There is no intention to extend SRT to have
such features.

> d) it is less dense than SRT (in this example, more than 10%)

Shorter files are better than longer ones. That particular SRT
example had 1000 times the precision of the smilText example and was
still 10% shorter. I'd say that is an advantage.
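
To make the precision point concrete: SRT timestamps have the form
HH:MM:SS,mmm, so each one resolves to a millisecond. The following is
only a minimal Python sketch of reading such a timestamp; the function
name srt_time_to_ms is made up for illustration and is not part of any
specification or library.

```python
import re

# SRT timestamps look like "00:01:02,345" (hours:minutes:seconds,milliseconds).
_SRT_TIME = re.compile(r"^(\d{2,}):(\d{2}):(\d{2}),(\d{3})$")


def srt_time_to_ms(stamp: str) -> int:
    """Convert one SRT timestamp to an integer number of milliseconds.

    Hypothetical helper for illustration only.
    """
    match = _SRT_TIME.match(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
```

For example, srt_time_to_ms("00:00:01,500") gives 1500.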

> e) it is less prone to error in hand-editing

That is not a fact, but an opinion. I would claim the opposite, which
is why I didn't want to get into this argument.

> f) because the ordering of content can be realive, the contents can
> be generated on-the-fly, with inserts possible

I assume you meant "relative". Since any format that is being written
live (on-the-fly) will need to queue future events and then write them
out as they are being entered, a sequence of srt annotations could
also be written in this way. Insertions are a little harder since in
theory you have to renumber all future captions. But that's not a hard
thing to do with a machine.
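
As a rough illustration of the renumbering step, here is a minimal
Python sketch. It assumes captions are already parsed into simple
records; the Cue class and insert_cue function are hypothetical names
and not part of any SRT tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    index: int     # SRT sequence number, starting at 1
    start_ms: int  # start time in milliseconds
    end_ms: int    # end time in milliseconds
    text: str


def insert_cue(cues: List[Cue], start_ms: int, end_ms: int, text: str) -> List[Cue]:
    """Return a new list with the cue inserted by start time and all cues renumbered."""
    # The new cue gets a placeholder index; every index is reassigned below.
    merged = sorted(cues + [Cue(0, start_ms, end_ms, text)], key=lambda c: c.start_ms)
    return [Cue(i, c.start_ms, c.end_ms, c.text) for i, c in enumerate(merged, start=1)]
```

Inserting a caption between two existing ones simply shifts the
sequence numbers of everything after it by one.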

> g) the 'begin' attribute can also be used for event-based scheduling
> of text objects; since these also can have ID's, it can also hurl events to
> an outside context -- this has the nice ability for users to control the
> flow and tempo of content delivery. (A longer term benefit.)

It is possible to raise an event in HTML upon reaching every caption's
start time with SRT in the same way. However, thus far, the discussion
about raising of events related to captions has not gone very far. It
may well be that this discussion returns if somebody comes back with a
requirement. Right now, raising events related to media elements is
something that is being discussed wrt cue ranges.
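
To sketch the idea outside of any browser: given caption start times
taken from an SRT file, a player can call a handler whenever playback
crosses a start time. This is plain Python with hypothetical names
(fire_cue_starts, on_start); it is not an HTML or cue range API, only
an illustration of the mechanism.

```python
from typing import Callable, List, Tuple

# (start time in milliseconds, caption text), e.g. parsed from an SRT file
CueStart = Tuple[int, str]


def fire_cue_starts(
    cues: List[CueStart],
    prev_ms: int,
    now_ms: int,
    on_start: Callable[[int, str], None],
) -> int:
    """Call on_start for every cue whose start time lies in (prev_ms, now_ms].

    Returns the number of cues fired. Pass prev_ms=-1 on the first call
    so that a cue starting at 0 is not missed.
    """
    fired = 0
    for start_ms, text in cues:
        if prev_ms < start_ms <= now_ms:
            on_start(start_ms, text)
            fired += 1
    return fired
```

A player would call this on each time update with the previous and
current playback positions.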

OK, the discussion has thus far focused on SRT. Questioning SRT will
not bring SmilText into the spec.

I am still convinced that the combination of SRT and DFXP is a good
one and that SmilText is equivalent to a subpart of DFXP and therefore
not necessary. This is my opinion, which I have formed from several
discussions over the last year, but I am sure I'm not the only one
with an opinion here.

It would be better if others speak up, too, so an informed decision can be made.

Best Regards,
Silvia.

Received on Sunday, 21 February 2010 09:54:17 UTC