Re: [media] Addressing "Captioning" feedback on requirements document

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Sat, 24 Jul 2010 16:38:20 +1000
Message-ID: <AANLkTimrNDgdLghhMzREfW-t_pkwswJg08ofk4onu=fY@mail.gmail.com>
To: HTML Accessibility Task Force <public-html-a11y@w3.org>
Note to all:

Changes to the Captions section [1]
according to feedback from the survey [2]
and discussions of that feedback between Sean and myself have been
finalised in the wiki [1].

Some notes on technical realisation have also been included, similar
to other discussions in the group on other media requirements areas.

To see the changes, go to:



Best Regards,

On Thu, Jul 1, 2010 at 12:46 AM, Silvia Pfeiffer

> Hi Sean,
> I will now only address those points where we haven't converged yet. I
> won't be able to make tomorrow's phone call, so if the discussions get
> to captions, you will have my input and things may be able to be
> resolved quickly.
> On Mon, Jun 28, 2010 at 9:18 PM, Sean Hayes <Sean.Hayes@microsoft.com>
> wrote:
> > 1. Agree (although I would rather say "video without audio but with a
> > text track is possible", as captions are a replacement for audio, this is a
> > recurring theme we need to address once in a glossary.)
> Agreed.
> > 2. CC-5 is for positioning of regions of text, e.g. to disambiguate
> > multiple speakers or to avoid some information in the underlying media.
> > Therefore the min requirement is a bounding box (with an optional
> > background) into which text is flowed, and that probably needs to be pixel
> > aligned. The absolute position of text within the region is less critical,
> > although it is important to be able to avoid bad word-breaks and have
> > adequate white space around letters and so on.
> >
> > CC-2 erasures means periods when there is no overlay information (no
> > text, and no text background).
> Ah, I must have got confused between CC-2 and CC-5. Thanks for explaining
> CC-2.
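
A minimal sketch of the CC-2 notion of erasures as explained above, assuming
cues are plain (start, end) pairs in seconds. The function name and the data
layout are hypothetical and not taken from the requirements document; the
sketch only shows how the periods with no overlay could be derived from the
cue timings.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds; assumed layout


def erasure_periods(cues: List[Interval], duration: float) -> List[Interval]:
    """Return the periods in [0, duration] that no cue covers."""
    gaps: List[Interval] = []
    cursor = 0.0  # end of the overlay coverage seen so far
    for start, end in sorted(cues):
        if start > cursor:
            # Nothing is displayed between the previous cue and this one.
            gaps.append((cursor, min(start, duration)))
        cursor = max(cursor, end)
        if cursor >= duration:
            break
    if cursor < duration:
        gaps.append((cursor, duration))
    return gaps
```

Overlapping cues are merged implicitly, so only stretches with neither text
nor text background are reported.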
> Feedback on CC-5: Agree.
> I further think that by "all parts of the screen" we mean either the
> bounding box of the video or an arbitrary bounding box provided on the
> web page into which the caption cues are rendered. I think by default
> the caption format should provide a min-width/min-height for its
> bounding box, which typically is calculated from the bottom of the
> video box, but can be placed elsewhere by the Web page, with the Web
> page being able to make that box larger and scale the text relatively,
> too. The positions inside the box should IMHO be given as regions, such
> as top, right, bottom, left, center.
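
The CC-5 layout idea in the paragraph above could be sketched roughly as
follows. This is an illustration under stated assumptions, not a proposed
format: the Box type, the function names, the horizontal centring of the
default box and the uniform scaling rule are all hypothetical choices made
for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float


def default_caption_box(video: Box, min_width: float, min_height: float) -> Box:
    """Place a min_width x min_height box along the bottom edge of the video.

    Horizontal centring is an assumption made for this sketch.
    """
    x = video.x + (video.width - min_width) / 2
    y = video.y + video.height - min_height
    return Box(x, y, min_width, min_height)


def scaled_font_size(base_size: float, default_box: Box, page_box: Box) -> float:
    """Scale the text relative to how much the page enlarged the box."""
    scale = min(page_box.width / default_box.width,
                page_box.height / default_box.height)
    return base_size * scale


# Named regions as fractional anchor points inside the caption box.
REGION_ANCHORS = {
    "top": (0.5, 0.0),
    "right": (1.0, 0.5),
    "bottom": (0.5, 1.0),
    "left": (0.0, 0.5),
    "center": (0.5, 0.5),
}


def region_anchor(box: Box, region: str) -> Tuple[float, float]:
    """Return the page coordinates of a named region's anchor point."""
    fx, fy = REGION_ANCHORS[region]
    return (box.x + fx * box.width, box.y + fy * box.height)
```

A page that wants the captions elsewhere would pass its own Box to
scaled_font_size and region_anchor instead of the default one.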
> > 4. CC-14. Paint-on can be used to change text within an existing caption
> > which is pop-on. Some examples would be a good idea tho.
> Agreed.
> > 5. CC-17. Multiple files might be used in the case where complete
> > alternative captions for hearing and subtitles for language need to be used
> > simultaneously (common in Europe and Asia). It would be possible to include
> > these in a single file, but that makes the maintenance of those resources
> > much harder. In some cases the inclusion of a few foreign words form part of
> > the original soundtrack, and thus need to be in the same caption resource.
> OK, I think this explanation is good and should replace the current
> requirement text.
> > 6. CC-20. Italics may be sufficient for a human, but it is important to be
> > able to mark up languages so that the text can be rendered correctly, since
> > the same Unicode can be shared between languages and rendered differently in
> > different contexts. This is mainly an I18n issue. It is also important for
> > audio rendering, to get pronunciation correct.
> If by markup we mean things like <ruby> and to specify what language
> it is, then yes, we need markup. But I don't think we need semantic
> markup.
> > 8. Agree, but it would be good to have a note somewhere explaining the
> > differences between strict captioning, and more general text overlays.
> OK. However, we need to also consider that there could be captions in
> different languages for a piece of video, satisfying e.g. foreign
> hearing-impaired viewers.
> > 11. I think simultaneous presentation is implied by CC-17. And necessary
> > (see above).
> I think that realistically having 2 languages present at the
> same time is the limit. But also, I still think that CC-17 is more
> about having the translations available than about how to present
> them. It may be worth making 2 requirements out of this.
> > 20. Make the role of timebase generic to the media (indeed in MPEG the
> > time base is not strictly part of the audio or the video but a separate
> > entity). Include distinction between caption and other forms of text in
> > glossary.
> Agreed - have a time keeper for the resource rather than the
> individual tracks - and it could be a separate timing track in the
> media resource anyway.
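
One way to picture the "time keeper for the resource" idea from point 20 is
a single clock object that every text track consults, instead of each track
keeping its own time. The sketch below assumes cues are (start, end, text)
tuples in seconds; the class and method names are hypothetical and do not
come from any specification.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start, end, text); assumed layout


class ResourceClock:
    """Single time keeper for the whole media resource (hypothetical)."""

    def __init__(self) -> None:
        self.now = 0.0

    def seek(self, seconds: float) -> None:
        # Clamp at zero; a resource has no negative playback position.
        self.now = max(0.0, seconds)


class TextTrack:
    """A track that holds cues but owns no time of its own."""

    def __init__(self, clock: ResourceClock, cues: List[Cue]) -> None:
        self.clock = clock
        self.cues = cues

    def active_text(self) -> List[str]:
        """Return the text of every cue active at the shared clock time."""
        t = self.clock.now
        return [text for start, end, text in self.cues if start <= t < end]
```

Because both a caption track and a subtitle track would read the same clock,
seeking once keeps them in step without any per-track synchronisation.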
> Thanks,
> Silvia.
Received on Saturday, 24 July 2010 06:39:12 UTC
