Re: [media] Addressing "Captioning" feedback on requirements document

Hi Sean,

I will only address the points where we haven't converged yet. I
won't be able to make tomorrow's phone call, so if the discussion gets
to captions, you will have my input here and things can hopefully be
resolved quickly.



On Mon, Jun 28, 2010 at 9:18 PM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> 1. Agree (although I would rather say "video without audio but with a text track is possible", as captions are a replacement for audio, this is a recurring theme we need to address once in a glossary.)

agreed.


> 2. CC-5 is for positioning of regions of text, e.g. to disambiguate multiple speakers or to avoid some information in the underlying media. Therefore the min requirement is a bounding box (with an optional background) into which text is flowed, and that probably needs to be pixel aligned. The absolute position of text within the region is less critical, although it is important to be able to avoid bad word-breaks and have adequate white space around letters and so on.
>
> CC-2 erasures means periods when there is no overlay information (no text, and no text background)

Ah, I must have got confused between CC-2 and CC-5. Thanks for explaining CC-2.

Feedback on CC-5: Agree.
I further think that by "all parts of the screen" we mean either the
bounding box of the video or an arbitrary bounding box provided on the
Web page into which the caption cues are rendered. I think by default
the caption format should provide a min-width/min-height for its
bounding box, which is typically positioned relative to the bottom of
the video box, but can be placed elsewhere by the Web page, with the
Web page also being able to make that box larger and scale the text
proportionally. The positions inside the box should IMHO be expressed
as regions, such as top, right, bottom, left, center.
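
To make the default placement concrete, here is a rough sketch in
Python. It is only an illustration of the idea above, not a proposal
for an API: the names Box, default_caption_box, scale_box and
place_text are all made up, the box is assumed to be centered
horizontally and flush with the bottom of the video, and scaling is
done about the box's top-left corner, which is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page pixels (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def default_caption_box(video: Box, min_width: float, min_height: float) -> Box:
    """Default caption box: min-width x min-height, centered horizontally
    and flush with the bottom edge of the video box (assumed default)."""
    x = video.x + (video.width - min_width) / 2
    y = video.y + video.height - min_height
    return Box(x, y, min_width, min_height)


def scale_box(box: Box, font_px: float, factor: float) -> tuple[Box, float]:
    """The Web page enlarges the box; the text size scales by the same factor."""
    scaled = Box(box.x, box.y, box.width * factor, box.height * factor)
    return scaled, font_px * factor


def place_text(box: Box, region: str, text_w: float, text_h: float) -> tuple[float, float]:
    """Top-left corner of a text block of the given size for a named region."""
    center_x = box.x + (box.width - text_w) / 2
    center_y = box.y + (box.height - text_h) / 2
    positions = {
        "top": (center_x, box.y),
        "bottom": (center_x, box.y + box.height - text_h),
        "left": (box.x, center_y),
        "right": (box.x + box.width - text_w, center_y),
        "center": (center_x, center_y),
    }
    return positions[region]


video = Box(0, 0, 640, 360)
captions = default_caption_box(video, min_width=320, min_height=60)
print(captions)
print(place_text(captions, "bottom", text_w=100, text_h=20))
```

A Web page that wants the captions elsewhere would simply supply its
own Box instead of calling default_caption_box.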


> 4. CC-14. Paint-on can be used to change text within an existing caption which is pop-on. Some examples would be a good idea tho.

Agreed.


> 5. CC-17  Multiple files might be used in the case where complete alternative captions for hearing and subtitles for language need to be used simultaneously (common in Europe and Asia). It would be possible to include these in a single file, but that makes the maintenance of those resources much harder. In some cases the inclusion of a few foreign words form part of the original soundtrack, and thus need to be in the same caption resource.

OK, I think this explanation is good and should replace the current
requirement text.


> 6. CC-20. Italics may sufficient for a human, but it is important to be able to mark up languages so that the text can be rendered correctly, since the same Unicode can be shared between languages and rendered differently in different contexts. This is mainly an I18n issue. It is also important for audio rendering, to get pronunciation correct.

If by markup we mean things like <ruby> and a way to specify what
language the text is in, then yes, we need markup. But I don't think
we need semantic markup.


> 8. Agree, but it would be good to have a note somewhere explaining the differences between strict captioning, and more general text overlays.

OK. However, we need to also consider that there could be captions in
different languages for a piece of video, satisfying e.g. foreign
hearing-impaired viewers.


> 11. I think simultaneous presentation is implied by CC17. And necessary (see above).

I think that realistically, having 2 languages present at the same
time is the limit. But also, I still think that CC17 is more about
having the translations available than about how to present them. It
may be worth making 2 requirements out of this.


> 20. Make the role of timebase generic to the media (indeed in MPEG the time base is not strictly part of the audio or the video but a separate entity). Include distinction between caption and other forms of text in glossary

Agreed - have a time keeper for the resource rather than the
individual tracks - and it could be a separate timing track in the
media resource anyway.
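
As a rough illustration of what I mean by a time keeper for the
resource, here is a small Python sketch. The names ResourceClock and
CueTrack are made up for this example and cues are assumed to be
simple (start, end, text) tuples in seconds; the point is only that
every track reads the same clock rather than keeping its own.

```python
from dataclasses import dataclass


@dataclass
class ResourceClock:
    """Single time keeper for the whole media resource (hypothetical)."""
    now: float = 0.0

    def seek(self, seconds: float) -> None:
        self.now = seconds


@dataclass
class CueTrack:
    """A text track that has no timebase of its own; it asks the shared clock."""
    clock: ResourceClock
    cues: list[tuple[float, float, str]]  # (start, end, text), end exclusive

    def active(self) -> list[str]:
        t = self.clock.now
        return [text for start, end, text in self.cues if start <= t < end]


clock = ResourceClock()
english = CueTrack(clock, [(0.0, 2.0, "Hello"), (2.0, 4.0, "Goodbye")])
german = CueTrack(clock, [(0.0, 2.0, "Hallo")])

clock.seek(1.0)
print(english.active(), german.active())
```

Seeking the one clock moves all tracks together, and a time with no
matching cue simply yields an empty list, i.e. nothing is overlaid.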


Thanks,
Silvia.

Received on Wednesday, 30 June 2010 14:47:01 UTC