Sourcing language information for media tracks

Hi,
I've been working on adding media track support in Chromium and was
trying to use the spec at
https://dev.w3.org/html5/html-sourcing-inband-tracks/ for reading
various media track properties.
I've run into a few issues (see the discussion at
https://codereview.chromium.org/1735003004/ for details):

1. The current draft says that for MPEG-4 ISOBMFF files the language
for audio/video tracks should be: "Content of the language field in
the MediaHeaderBox."
But according to various sources (e.g.
https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html,
https://www.scribd.com/doc/10911307/26/mdhd-box, etc.) the language
field in the MDHD box is a zero-padded 16-bit value that encodes a
3-letter ISO 639-2/T language code:
"3-character code specifying language (see ISO 639-2/T); each character
is interpreted as 0x60 + (5 bit) code to yield an ASCII character."
Yet according to the HTML5 spec (see
https://html.spec.whatwg.org/multipage/embedded-content.html#dom-audiotrack-language),
A/V track language attributes must return BCP-47 language tag strings,
not the 3-character ISO 639-2/T codes.
So the inband-tracks-sourcing spec should probably mention that the
language read from the MDHD.language field needs to be translated into
BCP-47.
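
For illustration, the unpacking and translation could look roughly like
this in Python. This is only a sketch under my own assumptions: the
function names are hypothetical, the mapping table is a tiny
illustrative subset of ISO 639, and returning an empty string for a
field that does not decode to three lowercase letters is a choice I
made here, not something either spec mandates.

```python
# Illustrative subset only; a real implementation needs the full
# ISO 639-2/T to ISO 639-1 table.
ISO639_2T_TO_BCP47 = {"eng": "en", "fra": "fr", "deu": "de", "jpn": "ja"}


def decode_mdhd_language(packed):
    """Unpack the 16-bit mdhd language field into a 3-letter code.

    Layout: 1 padding bit, then three 5-bit values; each character is
    0x60 + value. Returns None if a value falls outside 'a'..'z'.
    """
    chars = []
    for shift in (10, 5, 0):
        bits = (packed >> shift) & 0x1F
        if not 1 <= bits <= 26:
            return None
        chars.append(chr(0x60 + bits))
    return "".join(chars)


def mdhd_language_to_bcp47(packed):
    """Translate the packed field into a BCP-47 tag (hypothetical helper)."""
    code = decode_mdhd_language(packed)
    if code is None:
        return ""  # assumption: treat an undecodable field as "unknown"
    # BCP-47 uses the 2-letter code when one exists; otherwise the
    # 3-letter code is already a valid primary language subtag.
    return ISO639_2T_TO_BCP47.get(code, code)
```

For example, the packed value 0x15C7 unpacks to "eng", which the table
above translates to "en".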

2. Again, according to multiple sources on the internet (for example
see https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html),
there is a new way to specify language info for media tracks in MP4
files; see the 'Extended Language Tag Atom' section in the link above.
The ELNG box is optional, but when present it carries a BCP-47
language tag. So when the ELNG box is present in the .mp4 file, it
should probably be preferred over the language field of the MDHD box.
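
A rough Python sketch of that preference order follows. The function
names are hypothetical, and I am assuming the ELNG payload is a 4-byte
version/flags field followed by a NUL-terminated UTF-8 string; that
layout should be checked against the actual box definition before
relying on it.

```python
from typing import Optional


def parse_elng_payload(payload: bytes) -> str:
    """Extract the language tag from an elng box payload.

    Assumed layout: 4 bytes of version/flags, then a NUL-terminated
    UTF-8 string holding the BCP-47 tag.
    """
    tag_bytes = payload[4:].split(b"\x00", 1)[0]
    return tag_bytes.decode("utf-8")


def select_track_language(elng_payload: Optional[bytes], mdhd_tag: str) -> str:
    """Prefer the elng tag when the box is present and non-empty.

    mdhd_tag is the MDHD language already translated into BCP-47.
    """
    if elng_payload is not None:
        tag = parse_elng_payload(elng_payload)
        if tag:
            return tag
    return mdhd_tag
```

With this, a track whose ELNG box holds "en-US" reports "en-US" even if
the MDHD field only says "en", and a track without an ELNG box falls
back to the MDHD value.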

Received on Friday, 11 March 2016 07:47:48 UTC