- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Wed, 18 Sep 2013 17:41:22 +1000
- To: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Cc: public-html <public-html@w3.org>
On Wed, Sep 18, 2013 at 12:21 AM, Cyril Concolato
<cyril.concolato@telecom-paristech.fr> wrote:
> Hi Silvia,
>
> On 16/09/2013 05:26, Silvia Pfeiffer wrote:
>
>> On Thu, Sep 12, 2013 at 1:50 AM, Cyril Concolato
>> <cyril.concolato@telecom-paristech.fr> wrote:
>>
>> Can you go through all of these and make a list of the types under
>> question and where they fit into one of the semantic @kind values that
>> the HTML spec has? The list at http://mp4ra.org/codecs.html seems huge
>> and does not cover all the types you're mentioning.
>
> It's not so big once you remove Audio/Video/Hint handler types, the
> remaining stream types would be:
> - ISO stuff: Text timed metadata, XML timed metadata, URI identified
>   metadata, MPEG-4 Systems streams, SVC metadata, text streams
> - DVB stuff: Track Level Index Track, Movie level index track,
> - 3GPP/OMA: 3GPP Timed Text, OMA Keys,
> - DECE Sub-titles (Timed Text),
> - Apple 32/64 bit timecode samples

Sorry if this seems obvious to you, but which of these are covered by
TextTrack @kind values? I.e. which of these are captions / subtitles
and which are something else (i.e. "metadata")?

>> Also, a nit-pick: I am confused why WebVTT is regarded as "Textual
>> meta-data with MIME type" when it's just generally time-aligned bits
>> of data?
>
> The ISO spec is quite confusing here and maybe the MP4RA site too. There
> are 2 parameters to consider:
> - the *handler type* (3rd column in the MP4RA site) that classifies the
>   content in large categories, to inform the player about the broad
>   capabilities it needs to have to process the stream, and which can have
>   the following 4CC values (i.e. ability to process): 'soun' (sound),
>   'vide' (video), 'subt' (subtitles potentially with images), 'text'
>   (subtitles without images), 'hint' (transport protocol packets) or
>   'meta' (metadata).
> - and the stream type (or *sample entry type*, 1st column) also identified
>   by a 4CC.
>
> Unfortunately, there is some overlap in the handler types between 'subt',
> 'meta' and 'text'. I lost the battle proposing to harmonize them. So here
> are some examples of interest (using <handler type>/<stream
> type>/<additional parameters when the stream type is too generic>):
> - WebVTT is identified as 'text'/'wvtt'
> - TTML is identified as 'subt'/'stpp'
> - 3GPP Timed Text is identified as 'text'/'tx3g'
> - a generic XML metadata stream would be: 'meta'/'metx'/<namespace>
> - a generic text metadata stream would be: 'meta'/'mett'/<mime format>
>
> As for the one you mention "Textual meta-data with MIME type" it is
> identified as 'meta'/'text'/<mime format> and I can't find what it is used
> for...

Thanks for this. This should be useful to identify semantics.

>>> - then, if the couple ISO-parser/Browser is capable of producing an
>>> equivalent WebVTT representation of the text track content (of any
>>> @kind, possibly metadata) without losing information, the
>>> @inBandMetadataTrackDispatchType is left empty and the track is
>>> populated as if it was an out-of-band WebVTT track. This would be used
>>> for example when WebVTT content is carried in ISO tracks but could be
>>> used for other formats where the mapping to WebVTT is feasible/simple.
>>> Note we could add a similar text for TTML once the TTML cues are
>>> defined.
>>
>> Note the above mentioned distinction between the currently proposed
>> UnparsedCue and VTTCue - this should be taken care of here, too.
>>
>> So, first you need to check if the format in cues is natively
>> supported in the browser and use that TextTrackCue sub-interface for
>> the cues.
>> (e.g. if TTMLCue is supported in the browser, expose it as TTMLCue)
>>
>> Only if it's not supported and it's not semantically @kind=metadata,
>> suggest converting it to WebVTT.
>
> Agree.
>>
>>> - and otherwise (if a WebVTT representation cannot be generated or
>>> generated without loss),
>>>   - the TextTrack object is populated as follows:
>>>     - the @kind is set to 'metadata'
>>>     - the @label is set to the ISO 'track handler name'
>>>     - the @id is set to the ISO track id
>>>     - the @inBandMetadataTrackDispatchType contains the base64 encoded
>>>       sample entry box.
>>>   - and each sample produces a cue built as follows:
>>>     - the id attribute is empty
>>>     - the pauseOnExit attribute is set to false
>>>     - the start and end time of the cue are the start and end time of
>>>       the sample.
>>>     - the content of the cue contains the sample data. Note: the cue
>>>       content can be in .text (base64 encoded if initially binary) or
>>>       if the cue interface (TextTrackCue, VTTCue or UnParsedCue or
>>>       whatever the name) includes an ArrayBuffer, we should use that.
>>
>> That makes sense to me with UnparsedCue as the interface.
>
> Ok, I'll make sure this is integrated when the interface finally shows up.

Good. Glad to hear we're on the same page now.

Silvia.
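
P.S. In case it helps to see the agreed procedure in one place, here is a
rough TypeScript sketch of it. It is only an illustration under stated
assumptions, not a proposed implementation: every type and function name
(IsoTrack, IsoSample, Capabilities, chooseStrategy, buildMetadataTrack,
toBase64) is hypothetical, and treating handler type 'meta' as "semantically
@kind=metadata" is an assumption, since the thread leaves that classification
open. Cue content goes into a text field as base64 here; a cue interface with
an ArrayBuffer would make that encoding unnecessary.

```typescript
// Hypothetical sketch: none of these names come from a spec or a browser API.
// Only the 4CC values and the attribute assignments are taken from the thread.

interface IsoTrack {
  trackId: number;
  handlerType: string;        // 4CC: 'soun', 'vide', 'subt', 'text', 'hint' or 'meta'
  handlerName: string;        // ISO 'track handler name'
  sampleEntryType: string;    // 4CC such as 'wvtt', 'stpp', 'tx3g', 'metx', 'mett'
  sampleEntryBox: Uint8Array; // raw bytes of the sample entry box
}

interface IsoSample { start: number; end: number; data: Uint8Array; }

interface Capabilities {
  nativeCueFormats: string[]; // sample entry types with a native TextTrackCue sub-interface
  losslessToWebVTT: string[]; // sample entry types convertible to WebVTT without loss
}

type Strategy = "native-cue" | "convert-to-webvtt" | "metadata-fallback";

function chooseStrategy(track: IsoTrack, caps: Capabilities): Strategy {
  // 1. Prefer a natively supported cue interface (e.g. a future TTMLCue).
  if (caps.nativeCueFormats.indexOf(track.sampleEntryType) !== -1) return "native-cue";
  // 2. Convert to WebVTT only for non-metadata tracks with a lossless mapping.
  //    ASSUMPTION: handler type 'meta' stands in for "semantically metadata".
  const isMetadata = track.handlerType === "meta";
  if (!isMetadata && caps.losslessToWebVTT.indexOf(track.sampleEntryType) !== -1) {
    return "convert-to-webvtt";
  }
  // 3. Otherwise expose the raw samples on a @kind=metadata track.
  return "metadata-fallback";
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Plain base64 encoder, written out so the sketch has no dependencies.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2];
    out += B64[((b0 & 3) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 63] : "=";
  }
  return out;
}

interface CueInit {
  id: string; pauseOnExit: boolean; startTime: number; endTime: number; text: string;
}

interface TrackInit {
  kind: string; label: string; id: string;
  inBandMetadataTrackDispatchType: string; cues: CueInit[];
}

// Fallback case: populate the track and one cue per sample as listed in the thread.
function buildMetadataTrack(track: IsoTrack, samples: IsoSample[]): TrackInit {
  return {
    kind: "metadata",
    label: track.handlerName,
    id: String(track.trackId),
    inBandMetadataTrackDispatchType: toBase64(track.sampleEntryBox),
    cues: samples.map((s) => ({
      id: "",                 // the id attribute is empty
      pauseOnExit: false,
      startTime: s.start,
      endTime: s.end,
      text: toBase64(s.data), // base64 because the sample may be binary
    })),
  };
}

// Usage: a TTML track in a browser with neither a TTML cue type nor a lossless converter.
const ttml: IsoTrack = {
  trackId: 3, handlerType: "subt", handlerName: "TTML subtitles",
  sampleEntryType: "stpp", sampleEntryBox: new Uint8Array([1, 2, 3]),
};
const caps: Capabilities = { nativeCueFormats: ["wvtt"], losslessToWebVTT: ["tx3g"] };
if (chooseStrategy(ttml, caps) === "metadata-fallback") {
  const init = buildMetadataTrack(ttml, [{ start: 0, end: 2, data: new Uint8Array([77, 97, 110]) }]);
  console.log(init.kind, init.id, init.cues.length);
}
```

The order of the checks follows the exchange above: native cue interface
first, WebVTT conversion second and only for non-metadata content, raw
metadata exposure last.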
Received on Wednesday, 18 September 2013 07:42:10 UTC