Re: Liaison response - template on MIME type parameter for TimedText from Glenn Adams on 2014-05-11 (public-tt@w3.org from May 2014)

From: Glenn Adams <glenn@skynav.com>
Date: Sun, 11 May 2014 12:30:38 -0600
To: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
Cc: TTWG <public-tt@w3.org>
Message-ID: <CACQ=j+fQUyweJZPTdsOSz+7ZgEV-H6BLKGYr2BH--fmaGGDMiQ@mail.gmail.com>
On Mon, May 5, 2014 at 5:42 AM, Cyril Concolato <
cyril.concolato@telecom-paristech.fr> wrote:

> Hi all,
>
> Some points worth highlighting/repeating:
>
> - Aside from the MP4 problem, the problem is general. Consider a DASH MPD
> pointing to a TTML file (not packaged in a MP4). How can a player know that
> it'll be able to play it meaningfully without downloading the entire TTML
> file?


TTML resources should carry a declaration that specifies the processor
profile required to process the document, either as a ttp:profile
attribute on the start tag of the root (<tt>) element or as a ttp:profile
element child of the tt:head element. This information is designed to
appear near the beginning of the TTML resource so that a pre-scan of, say,
the first kilobyte could use it to make a determination without having to
read the entire resource. This is rather akin to the use of doctype or
<meta charset=...> in HTML5.
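
To make the pre-scan idea concrete, here is a minimal sketch in Python. It
is only an illustration, not something taken from the TTML specification:
the function name prescan_profile, the 1024-byte limit, and the assumption
that the parameter namespace is bound to the conventional "ttp" prefix are
all mine. It handles only the attribute form; the ttp:profile element form
would need a real XML parse of tt:head.

```python
import re

# Assumption: the document binds the TTML parameter namespace to the
# conventional "ttp" prefix. A robust scanner would resolve the prefix.
PROFILE_ATTR = re.compile(rb"ttp:profile\s*=\s*([\"'])(.*?)\1")


def prescan_profile(data: bytes, limit: int = 1024):
    """Return the ttp:profile attribute value found in the first
    `limit` bytes of `data`, or None if no such attribute is seen."""
    head = data[:limit]
    match = PROFILE_ATTR.search(head)
    if match is None:
        return None
    return match.group(2).decode("utf-8")
```

A player could call prescan_profile on the first chunk it receives and fall
back to the default rules when the result is None.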

If this information is not contained in the TTML resource as described
above, then a default processor profile is specified or implied by the
Document Interchange Context, and if that context is unknown or if it
doesn't specify or imply a processor profile, then the DFXP Transformation
profile applies as a default.

In all cases, a processor profile is assigned to the resource by the above
rules.

For further details, see [1].

[1] http://www.w3.org/TR/2013/REC-ttml1-20130924/#vocabulary-profiles
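
The fallback order described above can be sketched as follows. This is a
simplified illustration under my own assumptions: the function and
parameter names are hypothetical, profiles are represented as plain
strings, and the default designator is written the way it appears later in
this thread.

```python
from typing import Optional

# Designator as written later in this thread; treat it as illustrative.
DFXP_TRANSFORMATION = "http://www.w3.org/ns/ttml/dfxp-transformation"


def resolve_processor_profile(
    document_profile: Optional[str],
    context_profile: Optional[str],
) -> str:
    """Pick the processor profile for a TTML resource.

    document_profile: value declared in the resource itself (ttp:profile
        attribute or element), or None if absent.
    context_profile: profile specified or implied by the Document
        Interchange Context, or None if the context is unknown or silent.
    """
    if document_profile is not None:
        return document_profile
    if context_profile is not None:
        return context_profile
    # Neither source names a profile: the default applies.
    return DFXP_TRANSFORMATION
```

The point of the sketch is that the function always returns a profile, so a
processor never has to handle an "unknown profile" case.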


> Does it actually need to know or will it always be able to do something
> with a TTML file?


Every conforming TTML processor is required to support the Generic
Processor Conformance [2] requirements, and, depending on whether it claims
to be a transformation processor or a presentation processor, is required
to support the Transformation Processor Conformance [3] and/or Presentation
Processor Conformance [4] requirements.

[2]
http://www.w3.org/TR/2013/REC-ttml1-20130924/#conformance-generic-processor
[3]
http://www.w3.org/TR/2013/REC-ttml1-20130924/#conformance-transformation-processor
[4]
http://www.w3.org/TR/2013/REC-ttml1-20130924/#conformance-presentation-processor

At a minimum, these requirements entail that the processor abort (not
further process) a TTML resource if (1) the referenced processor profile
specifies that some feature or extension is 'required' AND (2) the
processor does not support the required feature or extension AND (3) no
user-defined or system-defined abort override applies.
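
As a rough sketch, the three-part abort condition above reduces to a small
predicate. The name must_abort and the use of plain sets of designator
strings are my assumptions for illustration; a real processor would derive
the required set from the resolved profile.

```python
def must_abort(required: set, supported: set,
               abort_override: bool = False) -> bool:
    """Return True when processing of the TTML resource must stop.

    required: features/extensions the processor profile marks 'required'.
    supported: features/extensions this processor implements.
    abort_override: True if a user-defined or system-defined override
        of the abort behavior applies.
    """
    missing = required - supported  # required but unsupported
    return bool(missing) and not abort_override
```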


> The solution to this problem will provide the basis for the solution to
> the MP4 problem. Has the TTML WG considered defining a 'codecs' or
> 'profile' MIME parameter for TTML?
>
> - the codecs parameter of MP4 files is useful for identifying the content
> of each track in the file before the file is fully downloaded. In Adaptive
> Streaming contexts such as DASH, the initialization segment can also be
> downloaded in a first step to get information. So the codecs parameter does
> not have to provide all information. It could just indicate that the track
> contains some flavor of TTML and let the initialization segment provide
> more details. In that case, the solution could be as simple as:
> codecs="stpp.ttml"
>

I am personally very reluctant to attempt to use the codecs parameter to
make fine-grained decisions about whether a TTML document can be processed.
For example, this won't work in general for documents that define the
processor profile inline (i.e., via a ttp:profile element child of tt:head).

That is, I would prefer using the simple expression you specify above:
codecs="stpp.ttml". Any further determination about processability should
use the existing TTML profile mechanism described above.
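
To illustrate the coarse check I have in mind, here is a sketch that only
asks "is there a TTML track?" and leaves everything else to the profile
mechanism. Note that "stpp.ttml" is the form proposed in this thread, not a
registered syntax, and the function names and the case-insensitive match on
the second element are my assumptions.

```python
def is_ttml_track(codec_entry: str) -> bool:
    """True if one codecs entry looks like 'stpp.ttml' with optional
    further dot-separated elements (hypothetical syntax from this thread)."""
    parts = codec_entry.strip().split(".")
    return (
        len(parts) >= 2
        and parts[0] == "stpp"           # 4CC of the XML subtitle sample entry
        and parts[1].lower() == "ttml"   # assumed case-insensitive
    )


def ttml_tracks(codecs: str) -> list:
    """Return the entries of a comma-separated codecs value that
    identify TTML tracks; one entry corresponds to one track."""
    return [entry.strip() for entry in codecs.split(",")
            if is_ttml_track(entry)]
```

A player would use a non-empty result only as a hint that TTML is present,
then inspect the document (or initialization segment) for the profile.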


>
> - If more than TTML is needed, remember that a typical workflow for
> packaging/dashing is: get one (or more) TTML file, package it into an
> (existing) MP4 file, produce an MPD from that MP4 file. Only in the last
> step is the 'codecs' parameter generated. In MSE cases, you don't even need
> the MPD but you need to get (most likely in JS) the codecs parameter to
> create the source buffer. Ideally, the TTML 'codecs' string generator
> should not have to look at the content of the TTML document, only at the
> track sample entry (similar to AVC codecs generation).
>

Agreed.


>
> Regards,
> Cyril
>
> 03/05/2014 20:01, David Singer a écrit :
>
>  On May 3, 2014, at 9:06 , Glenn Adams <glenn@skynav.com> wrote:
>>
>>
>>> On Fri, May 2, 2014 at 12:23 PM, David Singer <singer@apple.com> wrote:
>>>
>>> On Apr 30, 2014, at 16:40 , Michael Dolan <mdolan@newtbt.com> wrote:
>>>
>>>  Nigel, Dave and all-
>>>>
>>>> Is there a TTWG Proposal?
>>>>
>>>> Would “stpp” be registered somewhere where it would be unambiguous from
>>>> other Codecs strings unrelated to TTML?  I don’t mean the sample entry 4C
>>>> code, but in the Codecs string namespace.  Wouldn’t it have to be
>>>> “application/mp4+stpp….”?  Or why not use “application/ttml+xml” (and
>>>> similar for WebVTT)?
>>>>
>>> The codecs string starts with the 4CC of the sample entry.  After that,
>>> whoever defined that 4CC gets to say what’s the next element.  And they get
>>> to say what’s after that, and so on.
>>>
>>> stpp says “some sort of XML”,
>>>
>>> since VTT isn't XML, then would it use something different from stpp?
>>>
>> yes, we have different 4CCs for text-based and XML-based formats.  stpp
>> is for XML-based
>>
>>    and is owned by MPEG.  So indeed the step that goes from ‘stpp’ to
>>> ‘some sort of TTML’ is owned by MPEG, and MPEG still needs to resolve this.
>>>
>>> The thrust is that IF MPEG solves that, THEN there will be names to
>>> identify TTML dialects, that could go next.  So you would see
>>>
>>> codecs=stpp.<some MPEG magic to say it is TTML
>>> happens>.TTMLFULL+IMSC+SDPUS+EBUTT
>>>
>>> I would expect something more simple, e.g.
>>>
>>> stpp.vtt
>>> stpp.ttml.tt1f
>>> stpp.ttml.tt1p
>>> stpp.ttml.tt1t
>>> stpp.ttml.sdpu
>>> stpp.ttml.st10
>>> stpp.ttml.st13
>>>
>> indeed, MPEG is likely to say that TTML means “go to the W3C, thou
>> sluggard, and be wise”
>>
>>  where we have a registry for mapping the third IDs above to TTML profile
>>> designators, e.g.
>>>
>>> tt1f -> http://www.w3.org/ns/ttml/dfxp-full
>>> tt1p -> http://www.w3.org/ns/ttml/dfxp-presentation
>>> tt1t -> http://www.w3.org/ns/ttml/dfxp-transformation
>>> tt1u -> http://www.w3.org/ns/ttml/sdp-us
>>> s10f -> http://www.smpte-ra.org/schemas/2052-1/2010/profiles/
>>> smpte-tt-full
>>> s13f -> http://www.smpte-ra.org/schemas/2052-1/2013/profiles/
>>> smpte-tt-full
>>>
>>> we definitely do not want to create a new syntax/language for use in
>>> codecs that describes some way to combine profiles; that function is
>>> already defined by the ttml profile definition document syntax
>>>
>> we’re not, but we have to indicate somehow that a document is compatible
>> with more than one profile.  we also are not restricted here to a
>> 4-character-name, so we can use slightly longer names if they are mnemonic.
>>
>> we can’t split into several entries as you suggest;  there is one entry
>> per track, those separated by commas.
>>
>> codecs=stpp.ttml.tt1f,stpp.ttml.tt1p
>>
>> means two tracks, whereas
>>
>> codecs=stpp.ttml.tt1f+tt1p
>>
>> means one track compatible with two TTML profiles.  Big difference. (I am
>> not wedded to plus, but comma and period are both taken already)
>>
>>
>>> for example.
>>>
>>> For the TTWG to say “yes, we’ll take on dialect naming and forming that
>>> second-level parameter” is important; it then means that if MPEG finds a
>>> clean solution to the first level, the actual problem in hand is solved.
>>>  I’d like the MP4 people to realize before the July meeting that this is
>>> urgent, and come up with ideas and maybe online discussion ASAP.
>>>
>>> This is all provisional — on the TTWG getting agreement not only
>>> internally, but with the partners; and on us all liking the final result,
>>> of course.
>>>
>>> Makes sense?
>>>
>>>  I understand how one could signal profiles of TTML that a document
>>>> conformed to concurrently, as in the example – all of TTML and EBU-TT.  But
>>>> the signaling requirements go beyond that – there are often multiple
>>>> namespaces in use in one document that are not, as an aggregate, a single
>>>> “profile”. So, these must be explicitly signaled as well since nearly all
>>>> profiles permit foreign namespaces.  To accommodate this, the “short names”
>>>> have to be defined as “profiles of namespaces” I think.
>>>>
>>>> For example, if a document uses the CFF-TT text profile of the TTML
>>>> Full profile, plus SMPTE-TT #608 (US captions), plus CFF-TT metadata, and
>>>> it was compatible with IMSC, SDP-US and EBU-TT, then it might look like:
>>>> Codecs=xxxxx.TTMLFULL+IMSC+SDPUS+EBUTT+CFFT+CFFM+SMPTE608.
>>>>
>>>> Regards,
>>>>                  Mike
>>>>
>>>> p.s. I would give the SC29 Secretary a hint about the target of the
>>>> liaison (MPEG v JPEG).  And, you understand you will not receive a reply
>>>> until mid-to-late July, right?
>>>>
>>>>
>>>> From: Nigel Megitt [mailto:nigel.megitt@bbc.co.uk]
>>>> Sent: Wednesday, April 30, 2014 11:48 AM
>>>> To: watanabe@itscj.ipsj.or.jp
>>>> Cc: Timed Text Working Group
>>>> Subject: Liaison response - template on MIME type parameter for
>>>> TimedText
>>>>
>>>> Dear Mr. Watanabe,
>>>>
>>>> Thank you for your liaison N14444 of April 2014.
>>>>
>>>> We think that we can indeed find a solution together.  We are looking
>>>> into creating a table of formal "short names" for the profiles of W3C TTML
>>>> and
>>>> the profiles of formats derived from it (such as SMPTE-TT, EBU-TT, and
>>>> so on).  If MPEG were to propose how to step from the four-character-code of
>>>> the sample entry (XMLSubtitleSampleEntry and XMLMetaDataSampleEntry) to
>>>> something that identifies "a document compatible with one or more profiles
>>>> of TTML", then we could propose a string composed of a set of one or more
>>>> of these short names as the next parameter.
>>>>
>>>> For example, say W3C defines two profile short names "W3CTTML" and
>>>> "EBUTTML", and MPEG defines the name "TTML" as referring to the overall
>>>> family, one might see
>>>>
>>>> codecs=stpp.TTML.W3CTTML+EBUTTML,avc1
>>>>
>>>> as a codecs string of a file carrying AVC (H.264) and TTML subtitles
>>>> that
>>>> are additionally EBU-TT conformant.
>>>>
>>>> We would check with those deriving from TTML (e.g. at SMPTE, EBU and
>>>> DECE) if this approach and design are acceptable, before we formalise this.
>>>>
>>>> Kind regards,
>>>>
>>>> Nigel Megitt, David Singer (chairs, Timed Text Working Group, W3C)
>>>>
>>>> --
>>>>
>>>> Nigel Megitt
>>>> Lead Technologist, BBC Technology, Distribution & Archives
>>>> Telephone: +44 (0)3030807996
>>>> Internal (Lync): 0807996
>>>> BC4 A3 Broadcast Centre, Media Village, 201 Wood Lane, London W12 7TP
>>>>
>>>>
>>>>
>>>> ----------------------------
>>>>
>>>> http://www.bbc.co.uk
>>>> This e-mail (and any attachments) is confidential and may contain
>>>> personal views which are not the views of the BBC unless specifically
>>>> stated.
>>>> If you have received it in error, please delete it from your system.
>>>> Do not use, copy or disclose the information in any way nor act in
>>>> reliance on it and notify the sender immediately.
>>>> Please note that the BBC monitors e-mails sent or received.
>>>> Further communication will signify your consent to this.
>>>>
>>>> ---------------------
>>>>
>>>>  David Singer
>>> Manager, Software Standards, Apple Inc.
>>>
>>>
>>>
>>>  David Singer
>> Manager, Software Standards, Apple Inc.
>>
>>
>>
>
> --
> Cyril Concolato
> Multimedia Group / Telecom ParisTech
> http://concolato.wp.mines-telecom.fr/
> @cconcolato
>
>
>
Received on Sunday, 11 May 2014 18:31:27 UTC