RE: TTML2: encoding from Nigel Megitt on 2014-09-05 (public-tt@w3.org from September 2014)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Fri, 5 Sep 2014 16:58:49 +0000
To: Glenn Adams <glenn@skynav.com>
CC: TTWG <public-tt@w3.org>
Message-ID: <5941EAB8802D6745A7D363D7B37BD1F749B53D41@BGB01XUD1012.national.core.bbc.co.uk>

Hmm, maybe the existing 'should be UTF-8' is as strong as I would go anyway, since it already implies 'should not be in another encoding'.

We could also consider adding #utf-8 and #utf-16 feature designations (and #utf-32, why not). Arguably these would be tautological for a content profile defined in a TTML2 document, since that document's encoding must already be known in order to process it. For standalone content profiles, though, they would be useful, and also for specifying processor profiles where the processors are not necessarily fully compliant XML processors.

I must say I don't have a particularly strong feeling on this one, but I thought it would be worth a) informing the group about the move to CR of the W3C encoding technical report and b) reflecting that there is a general sense amongst TTML profilers that the possibility of multiple encodings creates more problems than it solves.


________________________________
From: Glenn Adams [glenn@skynav.com]
Sent: 05 September 2014 17:42
To: Nigel Megitt
Cc: TTWG
Subject: Re: TTML2: encoding

Well, one implementation supports both UTF-8 and UTF-16, namely TTV; it also supports UTF-32. At present, we don't mandate a particular concrete encoding of TTML1 or TTML2, so I think it is best left as is. It would be strange to disallow an encoding that is sanctioned by XML but not required to be supported by TTML.


On Fri, Sep 5, 2014 at 9:29 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
The argument to change comes from implementations, where there seems to be a strong desire to settle on a single encoding. Simply because XML processors in general must be able to read entities in UTF-16 doesn't mean that TTML2 documents must be conformant if they're UTF-16.

In other words, we do at least have an option to be more restrictive, and the industry seems to be heading in that direction.
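
To illustrate what "more restrictive" could mean for an implementation, here is a minimal sketch of a reader that accepts only UTF-8 input. The function name `decode_ttml_utf8_only` is hypothetical and not taken from any TTML specification or implementation; it only shows that rejecting other encodings is straightforward at the byte level.

```python
def decode_ttml_utf8_only(data: bytes) -> str:
    """Decode a document's bytes, accepting UTF-8 only (hypothetical helper).

    An optional UTF-8 byte order mark is stripped. Input in any other
    encoding, such as UTF-16, fails strict decoding and is rejected.
    """
    try:
        # "utf-8-sig" is strict UTF-8 that also drops a leading BOM if present.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError("document is not valid UTF-8") from exc
```

A general XML processor would still have to accept UTF-16 input; the restriction sketched here would be a document conformance rule layered on top of XML.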


________________________________
From: Glenn Adams [glenn@skynav.com]
Sent: 05 September 2014 17:11
To: Nigel Megitt
Cc: TTWG
Subject: Re: TTML2: encoding

I see no reason to change. We recommend (SHOULD USE) utf-8. XML itself also states "All XML processors must be able to read entities in both the UTF-8 and UTF-16 encodings." [1], which text applies also to TTML as we normatively include it.

[1] http://www.w3.org/TR/REC-xml/#charencoding
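
As a rough illustration of the requirement quoted from [1], the sketch below guesses an entity's encoding from its byte order mark. It is a simplification under stated assumptions: the function name `sniff_encoding` is hypothetical, and the sketch ignores the XML encoding declaration and any external encoding information (such as an HTTP header), both of which a real XML processor also takes into account.

```python
import codecs

# BOM signatures paired with codec names. The UTF-32 entries come first
# because the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_encoding(data: bytes) -> str:
    """Return a codec name guessed from a leading byte order mark.

    Simplification: with no BOM, this sketch falls back to UTF-8 and does
    not inspect an encoding declaration in the XML prolog.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return "utf-8"
```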


On Fri, Sep 5, 2014 at 2:25 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
In the context of http://www.w3.org/TR/encoding/ (for which transition to CR has just been requested) should we deprecate any other encoding than UTF-8 for TTML2?

--

Nigel Megitt
Lead Technologist, BBC Technology, Distribution & Archives
Telephone: +44 (0)208 0082360
Telephone (Lync): +44 (0)3030807996
Lync internal: 0807996
BC4 A3 Broadcast Centre, Media Village, 201 Wood Lane, London W12 7TP

Received on Friday, 5 September 2014 16:59:22 UTC