- From: Glenn Adams <glenn@skynav.com>
- Date: Thu, 17 Jan 2013 07:47:28 -0700
- To: Michael Dolan <mdolan@newtbt.com>
- Cc: public-tt@w3.org
- Message-ID: <CACQ=j+f+zNR3qB-jK=F7NnZubZcAXMKScaO_dTmjAJuSikO38Q@mail.gmail.com>
The "charset" parameter of MIME types and the "encoding" parameter of the
XML declaration are effectively (if not identically) synonymous; in general
(but not always) they map to both a character repertoire (a character set)
and an on-the-wire encoding of strings that employ that repertoire.

On Wed, Jan 16, 2013 at 2:00 PM, Michael Dolan <mdolan@newtbt.com> wrote:

> (per my AI and for discussion in tomorrow’s meeting)
>
> TTML 1.0 defines a media type “application/ttml+xml” in Appendix C:
>
> https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#media-type-registration
>
> We are in the process of submitting a media type registration to IANA,
> revised from what was published in TTML 1.0.
>
> Both the parameter and encoding considerations sections of the
> registration refer to “application/xml” (section 3.2) defined in RFC 3023,
> “XML Media Types”:
>
> http://www.rfc-editor.org/rfc/rfc3023.txt
>
> An optional charset parameter is defined. The value of charset is
> entirely unconstrained. RFC 3023 seems to mix charset (e.g. 8859-1) with
> encoding (e.g. utf-8), which adds a layer of confusion.
>
> Character encoding requirements for XML in general are in section 4.3.3
> of the XML 1.0 spec (RFC 3023 cites XML 1.0):
>
> http://www.w3.org/TR/REC-xml/#charencoding
>
> and optionally, the algorithm defined in the informative Appendix F:
>
> http://www.w3.org/TR/REC-xml/#sec-guessing
>
> There are a variety of scenarios for which the charset/encoding cannot be
> determined. So, in the end, there is no deterministic way to deduce the
> charset/encoding from the file alone. The media type charset parameter or
> some other external signaling means is required. Most file systems do not
> include this metadata. This makes file exchange problematic.
>
> RFC 3023 makes some specific comments and recommendations in this area:
>
> Although listed as an optional parameter, the use of the charset parameter
> is STRONGLY RECOMMENDED, since this information can be used by XML
> processors to determine authoritatively the charset of the XML MIME entity.
>
> "utf-8" [RFC2279] and "utf-16" [RFC2781] are the recommended values,
> representing the UTF-8 and UTF-16 charsets, respectively. These charsets
> are preferred since they are supported by all conforming processors of
> [XML].
>
> I recommend that TTWG follow the RFC 3023 recommendation and clarify that
> the “application/ttml+xml” media type be constrained to utf-8 and utf-16
> encoding only. Given the mixing of semantics for “charset”, I recommend we
> remain silent on that optional parameter, since, with this constraint,
> explicit signaling is not required. The other encoding considerations of
> RFC 3023 still apply.
>
> Regards,
>
> Mike
>
> Michael A DOLAN
> TBT, Inc. PO Box 190
> Del Mar, CA 92014
> (m) 858-882-7497
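
For discussion purposes, the lookup order described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything taken from TTML 1.0 or RFC 3023: the function names are hypothetical, UTF-32 byte order marks are ignored, and only an ASCII-compatible XML declaration is inspected (so a UTF-16 document without a byte order mark is not detected). It prefers an external charset parameter when present, then a byte order mark, then the XML declaration, and otherwise falls back to UTF-8 as XML 1.0 does for entities that carry no other signal.

```python
import codecs
import re
from email.message import Message
from typing import Optional

# Byte order marks checked by this sketch (UTF-32 deliberately omitted).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]

# Matches encoding="..." inside an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def charset_from_media_type(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def sniff_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding from a byte order mark or the XML declaration."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return None


def resolve_encoding(data: bytes, content_type: Optional[str] = None) -> str:
    """External charset first, then in-band signals, then the UTF-8 default."""
    if content_type:
        charset = charset_from_media_type(content_type)
        if charset:
            return charset
    return sniff_encoding(data) or "utf-8"


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><tt/>'
    print(resolve_encoding(doc))
    print(resolve_encoding(doc, "application/ttml+xml; charset=utf-8"))
```

If the media type were constrained to utf-8 and utf-16 as proposed, a document like the ISO-8859-1 one in the usage example would simply be out of scope, and the byte order mark check plus the UTF-8 default would be sufficient without the charset parameter.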
Received on Thursday, 17 January 2013 14:48:17 UTC