- From: Glenn Adams <glenn@skynav.com>
- Date: Thu, 17 Jan 2013 09:03:35 -0700
- To: Michael Dolan <mdolan@newtbt.com>
- Cc: public-tt@w3.org
- Message-ID: <CACQ=j+cQe9Uv9N7ch7iydWt+oXBuUpgvMc8Cb6dV1L4VoJME+A@mail.gmail.com>
a similar equivalence is indicated in HTML5, see
http://www.w3.org/TR/html5/single-page.html#attr-meta-http-equiv-content-type

On Thu, Jan 17, 2013 at 7:47 AM, Glenn Adams <glenn@skynav.com> wrote:

> the "charset" parameter of MIME types and the "encoding" parameter for the
> XML declaration are effectively (if not identically) synonymous; in general
> (but not always) they map to both a character repertoire (a character set)
> and an on the wire encoding of strings that employ that repertoire
>
> On Wed, Jan 16, 2013 at 2:00 PM, Michael Dolan <mdolan@newtbt.com> wrote:
>
>> (per my AI and for discussion in tomorrow’s meeting)
>>
>> TTML 1.0 defines a media type “application/ttml+xml” in Appendix C:
>> https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#media-type-registration
>>
>> We are in the process of submitting a media type registration to IANA,
>> revised from what was published in TTML 1.0.
>>
>> Both the parameter and encoding considerations sections of the
>> registration refer to “application/xml” (section 3.2) defined in RFC 3023,
>> “XML Media Types”:
>> http://www.rfc-editor.org/rfc/rfc3023.txt
>>
>> An optional charset parameter is defined. The value of charset is
>> entirely unconstrained. RFC 3023 seems to mix charset (e.g. 8859-1) with
>> encoding (e.g. utf-8) which adds a layer of confusion.
>>
>> Character encoding requirements for XML in general are in section 4.3.3 of
>> the XML 1.0 spec (RFC 3023 cites XML 1.0):
>> http://www.w3.org/TR/REC-xml/#charencoding
>> and optionally, the algorithm defined in the informative Appendix F:
>> http://www.w3.org/TR/REC-xml/#sec-guessing
>>
>> There are a variety of scenarios for which the charset/encoding cannot be
>> determined. So, in the end, there is no deterministic way to deduce the
>> charset/encoding from the file alone. The media type charset parameter or
>> some other external signaling means is required. Most file systems do not
>> include this metadata. This makes file exchange problematic.
>>
>> RFC 3023 makes some specific comments and recommendations in this area:
>>
>> Although listed as an optional parameter, the use of the charset
>> parameter is STRONGLY RECOMMENDED, since this information can be used by
>> XML processors to determine authoritatively the charset of the XML MIME
>> entity.
>>
>> "utf-8" [RFC2279] and "utf-16" [RFC2781] are the recommended values,
>> representing the UTF-8 and UTF-16 charsets, respectively. These charsets
>> are preferred since they are supported by all conforming processors of
>> [XML].
>>
>> I recommend that TTWG follow the RFC 3023 recommendation and clarify that
>> the “application/ttml+xml” media type be constrained to utf-8 and utf-16
>> encoding only. Given the mixing of semantics for “charset”, I recommend we
>> remain silent on that optional parameter, since, with this constraint,
>> explicit signaling is not required. The other encoding considerations of RFC
>> 3023 still apply.
>>
>> Regards,
>>
>> Mike
>>
>> Michael A DOLAN
>> TBT, Inc. PO Box 190
>> Del Mar, CA 92014
>> (m) 858-882-7497
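
For illustration only, here is a rough sketch of what deducing the encoding from the file alone looks like, in the spirit of the informative Appendix F algorithm cited above. It is a simplification, not the Appendix F algorithm itself: UCS-4 and EBCDIC detection are left out. The names sniff_xml_encoding, TTML_ALLOWED and allowed_for_ttml are hypothetical, and the utf-8/utf-16 restriction encodes the constraint proposed in this thread, not a published requirement.

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> Optional[str]:
    """Guess an XML entity's encoding from its leading bytes (simplified)."""
    # Byte order marks come first.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xfe\xff") or data.startswith(b"\xff\xfe"):
        return "utf-16"
    # No BOM, but '<?' stored as 16-bit code units (big or little endian).
    if data.startswith(b"\x00<\x00?") or data.startswith(b"<\x00?\x00"):
        return "utf-16"
    # ASCII-compatible XML declaration: trust its encoding label if present.
    if data.startswith(b"<?xml"):
        match = _DECL.match(data)
        if match:
            return match.group(1).decode("ascii").lower()
    # No BOM and no encoding label: XML 1.0 section 4.3.3 implies UTF-8
    # when no external information is available.
    if data.startswith(b"<"):
        return "utf-8"
    return None  # nothing recognisable to go on


# Hypothetical constraint, as proposed in the message above.
TTML_ALLOWED = {"utf-8", "utf-16"}


def allowed_for_ttml(data: bytes) -> bool:
    """True if the guessed encoding falls within the proposed constraint."""
    return sniff_xml_encoding(data) in TTML_ALLOWED
```

The guess is only as good as the BOM or encoding declaration actually present in the bytes, which is the point made above about needing the charset parameter or some other external signaling when they are absent or wrong.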
Received on Thursday, 17 January 2013 16:04:26 UTC