Re: [Fwd: [IANA #628625] Request for MIME media type Application/Standards Tree - ttml+xml]

On Thursday, November 15, 2012, 12:57:33 PM, Philippe wrote:

PLH> From: Amanda Baber via RT <iana-mime@iana.org>


>> Optional parameters :
>> charset
>> Same as application/xml media type, as specified in RFC 3023 or its
>> successors.

PLH> Since we cannot in general be sure what successors to RFC 3023 will do,
PLH> it probably isn't appropriate to assume that successors will necessarily
PLH> be in sync with this specification. So this should, in absence of other
PLH> considerations, be reduced to a reference to RFC 3023.

I have a better idea about what the successor to RFC 3023 will do, as I am one of the authors. Mainly, it stops claiming to do some unusual things that RFC 3023 required but that implementations did not do in practice and that test suites often did not cover.

For those interested, the current draft is
http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-00

RFC 3023 required some unusual things regarding character encoding and the precedence of the HTTP charset parameter over the encoding declared in the XML instance. In particular:

- if the HTTP charset differs from the XML document encoding, the XML document encoding is ignored and the HTTP value is used.
(I verified with PLH that the TT test suite does not have a test case where the XML instance is, say, 8859-1, the HTTP header says it is UTF-8, and the pass criterion is that the parser give a well-formedness error even though the document is well-formed when tested from local disk).
- if there is no HTTP charset parameter, HTTP is still considered to supply a default value of US-ASCII, which still overrides what the XML instance says (for text/*+xml only).
- when the HTTP charset and the XML encoding declaration conflict, RFC 3023 required that, when saving to local disk, either a MIME-aware filesystem be used to preserve the headers (!) or the XML instance be rewritten to change the encoding declaration.
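
To make the RFC 3023 rules above concrete, here is a minimal Python sketch of how a strictly conforming receiver would pick the encoding. The function names and the sample document are hypothetical, invented for illustration; BOM detection is left out, and UTF-8 is assumed when the XML has no encoding declaration.

```python
from typing import Optional


def rfc3023_effective_encoding(
    media_type: str,
    http_charset: Optional[str],
    xml_declared: Optional[str],
) -> str:
    """Return the encoding a strict RFC 3023 receiver would use (simplified)."""
    if http_charset is not None:
        # The HTTP charset parameter wins, even if the XML says otherwise.
        return http_charset.lower()
    if media_type.lower().startswith("text/"):
        # No charset parameter on text/*: the default is US-ASCII.
        return "us-ascii"
    # Otherwise fall back to the XML declaration (assumed UTF-8 if absent).
    return (xml_declared or "utf-8").lower()


def decodes_cleanly(document: bytes, encoding: str) -> bool:
    """True if the bytes decode without error in the given encoding."""
    try:
        document.decode(encoding)
        return True
    except (UnicodeDecodeError, LookupError):
        return False


# A document that is fine on local disk: declared and encoded as ISO-8859-1.
document = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\u00e9</p>'
).encode("iso-8859-1")
```

In this sketch the same bytes that decode cleanly as ISO-8859-1 fail once the header's UTF-8, or the implied US-ASCII for text/xml, is forced on them, which is the test case the TT suite does not have.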

The replacement for RFC 3023 takes a different approach, which is much more in line with what implementations do. In brief:
- XML processors rely on the XML encoding declaration (and may not even have access to HTTP headers) so the XML encoding declaration is authoritative.
- HTTP no longer has a default encoding (this changed with HTTP-bis; previously the default was Latin-1). An HTTP charset parameter *may* be supplied, but if so it must be the same as the XML encoding declaration. If there is no HTTP charset parameter, this now means 'no information in HTTP, see the XML', not 'force to US-ASCII and make ill-formed'.
- if the HTTP charset and the XML encoding declaration have different values, there is poor interop, *so don't do that*.
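
The approach above could be sketched in Python as follows. This is an illustration under stated assumptions, not text from the draft: the helper names are hypothetical, the declaration sniffing only handles ASCII-compatible encodings without a BOM, UTF-8 is assumed when no declaration is present, and charset labels are compared through Python's codec alias table.

```python
import codecs
import re
from typing import Optional, Tuple

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document (ASCII-compatible encodings only, no BOM handling).
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(document: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if there is one."""
    match = _ENCODING_DECL.match(document)
    return match.group(1).decode("ascii") if match else None


def _canonical(name: str) -> str:
    """Normalise charset labels so that aliases such as latin1 compare equal."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return name.lower()


def resolve_encoding(
    document: bytes, http_charset: Optional[str]
) -> Tuple[str, bool]:
    """Return (encoding to use, whether HTTP and XML agree).

    The XML declaration is authoritative. A missing HTTP charset means
    'no information in HTTP', so it never counts as a conflict.
    """
    xml_encoding = declared_encoding(document) or "utf-8"
    consistent = (
        http_charset is None
        or _canonical(http_charset) == _canonical(xml_encoding)
    )
    return xml_encoding, consistent
```

A caller would use the returned encoding in every case and treat a False flag as a warning that the server is misconfigured, rather than as a reason to override the document.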

Here is some suggested text that should be compatible with what the replacement for RFC 3023 will say, is in line with implementations, and doesn't make promises about functionality that is untested or unimplemented.

----------- start -----------

Optional parameters :
  charset

If supplied, the charset parameter must match the XML encoding declaration.
 
------------- end -----------


-- 
 Chris Lilley   Technical Director, Interaction Domain                 
 W3C Graphics Activity Lead, Fonts Activity Lead
 Co-Chair, W3C Hypertext CG
 Member, CSS, WebFonts, SVG Working Groups

Received on Wednesday, 28 November 2012 17:07:08 UTC