RE: [Fwd: [IANA #628625] Request for MIME media type Application/Standards Tree - ttml+xml]

This seems better.  However, the XML encoding declaration is optional.  So,
perhaps it should say something like:  "If supplied, the charset parameter
must match the XML encoding declaration or, if that declaration is absent,
the actual encoding of the document"?

-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org] 
Sent: Wednesday, November 28, 2012 9:07 AM
To: Philippe Le Hegaret
Cc: public-tt
Subject: Re: [Fwd: [IANA #628625] Request for MIME media type
Application/Standards Tree - ttml+xml]

On Thursday, November 15, 2012, 12:57:33 PM, Philippe wrote:

PLH> From: Amanda Baber via RT <iana-mime@iana.org>


>> Optional parameters :
>> charset
>> Same as application/xml media type, as specified in RFC 3023 or its 
>> successors.

PLH> Since we cannot in general be sure what successors to RFC 3023 will 
PLH> do, it probably isn't appropriate to assume that successors will 
PLH> necessarily be in sync with this specification. So this should, in 
PLH> absence of other considerations, be reduced to a reference to RFC 3023.

I have a better idea about what the successor to RFC 3023 will do, as I am
one of the authors. Mainly, this involves not claiming to do some unusual
things that RFC 3023 required, although in practice implementations did not
do them and they were often untested in test suites.

For those interested, the current draft is
http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-00

RFC 3023 required some unusual things regarding character encoding and the
precedence of the HTTP charset parameter over the encoding declared in the
XML instance. In particular:

- if the HTTP charset has a different value to the XML document encoding,
the XML document encoding is ignored and the HTTP value is used.
(I verified with PLH that the TT test suite does not have a test case where
the XML instance is, say, ISO-8859-1, the HTTP header says it is UTF-8, and
the pass criterion is that the parser gives a well-formedness error even
though the document is well-formed when tested from local disk).
- if there is no HTTP charset parameter, HTTP is still considered to produce
a default value of US-ASCII and still overrides what the XML instance says
(for text/*+xml only)
- when the HTTP charset and the XML encoding declaration conflict, then when
saving to local disk 3023 required that either a MIME-aware filesystem be
used to preserve the headers (!) or that the XML instance be rewritten to
change the encoding declaration.
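
To make the precedence rules listed above concrete, here is a minimal
Python sketch of how the effective encoding would be chosen under RFC 3023
as summarised in this message. The function name and its arguments are
hypothetical and exist only for illustration; it is not code from any
real parser.

```python
def rfc3023_effective_encoding(media_type, http_charset, xml_declared):
    """Sketch of the RFC 3023 precedence described above (illustrative only).

    media_type   -- e.g. "text/xml" or "application/ttml+xml"
    http_charset -- value of the HTTP charset parameter, or None if absent
    xml_declared -- encoding named in the XML declaration, or None if absent
    """
    if http_charset is not None:
        # The HTTP charset parameter wins over the XML encoding declaration.
        return http_charset.lower()
    if media_type.lower().startswith("text/"):
        # text/* types: a missing charset is treated as US-ASCII,
        # regardless of what the XML instance says.
        return "us-ascii"
    # Other types: fall back to the XML declaration, or to the XML
    # default (assumed here to be UTF-8, ignoring BOM detection).
    return (xml_declared or "utf-8").lower()
```

For instance, calling this with "text/xml", no charset, and an instance
declared as ISO-8859-1 yields "us-ascii", which is the surprising result
the second bullet describes.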

The replacement for 3023 takes a different approach which is much more in
line with what implementations do. In brief:
- XML processors rely on the XML encoding declaration (and may not even have
access to HTTP headers) so the XML encoding declaration is authoritative.
- HTTP no longer has a default encoding (this changed with HTTP-bis,
previously the default was Latin-1). An HTTP charset parameter *may* be
supplied, but if so it must be the same as the XML encoding declaration. If
there is no HTTP charset parameter, this now means 'no information in HTTP,
see the XML' not 'force to US-ASCII and make ill-formed'.
- if the HTTP charset and the XML encoding declaration have different
values, there is poor interop, *so don't do that*.
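
The consistency rule in the points above could be checked along the
following lines. This is a rough Python sketch, not taken from the draft:
the function names are hypothetical, the regular expression only handles
ASCII-compatible encodings (so not UTF-16), and returning None when the
document has no encoding declaration is an assumption made here, since the
rule as stated only covers the case where a declaration is present.

```python
import re

# Matches an XML declaration at the very start of the document and
# captures the encoding name if one is given.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)

UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(document):
    """Return the encoding named in the XML declaration, or None."""
    if document.startswith(UTF8_BOM):
        document = document[len(UTF8_BOM):]
    match = _XML_DECL.match(document)
    if match and match.group(1):
        return match.group(1).decode("ascii")
    return None


def charset_is_consistent(http_charset, document):
    """True if HTTP and XML agree, False if they conflict,
    None if there is no XML encoding declaration to compare against."""
    if http_charset is None:
        # No information in HTTP: see the XML.
        return True
    declared = declared_encoding(document)
    if declared is None:
        return None
    # Charset names are compared case-insensitively.
    return http_charset.strip().lower() == declared.lower()
```

A server-side check or test harness could call charset_is_consistent()
with the charset from the Content-Type header and the first bytes of the
response body, and flag a False result as a likely interop problem.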

Here is some suggested text which should be compatible with what the
replacement for 3023 will say, is in line with implementations, and doesn't
make promises for functionality which is untested or unimplemented.

----------- start -----------

Optional parameters :
  charset

If supplied, the charset parameter must match the XML encoding declaration.
 
------------- end -----------


-- 
 Chris Lilley   Technical Director, Interaction Domain                 
 W3C Graphics Activity Lead, Fonts Activity Lead  Co-Chair, W3C Hypertext CG
Member, CSS, WebFonts, SVG Working Groups

Received on Wednesday, 28 November 2012 20:32:30 UTC