- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 11 Nov 2004 15:27:52 +0900
- To: Chris Lilley <chris@w3.org>, "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>, MURATA Makoto <murata@hokkaido.email.ne.jp>
- Cc: www-tag@w3.org, MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
At 05:01 04/11/09, Chris Lilley wrote:

>disagreement
>
>- charset. Still says optional but strongly recommended, TAG wants
>optional and only supplied if correct. TAG wants default to be 'see xml
>encoding declaration'. So, some media types could omit the parameter
>entirely. However, attempt to register image/svg+xml without a charset
>(backed up by pointing to TAG findings) met resistance. Still therefore
>a conflict with TAG findings.
>http://www.imc.org/ietf-xml-mime/mail-archive/msg00978.html

I think it is very important to distinguish two levels:

1) What is required, recommended, or allowed on the type registration level
2) What is required, recommended, or allowed on the level of each document

The "Good Practice" in the Web Architecture Document, as far as I understand, refers to 2):

>>>>>>>>
Good practice: XML and character encodings

In general, a representation provider SHOULD NOT specify the character encoding for XML data in protocol headers since the data is self-describing.
>>>>>>>>

It speaks explicitly about the representation provider, which I can only interpret as 2).

With respect to 1), the "In general" and the "SHOULD NOT" in the "Good Practice" seem to imply that there may be valid reasons to specify the character encoding for XML data in protocol headers. This in turn implies that it is a good thing to allow the 'charset' parameter in mime type registrations.

So I don't see any conflict between the "Good Practice" in the Web Architecture Document and requesting that image/svg+xml allows a charset parameter. Indeed, in my understanding, this works very well together. After all, the reasons why one may want to use a charset parameter (APIs and databases that do transcoding on output and set the parameter automatically,...) are very much orthogonal to the specific mime type.

So back to the RFC 3023 update.

>- charset. Still says optional but strongly recommended, TAG wants
>optional and only supplied if correct.
Nobody wants wrong information, anyway. I don't think there is any disagreement there. I don't think RFC 3023 says "supply it, even if it's wrong".

I think that on the level of instances (see 2) above), "strongly recommended" may be too strong. On the level of mime type registrations (see 1) above), I think it is appropriate to keep "strongly recommended".

The main exception that I can see is formats (mostly used on a protocol level) that for efficiency and interoperability reasons restrict the specific format to use only UTF-8. In that case, the mime type registration can very well say that there is no charset parameter, because it supplies no additional information. Even generic XML processors will be able to deal with this, without any misinterpretation. And it is hoped that mime-type specific implementations don't blow up in the case that the type is served with an accidental 'charset'.

>TAG wants default to be 'see xml encoding declaration'.

What do you (or the TAG) mean by "default" in this context? The XML Recommendation as well as RFC 3023,... give a clear priority to the charset information in the Content-Type header. To switch this around after having it defined like this for about 10 years (starting with HTML i18n or HTML 4.0 or so) would be a very bad idea.

>Still therefore
>a conflict with TAG findings.
>http://www.imc.org/ietf-xml-mime/mail-archive/msg00978.html

I'm confused. First, that mail points to http://www.w3.org/2001/tag/2002/0129-mime#char-encoding, whereas there is a newer version of this document at http://www.w3.org/2001/tag/2004/0430-mime, which is also the one pointed to from http://www.w3.org/2001/tag/findings. I don't see much of a difference between the respective sections (numbered differently in the above versions because another section was removed).

And I don't see a big conflict between the TAG finding on the one hand, and updating RFC 3023 or using a charset parameter for image/svg+xml on the other hand.
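
To make the priority rule discussed above concrete, here is a minimal sketch of how a receiver might pick the encoding of an XML entity: the 'charset' parameter of the Content-Type header wins if present, otherwise the XML encoding declaration is consulted. The function names are hypothetical and are not taken from RFC 3023 or the TAG finding. The sketch assumes an ASCII-compatible encoding (it does not detect UTF-16), falls back to UTF-8 as the XML Recommendation does for entities with neither a declaration nor a byte order mark, and ignores the special us-ascii default that RFC 3023 gives text/xml when the parameter is omitted.

```python
import re
from email.message import Message

# Matches an XML declaration that carries an encoding pseudo-attribute.
# Assumption: the entity is in an ASCII-compatible encoding, so the
# declaration can be read as bytes (UTF-16 input is not handled here).
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def charset_from_content_type(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body):
    """Return the lowercased encoding named in the XML declaration, or None."""
    if body.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        body = body[3:]
    match = _XML_DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def effective_encoding(content_type, body):
    """Hypothetical helper: the header charset takes priority over the
    XML encoding declaration; UTF-8 is the fallback when both are absent."""
    return (
        charset_from_content_type(content_type)
        or declared_encoding(body)
        or "utf-8"
    )


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><svg/>'
    print(effective_encoding("image/svg+xml; charset=utf-8", doc))
    print(effective_encoding("image/svg+xml", doc))
```

In this sketch a wrong 'charset' parameter overrides a correct encoding declaration, which is exactly why the parameter should only be supplied when it is known to be right.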
In particular, that section of the TAG finding, overall, seems to suggest replacing

<<<<<<<<
The use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the charset of the XML MIME entity.
<<<<<<<<

with

>>>>>>>>
The use of the charset parameter, when the charset is reliably known and agrees with the encoding declaration, is RECOMMENDED, since this information can be used by non-XML processors to determine authoritatively the charset of the XML MIME entity.
>>>>>>>>

I do not see any very big problem with such a change, but of course the details of the wording should be discussed on the relevant mailing list rather than prescribed by the TAG.

I do not see any way to deduce from the above text proposed by the TAG that it is a good idea to disallow the 'charset' parameter on certain media types. On the contrary, the above wording seems to suggest to me that it is a good idea to have such a parameter. There ARE implementations out there that are actually sure about what character encoding their data is in, and there is a benefit for non-XML processors to be able to determine that encoding.

Regards, Martin.
Received on Thursday, 11 November 2004 06:29:14 UTC