- From: Randy Waldrop <rwaldrop@webmethods.com>
- Date: Thu, 7 Dec 2000 10:04:16 -0500
- To: <xml-dist-app@w3.org>
On the 6 Dec 2000 concall, it was mentioned that the group would seek feedback before finalizing the wording on DR609. Here are some comments.

I agree with the intent of the latest wording proposal. There will be those who wish to implement XP without being forced to use UTF-8 (or whatever encoding we may recommend), and they should be accommodated.

However, interoperability is a major issue here. The XML 1.0 spec, sec 4.3.3, states:

    In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is an error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, for an encoding declaration to occur other than at the beginning of an external entity, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8.

I take this to mean that an XML 1.0 entity must be encoded in UTF-8, unless (1) the external transport specifies an encoding, or (2) the "<?xml ... ?>" declaration includes an encoding declaration, or (3) the data begins with a BOM (indicating UTF-16).

I think that it would be reasonable for the WG to recommend that, for maximum interoperability, UTF-8 (the XML default) should be used. With any other encoding (except UTF-16, which every XML processor must also accept) there is a risk that the receiver may fail because it does not support the encoding. If the user accepts that risk (he may control all the receivers he is going to send to), he may use a national encoding. But to be XML 1.0 compliant, he must specify the encoding via the external transport or the XML prolog.

I think that, pursuant to our goals of easy implementation and interoperability, it would make sense for us to recommend UTF-8.

Randy Waldrop
webMethods, Inc.
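
P.S. For anyone who wants to see the three cases above spelled out, here is a minimal sketch in Python. The function name `effective_encoding` and the `transport_charset` parameter are made up for illustration and come from no spec or library. The sketch assumes the transport value (if any) is the charset already extracted from something like an HTTP or MIME header, and it only finds an encoding declaration in ASCII-compatible data; the fuller autodetection described in Appendix F of the XML 1.0 spec (UTF-32, EBCDIC, UTF-16 without a BOM) is deliberately left out.

```python
import re
from typing import Optional

# Matches an encoding declaration at the very start of an entity,
# e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
# Simplification: only works when the bytes are ASCII-compatible.
_ENCODING_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_encoding(data: bytes, transport_charset: Optional[str] = None) -> str:
    """Return the encoding a receiver should assume for an XML entity.

    Hypothetical helper illustrating the reading of XML 1.0 sec 4.3.3
    given in this message; not a complete implementation.
    """
    # Case 1: the external transport (e.g. HTTP or MIME) names an encoding.
    if transport_charset:
        return transport_charset.lower()

    # Case 3: a UTF-16 Byte Order Mark at the start of the data.
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"

    # A UTF-8 BOM is permitted; skip it before looking for a declaration.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]

    # Case 2: an encoding declaration in the XML declaration.
    match = _ENCODING_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()

    # Neither a BOM nor a declaration: the entity must be UTF-8.
    return "utf-8"
```

A sender that sticks to UTF-8 never depends on cases 1 to 3 being handled correctly at the far end, which is the interoperability argument above. A receiver would still have to check that it actually supports whatever name comes back before attempting to decode.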
Received on Thursday, 7 December 2000 10:21:04 UTC