[DR609] feedback

On the 6 Dec 2000 concall, it was mentioned that the group would seek
feedback before finalizing the wording on DR609. Here are some 
comments.

I agree with the intent of the latest wording proposal. There will be 
those who wish to implement XP without being forced to use UTF-8 (or 
whatever encoding we may recommend) and they should be accommodated. 
However, interoperability is a major issue here. 

The XML 1.0 spec, sec 4.3.3, states:
  
  In the absence of information provided by an external transport 
  protocol (e.g. HTTP or MIME), it is an error for an entity including 
  an encoding declaration to be presented to the XML processor in an 
  encoding other than that named in the declaration, for an encoding 
  declaration to occur other than at the beginning of an external entity, 
  or for an entity which begins with neither a Byte Order Mark nor an 
  encoding declaration to use an encoding other than UTF-8.   

I take this to mean that an XML 1.0 entity must be encoded in UTF-8, unless
(1) the external transport specifies an encoding, or
(2) the "<?xml ... ?>" PI includes an encoding declaration, or
(3) the data begins with a BOM (indicating UTF-16). 
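
As a rough sketch of that reading, the receiver's decision could look
like the Python below. The function name, the transport_charset
parameter and the regular expression are illustrative assumptions,
not part of XML 1.0 or of any library, and the sketch deliberately
ignores the fuller autodetection described in the spec's appendix
(e.g. UTF-16 without a BOM, EBCDIC).

```python
import re
from typing import Optional

# Hypothetical pattern: finds encoding="..." inside a leading <?xml ... ?>.
# Assumes the declaration itself is written in ASCII-compatible bytes.
_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def guess_xml_encoding(data: bytes,
                       transport_charset: Optional[str] = None) -> str:
    """Pick an encoding using the three exceptions listed above."""
    # (1) The external transport (e.g. an HTTP charset parameter) names it.
    if transport_charset:
        return transport_charset.lower()
    # (3) A UTF-16 byte order mark, little- or big-endian.
    if data.startswith((b'\xff\xfe', b'\xfe\xff')):
        return 'utf-16'
    # A UTF-8 BOM is allowed too; skip it before looking for the declaration.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    # (2) An encoding declaration at the very start of the entity.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode('ascii').lower()
    # None of the exceptions apply, so the entity must be UTF-8.
    return 'utf-8'
```

For example, guess_xml_encoding(b'<a/>') falls through every check and
returns 'utf-8', which is the default case described above.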

I think that it would be reasonable for the WG to recommend that, for 
maximum interoperability, UTF-8 (the XML default) be used. 
With any other encoding (except UTF-16, which every XML processor must
also support) there is a risk that the receiver may fail because it 
does not support the encoding. If the user accepts that risk (he may 
control all receivers he's going to send to) he may use a national 
encoding. But to be XML 1.0 compliant, he must specify the encoding 
via the external transport or the XML prolog.
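
To illustrate the sender's side, here is a minimal Python sketch that
labels a non-UTF-8 message in both places. The function name, the
"greeting" element and the text/xml media type are assumptions made
for the example only; nothing here is proposed wording for DR609.

```python
import xml.etree.ElementTree as ET

def build_message(text, encoding="iso-8859-1"):
    """Serialize a one-element document and label its encoding twice."""
    root = ET.Element("greeting")  # hypothetical element name
    root.text = text
    # For encodings other than UTF-8/US-ASCII, ElementTree writes an
    # XML declaration naming the encoding at the start of the output.
    body = ET.tostring(root, encoding=encoding)
    # The same label, as it might be carried by an HTTP Content-Type header.
    content_type = "text/xml; charset=" + encoding
    return body, content_type

body, content_type = build_message("caf\u00e9")
```

A receiver that supports ISO-8859-1 can then decode the body from the
declaration alone; one that does not can at least fail with a clear
"unsupported encoding" error instead of misreading the bytes as UTF-8.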

I think that pursuant to our goals of easy implementation and 
interoperability, it would make sense for us to recommend UTF-8. 

  Randy Waldrop
  webMethods, Inc.

Received on Thursday, 7 December 2000 10:21:04 UTC