Re: [Serial] I18N WG last call comments [16]

Martin Duerst wrote:

> In XQuery, the default seems to be 'implementation-defined':
> 
> http://www.w3.org/TR/2004/WD-xquery-20040723/#id-xq-serialization-parameters
> 
> 
> We are not at all convinced that this will lead to the necessary 
> degree of interoperability.

He was referring to this earlier comment:

>> [16] Section 4.2 (XML output method, encoding): "If no encoding
>> parameter is specified, then the processor must use either UTF-8 or
>> UTF-16.": It may be desirable to further narrow this to UTF-8 for
>> higher predictability. On the other hand, this should not say "If
>> no encoding parameter is specified", but "If no encoding is
>> specified (either with an encoding parameter or externally)" to
>> allow e.g. specification of encoding with an option.

Hi Martin,

According to the XML Spec:

> All XML processors MUST accept the UTF-8 and UTF-16 encodings of
> Unicode 3.1 [Unicode3]; the mechanisms for signaling which of the two
> is in use, or for bringing other encodings into play, are discussed
> later, in 4.3.3 Character Encoding in Entities.

Serialization produces XML for XML processors. Since all XML processors 
are required to accept the encodings that XQuery serialization is 
allowed to produce, the distinction between the two encodings should not 
make a difference unless an XML processor fails to implement the XML 
specification.
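
As a rough illustration of that point (a sketch of my own, not anything 
taken from the specs), the Python snippet below writes the same small 
document as UTF-8 and as UTF-16 and feeds both to the standard library's 
expat-based parser. The element name, the sample text, and the helpers 
serialize() and parse_text() are hypothetical; they only stand in for 
"a serializer" and "an XML processor".

```python
import xml.etree.ElementTree as ET

# Sample text with one non-ASCII character, so the two byte streams differ.
TEXT = "h\u00e9llo"


def serialize(encoding: str) -> bytes:
    """Hypothetical stand-in for a serializer: emit one document in `encoding`."""
    doc = (
        f'<?xml version="1.0" encoding="{encoding}"?>'
        f"<greeting>{TEXT}</greeting>"
    )
    # Python's "utf-16" codec prepends a byte order mark, which XML
    # requires at the start of a UTF-16 encoded entity.
    return doc.encode(encoding)


def parse_text(data: bytes) -> str:
    """Hypothetical stand-in for an XML processor: return the root's text."""
    return ET.fromstring(data).text


utf8_bytes = serialize("utf-8")
utf16_bytes = serialize("utf-16")

# The bytes on the wire differ, but the parsed content is the same.
assert utf8_bytes != utf16_bytes
assert parse_text(utf8_bytes) == parse_text(utf16_bytes) == TEXT
```

This only shows one parser behaving as the XML Spec requires; it says 
nothing about consumers that read serialized output without a conforming 
XML processor.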

Are you suggesting that XML processors should not be required to accept 
both encodings? It's true that supporting both encodings complicates 
implementations, especially when the various normalizations are taken 
into account. But in XML, I think that's a done deal, and I think that 
we incurred this complication largely at the urging of the I18N community.

Jonathan
Not on behalf of anybody.

Received on Tuesday, 26 October 2004 14:20:33 UTC