Re: XML media types, charset, TAG findings

On Thursday, October 7, 2004, 5:27:53 PM, Bjoern wrote:


BH> * Chris Lilley wrote:
>>Coupled with the deprecation of the text/xml and
>>text/xml-external-parsed-entity types (and thus insulation from the
>>particular encoding restrictions of text/*) we are now, in this revision
>>of the document, in a position to be a little stronger:
>>
>>  The encoding declaration in an XML document and the charset (if
>>  provided) MUST be consistent.

BH> That is insufficient as it does not define what it means for these to be
BH> consistent, how implementations are required to determine whether this
BH> requirement has been met and what processors are required to do when
BH> these are determined to be inconsistent. Without a complete proposal it
BH> is most difficult to cite any reactions on this matter. I would
BH> generally support removing the often ignored complexity that the charset
BH> parameter introduces, with your proposal however, even if completely
BH> specified, I would worry that this increases the complexity rather than
BH> removing it in which case this would seem counter-productive.

This is a reasonable worry.

My preference would be to not have a redundant charset parameter, since
that would remove the ambiguity. However, I realize that people are
uncomfortable with that, so I propose this solution instead.

Consistent means that the encoding determined by
Appendix F, Autodetection of Character Encodings (Non-Normative)
http://www.w3.org/TR/REC-xml/#sec-guessing

is the same as the value of the charset parameter, or that the charset
parameter is not provided.
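
As a rough illustration of that definition (a sketch only, not text from
the draft), the check could look something like the following. The
function names detect_encoding and is_consistent are hypothetical, and
the detection is a simplified stand-in for Appendix F: it looks at BOMs
and at an encoding declaration in an ASCII-compatible encoding, falls
back to UTF-8, and ignores the BOM-less UTF-16/UTF-32 and EBCDIC cases
that the appendix also covers. Names are compared case-insensitively,
with no alias handling.

```python
import re
from typing import Optional

# BOM signatures, longest first so UTF-32 is not mistaken for UTF-16.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32"),
    (b"\xff\xfe\x00\x00", "utf-32"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16"),
    (b"\xff\xfe", "utf-16"),
]

# Encoding declaration in an XML declaration written in an
# ASCII-compatible encoding (simplification; see the lead-in).
_DECL = re.compile(
    rb"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_encoding(body: bytes) -> str:
    """Simplified stand-in for XML 1.0 Appendix F autodetection."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no encoding declaration: XML defaults to UTF-8.
    return "utf-8"


def is_consistent(body: bytes, charset: Optional[str]) -> bool:
    """True if charset is absent or equals the detected encoding."""
    if charset is None:
        return True
    return detect_encoding(body) == charset.strip().lower()
```

With this sketch, a document declaring encoding="iso-8859-1" is
consistent with charset=ISO-8859-1 or with no charset at all, and
inconsistent with charset=utf-8. What a processor should do on
inconsistency is a separate question that the sketch does not address.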

There are examples of this in the current internet draft, for various
cases, including a specified encoding declaration and an absent encoding
declaration, with or without assorted BOMs.



-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group

Received on Thursday, 7 October 2004 15:56:04 UTC