Re: WD-xslt-xquery-serialization-20030502 omit-xml-declaration [Issue qt-2003Nov0050-01]

Hi David,

David Carlisle wrote on 2003-11-12 09:51:00 AM:
> According to 
> http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318
> 
> 
>    The omit-xml-declaration parameter should be ignored if the 
standalone
>    parameter is present, or if the encoding parameter specifies a value
>    other than UTF-8 or UTF-16.
> 
> There is one other case where it would be very useful to omit the
> declaration (or at least to use a value of utf-8) namely
> iso-646 (aka ASCII aka US-ASCII).
> 
> It may be politically incorrect to say that ascii characters are still
> more interoperable than non-ascii characters, but in practice this is
> still the case. Especially in XML which specifies that a charset
> specified in the mime headers takes precedence it is hard to give (say) 
a
> utf8 file to someone to serve from their website without first finding
> out what http server they use, and how to make sure it won't serve the
> thing as latin 1 resulting in a non-well formed document.
> (See current discussion on W3C'S TAG list about this).
> 
> One style of producing XML files that avoids these problems is to
> produce files that don't have an xml declaration (or have one that
> specifies utf-8) but to encode all non-ascii characters as numeric
> character references.
> 
> Currently in an XSLT 1 usage in production I use
> <xsl:output encoding="US-ASCII"/>
> with saxon and post process with sed to remove the US-ASCII
> encoding declaration (which stops the file being parsed on several XML
> systems I have locally) I think that it would be very desirable if
> 
> <xsl:output encoding="iso-646" omit-xml-declaration="yes"/>
> 
> was defined to work, and produce files of the form described above.
> 
> Failing that it would be good if it would be allowed by the
> specification if the system understood that encoding.

     Thank you for raising this issue.  The XSL and XQuery working groups 
discussed the issue and decided not to require processors to support the 
US-ASCII encoding and its aliases.  The working groups decided that the 
appropriate way of addressing your comment would be to replace the second 
paragraph of Section 4.5 of the last call working draft of XSLT 2.0 and 
XQuery 1.0 Serialization [1], which currently reads:

<<
The omit-xml-declaration parameter must be ignored if the standalone 
parameter is present, or if the encoding parameter specifies a value other 
than UTF-8 or UTF-16.
>>

with the following:

<<
The omit-xml-declaration parameter must be ignored if the standalone 
parameter is present, or if the encoding parameter specifies a value that 
is not UTF-8, UTF-16 or a subset of either of those encodings.  An 
encoding S is a subset of another encoding E if the set of codepoints that 
can be encoded in S is a subset of those that can be encoded in B, and the 
encodings of those codepoints in S is the same as the encodings of those 
same codepoints in encoding E.
>>

     That resolution seems to be in accord with the last sentence of your 
comment.  Please let us know whether you consider this resolution 
acceptable.

Thanks,

Henry
[1] 
http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Received on Sunday, 29 February 2004 12:16:12 UTC