- From: Henry Zongaro <zongaro@ca.ibm.com>
- Date: Sun, 29 Feb 2004 12:16:07 -0500
- To: David Carlisle <davidc@nag.co.uk>
- Cc: public-qt-comments@w3.org
Hi David, David Carlisle wrote on 2003-11-12 09:51:00 AM: > According to > http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318 > > > The omit-xml-declaration parameter should be ignored if the standalone > parameter is present, or if the encoding parameter specifies a value > other than UTF-8 or UTF-16. > > There is one other case where it would be very useful to omit the > declaration (or at least to use a value of utf-8) namely > iso-646 (aka ASCII aka US-ASCII). > > It may be politically incorrect to say that ascii characters are still > more interoperable than non-ascii characters, but in practice this is > still the case. Especially in XML which specifies that a charset > specified in the mime headers takes precedence it is hard to give (say) a > utf8 file to someone to serve from their website without first finding > out what http server they use, and how to make sure it won't serve the > thing as latin 1 resulting in a non-well formed document. > (See current discussion on W3C'S TAG list about this). > > One style of producing XML files that avoids these problems is to > produce files that don't have an xml declaration (or have one that > specifies utf-8) but to encode all non-ascii characters as numeric > character references. > > Currently in an XSLT 1 usage in production I use > <xsl:output encoding="US-ASCII"/> > with saxon and post process with sed to remove the US-ASCII > encoding declaration (which stops the file being parsed on several XML > systems I have locally) I think that it would be very desirable if > > <xsl:output encoding="iso-646" omit-xml-declaration="yes"/> > > was defined to work, and produce files of the form described above. > > Failing that it would be good if it would be allowed by the > specification if the system understood that encoding. Thank you for raising this issue. The XSL and XQuery working groups discussed the issue and decided not to require processors to support the US-ASCII encoding and its aliases. The working groups decided that the appropriate way of addressing your comment would be to replace the second paragraph of Section 4.5 of the last call working draft of XSLT 2.0 and XQuery 1.0 Serialization [1], which currently reads: << The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16. >> with the following: << The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value that is not UTF-8, UTF-16 or a subset of either of those encodings. An encoding S is a subset of another encoding E if the set of codepoints that can be encoded in S is a subset of those that can be encoded in B, and the encodings of those codepoints in S is the same as the encodings of those same codepoints in encoding E. >> That resolution seems to be in accord with the last sentence of your comment. Please let us know whether you consider this resolution acceptable. Thanks, Henry [1] http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3 ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Received on Sunday, 29 February 2004 12:16:12 UTC