- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Mon, 18 Nov 2002 21:14:15 +0100
- To: David Carlisle <davidc@nag.co.uk>, public-qt-comments@w3.org
> -----Original Message-----
> From: David Carlisle [mailto:davidc@nag.co.uk]
> Sent: 18 November 2002 12:13
> To: public-qt-comments@w3.org
> Subject: XSLT 15th Nov: Text Output Method: unencoded characters
>
> If the result tree contains a character that cannot be represented in
> the encoding that the processor is using for output, the implementation
> should signal a serialization error.
>
> This is compatible with XSLT1 but it would be useful extra
> functionality if there was an option available in xsl:output
> method="text" to output unencoded characters. The format for
> unencded characters isn't so important, and I'd be happy for
> the format to be fixed in the specification, although
> obviously one could imagine a more complex scheme that
> allowed this to be specified.
>
> obvious candidates would be
>
> &1234;
> \uabc
> U+1234
>
> possibly the latter is most "plain text like", being
> Unicode's format for references to unicode characters in plain text.
>
> In XSLT 1 I often find myself using the xml output method
> (with ascii or latin 1 encoding) even when outputting text
> files, just so that I get all characters output in a
> consistent manner. (The exact format doesn't matter as I post
> process the output with sed or perl to pick up all the non
> ascii characters and encode them as needed (as TeX commands,
> as often as not). It is tiresome in XSLT1 to detect all non
> ascii characters and output them in some non standard format.
> XSLT 2 regexp would make this a little easier but it would
> still complicate the stylesheet greatly if every template
> generating text in the result document had to run a template
> to quote every non ascii character. It's much more convenient
> to let the characters go to the result tree as characters and
> deal with the quoting required for the text format as a
> serialisation issue.
>

Please see issues 15 and 124.
We are considering a proposal to allow the serialization of individual characters to be defined. This is seen as an alternative to "sticky disable-output-escaping", on the basis that the only known use cases for sticky d-o-e are to include "non-standard" characters in the output. The thinking is to allow users to include characters from the private use area in text nodes, and then control how these are subsequently serialized. I would envisage this applying to all output methods.

Michael Kay
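
For readers following the thread, the two ideas discussed above (a fixed fallback notation such as U+1234 for characters the output encoding cannot represent, and a user-defined mapping that controls how individual characters are serialized) can be sketched outside XSLT. The Python below is only an illustration under stated assumptions: the function name `serialize_text`, the `char_map` parameter, and the choice of U+XXXX as the fallback are hypothetical and are not taken from any XSLT draft or processor.

```python
def serialize_text(text, encoding="ascii", char_map=None):
    """Serialize text to bytes without raising on unencodable characters.

    char_map (hypothetical): maps single characters, for example
    private-use-area code points, to replacement strings. The
    replacement strings must themselves be representable in `encoding`.
    """
    char_map = char_map or {}
    pieces = []
    for ch in text:
        if ch in char_map:
            # A user-defined mapping takes priority over everything else.
            pieces.append(char_map[ch])
            continue
        try:
            ch.encode(encoding)
            pieces.append(ch)
        except UnicodeEncodeError:
            # Assumed fallback: Unicode's plain-text notation U+XXXX.
            pieces.append("U+%04X" % ord(ch))
    return "".join(pieces).encode(encoding)


# A private-use character mapped to a TeX command, and an unmapped
# non-ASCII character falling back to the U+XXXX notation.
tex_map = {"\ue000": "\\alpha{}"}
print(serialize_text("x = \ue000, caf\u00e9", char_map=tex_map))
```

Doing the substitution at serialization time keeps the text-generating logic free of quoting code, which is the convenience argued for in the quoted message.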
Received on Monday, 18 November 2002 15:14:22 UTC