- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Fri, 7 Dec 2007 16:52:19 -0500
- To: <public-xml-core-wg@w3.org>
If a serialized XML document contains:

  <!--This is a comment &mdash; pbg-->

or

  <?myproc pseudoatt="this is part of a pi &mdash; pbg"?>

then when that is read by an XML processor, is the &mdash; considered to be a seven character string or the Unicode em-dash character?

More precisely, in the infoset of such a document, when considering the comment or PI's [content] info item, would the length of the "string representing the content" be calculated with the "&mdash;" part contributing 1 or 7 to the length?

Put another way, if the following XSLT template matched the above comment, should the xsl:if test succeed or fail:

  <xsl:template match="comment()">
    <xsl:if test="string(.)='This is a comment — pbg'">
      <!-- The above line's em-dash is the single U-2014 character -->
    </xsl:if>
  </xsl:template>

paul
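One way to probe the question empirically is to parse such a document and measure what the parser reports. The following is a minimal sketch, not a definitive answer for every processor. It assumes the seven-character string is the named reference `&mdash;` (a numeric reference such as `&#8212;` should behave the same way), and the wrapper element `doc` and the helper `comment_and_pi_content` are hypothetical names introduced only for illustration. It uses Python's standard `xml.dom.minidom`. Since the XML 1.0 grammar has no reference production inside comments or processing instructions, a conforming parser is expected to hand back the seven characters as written.

```python
from xml.dom import minidom

# Hypothetical test document: the comment and PI from the question,
# wrapped in an illustrative <doc> element. No DTD declares "mdash".
SOURCE = (
    '<doc>'
    '<!--This is a comment &mdash; pbg-->'
    '<?myproc pseudoatt="this is part of a pi &mdash; pbg"?>'
    '</doc>'
)


def comment_and_pi_content(xml_text):
    """Return (comment text, PI data) as reported by the parser."""
    doc = minidom.parseString(xml_text)
    comment = pi = None
    for node in doc.documentElement.childNodes:
        if node.nodeType == node.COMMENT_NODE:
            comment = node.data
        elif node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            pi = node.data
    return comment, pi


comment, pi = comment_and_pi_content(SOURCE)

# Compare against both candidate readings of the comment content.
as_written = 'This is a comment &mdash; pbg'       # reference left alone
as_expanded = 'This is a comment \u2014 pbg'       # single U+2014 character

print(len(comment), repr(comment))
print('matches text as written:', comment == as_written)
print('matches expanded em-dash:', comment == as_expanded)
print(len(pi), repr(pi))
```

If the parser leaves the reference untouched, the comparison against `as_expanded` is false, which corresponds to the xsl:if test failing.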
Received on Friday, 7 December 2007 21:53:18 UTC