Re: [XSLT20] line ends in unparsed-text()

Following up on an earlier comment (and Michael Kay's reply):

http://lists.w3.org/Archives/Public/public-qt-comments/2004Aug/0086.html

I see in the new XSLT draft that unparsed-text() doesn't normalize line
ends. I think normalizing would be better but I can live with this.

A new example has been added to clarify that line ends are not
normalized, which is good (if they are not to be normalized). However, it
says...

  Note that the unparsed-text function does not normalize line
  endings. This example has therefore been written to recognize both
  Unix and Windows conventions for end-of-line, namely a single newline
  (#x0A) character or a carriage return / line feed pair (#x0D #x0A).

This explicitly leaves out Mac users (a lone #x0D as the line end), and
more importantly I think the example ought to show the equivalent of the
XML 1.0 line-end normalization (and describe it as such, rather than by
reference to specific operating systems).
 
XML 1.0 (3ed)
http://www.w3.org/TR/REC-xml/#sec-line-ends
says

  To simplify the tasks of applications, the XML processor MUST behave as
  if it normalized all line breaks in external parsed entities (including
  the document entity) on input, before parsing, by translating both the
  two-character sequence #xD #xA and any #xD that is not followed by #xA
  to a single #xA character.
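
For anyone who wants to see that rule spelled out, here is a minimal
sketch in Python rather than XSLT. The helper name normalize_line_ends
is hypothetical (it is not part of any XML or XSLT API), and the sketch
only covers the XML 1.0 rule quoted above, not the XML 1.1 additions.

```python
import re


def normalize_line_ends(text: str) -> str:
    """Hypothetical helper mimicking XML 1.0 line-end normalization."""
    # The two-character sequence CR LF is tried first, so it becomes a
    # single LF; any CR not followed by LF is caught by the second
    # alternative. A bare LF is left untouched.
    return re.sub(r"\r\n|\r", "\n", text)


sample = "one\r\ntwo\rthree\nfour"
print(normalize_line_ends(sample).split("\n"))
```

After this step, splitting on #xA alone is enough to get the lines,
whichever convention the input file used.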


So I think that

<xsl:for-each select="tokenize(unparsed-text($in), '\r?\n|\r')">
 ...
</xsl:for-each>

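would do the right thing, given tokenize's "first option wins" rule: a
#xD #xA pair is matched as one separator by \r?\n, and only a #xD that
is not followed by #xA falls through to \r.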
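
To illustrate why the order of the alternatives matters, here is a rough
sketch in Python rather than XSLT. It assumes that Python's re module
picks the first matching alternative in the same way for this
particular pattern; the two regex flavours differ in other respects.
The function name split_lines is hypothetical.

```python
import re


def split_lines(text: str, pattern: str = r"\r?\n|\r") -> list:
    """Hypothetical helper: split text on the given line-end pattern."""
    return re.split(pattern, text)


sample = "one\r\ntwo\rthree\nfour"

# CR LF, lone CR and lone LF each count as exactly one separator.
print(split_lines(sample))

# With the alternatives reversed, the CR of a CR LF pair is consumed on
# its own and the LF becomes a second separator, leaving an empty token.
print(split_lines("a\r\nb", r"\r|\r?\n"))
```

The second call is only there to show the failure mode; the pattern
suggested above avoids it.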

Of course XML 1.1 throws NEL and LS into the mix as well, but this is
just an example and I think reference to XML 1.0 is sufficient (and
XML 1.1 should never have changed the white space rules :-)

David


Received on Wednesday, 24 November 2004 12:42:32 UTC