
[F&O] line ends in unparsed-text()

From: David Carlisle <davidc@nag.co.uk>
Date: Mon, 23 Aug 2004 15:17:13 +0100
Message-Id: <200408231417.PAA16692@penguin.nag.co.uk>
To: public-qt-comments@w3.org

unparsed-text() cannot be used to input arbitrary byte streams, as the
resulting string needs to conform to the character restrictions in the
data model, and the input is subject to character encoding, which may
change the bytes anyway. So it is principally useful for "text files"
(as its name suggests). However, one of the main distinguishing features
of text files is that their line endings are platform-dependent, and
unparsed-text() (unlike an XML parser, and so the doc() function) does
not take account of this.

This means that given a file test.txt


on Windows (to take a specific example)

<xsl:value-of select="unparsed-text('test.txt','UTF-8')"/>

will produce


(I am using @ rather than & here in case the character references get
lost in some mail reading programs or translation to HTML in the

One can avoid this by going, for example


to get rid of the ^M characters, but then if the whole thing is run on a
Mac, you'd get
so to get a reliable cross-platform result you will need to use
something like


which isn't exactly difficult but

a) it's a pain to have to do this every time
b) People developing on Unix won't notice the problem, so are liable to
   use unparsed-text() directly and will find the stylesheets producing
   strange white space errors when run on other platforms.
c) It's going to be an endless source of confused questions on user

Is there any chance that unparsed-text() could _always_ do line end
translation to #10, modelled after XML parsers, just as it always does
character encoding handling?
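
For illustration, here is a minimal sketch of the kind of wrapper a
stylesheet author has to write today, assuming an XSLT 2.0 processor.
The my: namespace, the function name my:unparsed-text-lines and the
initial template name "main" are made up for this example and are not
part of any specification. It maps #13#10 pairs and lone #13 characters
to #10, which is what an XML parser does to line ends, and uses
codepoints-to-string() so that no character references are needed.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/ns/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: read a text file, then normalize line ends
       (CR LF and lone CR both become LF; existing LF is left alone). -->
  <xsl:function name="my:unparsed-text-lines" as="xs:string">
    <xsl:param name="href" as="xs:string"/>
    <xsl:param name="encoding" as="xs:string"/>
    <xsl:sequence
        select="replace(unparsed-text($href, $encoding),
                        '\r\n?',
                        codepoints-to-string(10))"/>
  </xsl:function>

  <!-- Assumed entry point: a named initial template. -->
  <xsl:template name="main">
    <xsl:value-of select="my:unparsed-text-lines('test.txt', 'UTF-8')"/>
  </xsl:template>

</xsl:stylesheet>
```

With a wrapper like this the result should be the same whichever
platform's line-end convention the file uses, but every stylesheet
still has to carry it around, which is the point of the request above.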


Received on Monday, 23 August 2004 14:18:13 UTC
