- From: Johnny Stenback <jst@w3c.jstenback.com>
- Date: Wed, 17 Sep 2003 15:38:58 -0700
- To: Christian Parpart <cparpart@surakware.net>
- Cc: www-dom@w3.org
Christian Parpart wrote:
> Hi,
>
> we got a serious problem on de.comp.text.xml about newline handling inside
> XSLT.
>
> <xsl:text>
> </xsl:text>
> <xsl:text> </xsl:text>
> <xsl:text> </xsl:text>
> <xsl:text> />
> <xsl:value-of select="' "/>
> <xsl:value-of select="' '"/>
> <xsl:value-of select="' '"/>
>
> These are the 7 ways how to create a newline in the XSLT result tree.
>
> Now, why I am asking right here, is, because I wanna know how the DOMParser
> (DOMBuilder) should handle theses character references inside text nodes and
> inside attribute nodes, and the newline-literal shown first.
>
> The xml recommendation tells that a newline shall be always represented as
> 0x10 literal and though be passed from the DOMBuilder to the application as
> 0x10. But will all versions above really work?

Newline normalization is always done before character entities are
expanded, and what you get when that's performed, that's what you'll see
in the DOM.

>
> Someone tested version 2, 3, and 4 with msxml, saxon, and libxml2/libxslt and
> got very different results.
>
> Is newline normalization part of character normalization and though optional
> or should it be performed *ALWAYS*?
>
> Should *ANY* newline variant be interpreted as the UNIX newline variant?
> Or is this part of the DOMSerializer to perform the normalization of newlines
> into the environment-specific newline form?
>
> The XSLT spec doesn't mention these cases above, the XML rec doesn't neither.
> So, I hope this is part of DOM3 LS to specify how to build/serialize newline
> characters ;)

Unfortunately this is beyond the scope of the LS spec, the LS spec simply
defines how to parse a document into a DOM structure and then serialize
that structure out to a sequence of bytes, in one form or another. The
fact that some information is lost in that process (due to XML 1.0
processing, and what not) is, and will remain, a fact. Newlines in some
cases (as in some of the cases above), are only part of what's lost in
this process, and this is not a problem the DOM WG is chartered to solve.

>
> Many many thanks,
> Christian Parpart.
>

-- jst
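
The ordering described in the reply above (line-end normalization first, character references expanded afterwards) can be observed with a small sketch. This is only an illustration and not part of the original exchange: it assumes Python's expat-based xml.dom.minidom as the parser, and the element and attribute names (t, lit, ref) are made up for the example. Other parsers may behave differently if they deviate from XML 1.0.

```python
from xml.dom.minidom import parseString


def text_of(xml: str) -> str:
    """Return the data of the first text node under the root element."""
    return parseString(xml).documentElement.firstChild.data


# A literal CR LF pair in the source is normalized to a single LF
# before any character reference is looked at.
literal = text_of("<t>a\r\nb</t>")

# Character references are expanded after that step, so a CR that is
# written as a reference is not normalized away.
referenced = text_of("<t>a&#13;&#10;b</t>")

# Attribute values get an extra normalization step: a literal newline
# becomes a space, while an LF written as a character reference stays LF.
doc = parseString('<t lit="a\nb" ref="a&#10;b"/>')
lit_attr = doc.documentElement.getAttribute("lit")
ref_attr = doc.documentElement.getAttribute("ref")

print(repr(literal), repr(referenced), repr(lit_attr), repr(ref_attr))
```

Under these assumptions the literal line break and the character reference end up as different DOM content, which is one way the different results reported in the question could come about.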
Received on Wednesday, 17 September 2003 18:39:29 UTC