Re: Normalize newlines when parse="text"?

From: Mike Brown <mike@skew.org>
Date: Fri, 21 Jan 2005 12:00:56 -0700 (MST)
Message-Id: <200501211900.j0LJ0usC033243@chilled.skew.org>
To: daniel@veillard.com
CC: Mike Brown <mike@skew.org>, www-xml-xinclude-comments@w3.org

Daniel Veillard wrote:
> XInclude states:
> 
>   http://www.w3.org/TR/xinclude/#text-included-items
> ------
>   Each character obtained from the transformation of the resource is
>   represented in the top-level included items as a character information
>   item with the character code set to the character code in ISO 10646
>   encoding, and the element content whitespace set to false.
> ------
> 
> Both characters, code points 0xA and 0xD, are in the range allowed by
> the Char production of the XML spec and won't raise errors.

Thanks; I saw the same sections you did, but I also saw in the Infoset spec:

-  appendix B "XML Reporting Requirements (informative)"
       item 3 "An XML processor must normalize line-ends to LF 
               before passing them to the application (2.11)."

-  appendix D "What is not in the Information Set"
       item 9 "The difference between CR, CR-LF, and LF line termination."

So it seems that the intent is for the Information Set to be constrained to 
XML's restrictions w.r.t. newlines -- an XML parser must report normalized 
newlines to the application, and the infoset is a model of what the parser 
reports to the application.

If that is the case, then an infoset 'transformation' like XInclude, while not 
explicitly requiring newline normalization, might be expected to normalize 
newlines anyway.
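
To make the question concrete, here is a minimal sketch of the line-end
normalization that XML 1.0 section 2.11 describes (CR LF pairs and lone CRs
each become a single LF), applied to a string standing in for a resource
fetched with parse="text". The function name normalize_line_ends is
hypothetical; neither XInclude nor the Infoset spec defines such an API, and
whether an XInclude processor should perform this step is exactly what is
being asked.

```python
def normalize_line_ends(text: str) -> str:
    """Translate CR LF pairs and lone CRs to LF, as in XML 1.0 section 2.11.

    Hypothetical helper for illustration; not taken from any spec or library.
    """
    # Handle CR LF first so that a pair collapses to one LF, not two.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Stand-in for the decoded content of a parse="text" resource.
included = "one\r\ntwo\rthree\nfour"

print(repr(normalize_line_ends(included)))  # 'one\ntwo\nthree\nfour'
```

Without such a step, the CR characters would appear in the result infoset as
character information items, which an XML parser reading the same text
inline would never report.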

That's why I am asking for clarification.

-Mike
Received on Friday, 21 January 2005 19:01:08 GMT