Re: End-of-Line Handling clarification

Michael Siegel scripsit:

> It is unclear how the "application - XML processor" relationship can
> be applied to the client server model, where a server and client are
> only communicating with well-formed XML documents.  In this model, it
> is possible that the recipient of an XML document is interpreted as
> the "application" and the sender acts as the "XML processor".  Another
> interpretation is for the server and the client to be both application
> and XML processor.

In the XML Recommendation, "XML processor" means "XML parser".  So in
your situation, both server and client have an XML processor (parser)
which examines the incoming stream of bytes and provides it to the rest
of the server or client, as the case may be, through some API such as SAX,
DOM, StAX, or whatever.
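
A minimal sketch of that split, assuming Python's standard xml.sax
module as the XML processor.  The Application class and the receive()
function are hypothetical names invented for illustration; they stand in
for "the rest of the server or client".

```python
import xml.sax


class Application(xml.sax.ContentHandler):
    """Stands in for "the application": it never touches raw bytes,
    it only sees the events the parser reports through the SAX API."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # The parser may deliver text in several chunks.
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


def receive(message: bytes):
    """Feed an incoming byte stream to the XML processor (the parser)
    and return the events the application was given."""
    app = Application()
    xml.sax.parseString(message, app)
    return app.events
```

Both ends of an HTTP exchange would do the same thing: whichever side
receives a document runs it through its own parser first.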

> Our scenario is this:  A server has responded to a client application
> with an XML message (over HTTP) that contains sequences of '\r\n'
> characters intended to signify an end-of-line.  Should this XML
> document be considered well-formed even though it contains '\r\n'
> characters?

Yes, absolutely.  Well-formed XML documents may use any of \r, \n, or
\r\n as line terminators; XML 1.1 additionally treats U+0085 and U+2028
as line terminators.

> More specifically, does the client application need to pre-process
> the XML document and convert all '\r\n' characters to '\n'?

No, not at all.  The Recommendation means that the parser's API must
report all line terminators uniformly as \n.  This isolates the rest of the
application ("the application", in XML standardese) from the essentially
uninteresting differences in line terminators.
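
A small sketch of what that looks like from the application's side,
assuming Python's xml.etree.ElementTree (which sits on top of the expat
parser).  The body_text() helper and the sample message are made up for
illustration.

```python
import xml.etree.ElementTree as ET


def body_text(message: bytes) -> str:
    """Parse the message and return the root element's text as the
    parser reports it to the application."""
    return ET.fromstring(message).text


# The bytes on the wire mix \r\n, a bare \r, and a bare \n.
wire = b"<msg>line one\r\nline two\rline three\nline four</msg>"
text = body_text(wire)

# By the time the application sees the text, every line terminator
# has already been reported as a single \n.
lines = text.split("\n")
```

No pre-processing of the byte stream was needed; the conversion happens
inside the parser.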

-- 
Some people open all the Windows;       John Cowan
wise wives welcome the spring           cowan@ccil.org
by moving the Unix.                     http://www.ap.org
  --ad for Unix Book Units (U.K.)       http://www.ccil.org/~cowan
        (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)

Received on Tuesday, 4 April 2006 12:12:27 UTC