Re: NEL

Rick Jelliffe wrote:


> XML's rules
> are aimed at trying to be consonant with HTTP 1.1, which says clearly that
> the MIME rule for text/* is CRLF, but that HTTP allows relaxing of this. XML
> supports HTTP's relaxing, and so allows a multiplicity of mappings.


Yes.  The line-end function can be encoded by a CR character, a LF
character, or a CR character followed by a LF character.  Furthermore,
the representation of CR and LF need not be octets 0x0D and 0x0A
respectively (for example, in UTF-16(xx) encodings, they are not).
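
To make the point about octets concrete, here is a minimal Python
sketch that prints how CR and LF are serialized under a few
encodings.  The helper name line_end_octets is made up for this
illustration; it is not part of any spec or library.

```python
# Illustration only: show the octets that represent CR and LF
# in several encodings.  `line_end_octets` is a hypothetical helper.
def line_end_octets(encoding):
    """Return the hex octets for CR and LF in the given encoding."""
    chars = (("CR", "\r"), ("LF", "\n"))
    return {name: ch.encode(encoding).hex() for name, ch in chars}

for enc in ("utf-8", "utf-16-be", "utf-16-le"):
    print(enc, line_end_octets(enc))
```

In the two UTF-16 forms each character takes two octets, so a
processor that scans for the single octets 0x0D and 0x0A before
decoding would misread the line ends.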


> What seems quite clear from that passage is that, due to requirements
> inherited from
> HTTP, the responsibility for mapping from non-CRLF line breaks to
> CRLF line breaks (as required by  )
> is the responsibility of the sending system.  Not the receiving XML
> processor.


Not so.  HTTP will cheerfully transport bare CRs or LFs, and it
is the responsibility of XML processors to do the mapping on input
to the XML canonical form, which is bare LF.  See 2.11 of the
XML Rec.
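
As a rough sketch of what 2.11 asks of a processor, assuming the
input has already been decoded to characters: each CR LF pair and
each CR not followed by LF becomes a single LF.  The function name
normalize_line_ends is hypothetical.

```python
# Sketch of XML 1.0 section 2.11 line-end handling on decoded text.
# `normalize_line_ends` is a made-up name for illustration.
def normalize_line_ends(text):
    """Map CR LF pairs and lone CRs to a single LF."""
    # Replace the two-character sequence first so that the CR of a
    # CR LF pair is not turned into an extra LF by the second step.
    return text.replace("\r\n", "\n").replace("\r", "\n")

print(repr(normalize_line_ends("a\r\nb\rc\nd")))
```

The order of the two replacements matters: doing the lone-CR step
first would turn every CR LF into two line ends.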

>    MIME requires that an Internet mail entity


Note the word "mail"; we are now in the original scope of MIME,
which is bodies of mail messages.  When you mail XML around,
you must use CR+LF as the line end indicator.

>    The canonical form of any MIME "text" subtype MUST always represent a
>    line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
>    MIME "text" MUST represent a line break.  Use of CR and LF outside of
>    line break sequences is also forbidden.


This rule again applies only to mail bodies.


> So the IBM character MUST NOT be used as a replacement for CRLF,
> as a line break. If it is serving as a replacement it MUST be mapped at
> the server end.


Only if the server is using mail, rather than HTTP, to transmit the XML.

> I thought the proposal was to allow NEL as a distinct character from CRLF
> to also act as whitespace. This is different from replacing it with LF.


There is no proposal to change the definition of S (production rule 3,
section 2.3).  The proposal is to change 2.11 so that NELs and perhaps
LSes (U+2028) are mapped to #xA characters in advance of all other
processing.
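
A minimal sketch of how such a revised 2.11 might behave, on
decoded text.  The function name, the include_ls switch (standing
in for the "perhaps"), and the treatment of a CR followed by NEL as
two separate line ends are all assumptions of this illustration,
not part of the proposal as stated here.

```python
import re

NEL = "\u0085"  # NEXT LINE
LS = "\u2028"   # LINE SEPARATOR

# Hypothetical helper: the existing 2.11 mapping (CR LF and lone CR
# to LF), extended so that NEL, and optionally LS, also become LF.
def normalize_line_ends_proposed(text, include_ls=True):
    """Map CR LF, CR, NEL and (optionally) LS to a single LF."""
    extra = NEL + (LS if include_ls else "")
    # CR LF is listed first so the pair is consumed as one line end.
    pattern = re.compile("\r\n|[\r" + extra + "]")
    return pattern.sub("\n", text)

print(repr(normalize_line_ends_proposed("a\u0085b\u2028c\r\nd\re")))
```

Because the mapping runs before all other processing, the rest of
the processor, including the S production, only ever sees #xA.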

-- 
There is / one art             || John Cowan <jcowan@reutershealth.com>
no more / no less              || http://www.reutershealth.com
to do / all things             || http://www.ccil.org/~cowan
with art- / lessness           \\ -- Piet Hein

Received on Monday, 9 July 2001 15:51:31 UTC