RE: minimal canonicalization

S/MIME dodges this by relying on MIME's requirement that the sender, who
understands the local line ending format, translate to CRLF for transport
(rfc2049).

We can't dictate a transport encoding, since we are expected to sign content
by reference.  In theory, HTTP should probably require the rfc2049 canonical
encoding for text/*, but it doesn't seem to.

Anyway, I'd be interested to hear what FTP does, but I think the simple
algorithm
1) CRLF -> LF
2) CR (alone) -> LF
works for our purposes.
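
Here is a rough sketch of what I have in mind, assuming the input is a
single in-memory buffer and the output goes to a caller-supplied octet
sink (the type and function names below are just placeholders, not
proposed API):

#include <stddef.h>

typedef void (*octet_sink)(void *ctx, unsigned char c);

void normalize_line_ends(const unsigned char *buf, size_t len,
                         octet_sink emit, void *ctx)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            emit(ctx, '\n');            /* CR or CRLF -> LF */
            if (i + 1 < len && buf[i + 1] == '\n')
                i++;                    /* swallow the LF of a CRLF pair */
        } else {
            emit(ctx, buf[i]);
        }
    }
}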

In fact, I think this is the algorithm implemented by most XML parsers, which
are required to normalize to LF endings [1].

Keep in mind that we're not modifying the content on disk (or on its way to
disk).  This is just part of the digest computation.
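
For instance (purely illustrative, and the checksum here is only a
stand-in for the real digest algorithm), the sink can simply be the
digest update routine, so the normalized octets get hashed but the
stored content is never touched.  This reuses normalize_line_ends()
from the sketch above:

#include <stdio.h>
#include <string.h>

struct digest_ctx { unsigned long sum; };   /* stand-in for a real hash */

static void digest_sink(void *ctx, unsigned char c)
{
    struct digest_ctx *d = ctx;
    d->sum = d->sum * 31 + c;               /* placeholder "digest" */
}

int main(void)
{
    const char *doc = "line one\r\nline two\rline three\n";
    struct digest_ctx d = { 0 };

    normalize_line_ends((const unsigned char *)doc, strlen(doc),
                        digest_sink, &d);
    printf("digest over normalized octets: %lu\n", d.sum);
    return 0;
}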

-Greg

[1] http://www.w3.org/TR/1998/REC-xml-19980210


-----Original Message-----
From: Peter Norman [mailto:pnorman@mediaone.net]
Sent: Wednesday, October 13, 1999 2:31 AM
To: w3c-ietf-xmldsig@w3.org
Subject: RE: minimal canonicalization


I looked at the sample minimal canonicalization code, and I think we may be
at cross purposes a bit. When we discussed this in Irvine, I thought we
were talking about the FTP/Telnet form of canonicalization, where a
platform line end is replaced by a canonical line end. This is not the
same thing as replacing CRs by LFs. With the first, if I FTP a document
from one platform to another and canonicalize it on each platform, I'll
get the same result; with the other, I won't. In cases where the sending
end-of-line convention is CR or LF and the receiving convention is CR-LF
or LF-CR, the line ends will all double. There is a reasonably
straightforward example of portable FTP-style end-of-line code in the BSD
sources. It's only a few lines of code.
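
To illustrate the idea (just a sketch in that spirit, not a copy of the
BSD code, and LOCAL_EOL stands in for whatever the platform actually
uses), the sending side rewrites its local line terminator as the
canonical CRLF before the data goes on the wire:

#include <stdio.h>

#define LOCAL_EOL '\n'          /* platform line terminator (assumed) */

void send_ascii(FILE *in, FILE *out)
{
    int c;

    while ((c = getc(in)) != EOF) {
        if (c == LOCAL_EOL) {
            putc('\r', out);    /* emit the canonical CRLF pair */
            putc('\n', out);
        } else {
            putc(c, out);
        }
    }
}

The receiver then does the inverse, mapping CRLF back to its own local
convention, so each end sees its native form.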

Peter Norman, 
Factpoint
