RE: Re-posting of "what is minimal canonicalization" notes

From an implementor's point of view, I see a difference between rules
convenient for parsed XML (such as the signature itself and any embedded
objects) and rules for referenced external objects (which may be of any
type -- and may not be well-formed even if they are XML).

For the signature and any embedded objects, which even a minimal application
must parse in order to verify the signature, it seems most convenient to use
the same normalizations required of XML processors:

* [1] 2.11 says to normalize line endings to 0xA (sketched in code below).

* [1] 2.10 says to "pass all characters in a document that are not markup".
I don't think that includes non-significant whitespace inside start/end
tags, but I'm not fluent enough in SGML lingo to be sure.

This would allow an implementation to easily work with parsed data.
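As a rough illustration, here is what the 2.11 line-ending normalization
might look like in Python (the function name is mine, not from the spec):

    def normalize_line_endings(data: bytes) -> bytes:
        # XML 1.0 section 2.11: translate CRLF and bare CR to LF (0xA)
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")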

I would also throw in normalization to UTF-8, and removal of the encoding
pseudo-attribute, since UTF-8 makes it much easier to pass strings around
in existing code that expects ASCII (but I realize that is a biased
viewpoint). A sketch of both steps follows.
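Something like this, assuming the declared encoding has already been
sniffed from the document (the helper name and the regular expression are
mine, not from any spec):

    import re

    def to_utf8(data: bytes, declared_encoding: str) -> bytes:
        # Decode with the declared encoding, then re-encode as UTF-8.
        text = data.decode(declared_encoding)
        # Drop the encoding pseudo-attribute from the XML declaration,
        # if one is present.
        text = re.sub(r'(<\?xml[^?]*?)\s+encoding\s*=\s*(["\']).*?\2',
                      r'\1', text, count=1)
        return text.encode("utf-8")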

For external objects, even if they are XML, it seems most convenient to use
something like the S/MIME rules:

* Normalize line endings for text/* content to 0xA and treat everything else
as binary data (see the sketch below).

This avoids the need to parse any external XML content that is being signed,
unless some transformation is specified.
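A sketch of that dispatch, reusing the normalization function above (the
content type would come from whatever transport metadata is available):

    def canonicalize_external(data: bytes, content_type: str) -> bytes:
        # S/MIME-style rule: only textual content gets its line endings
        # normalized; everything else is signed byte-for-byte.
        if content_type.lower().startswith("text/"):
            return normalize_line_endings(data)
        return data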

-Greg

[1] http://www.w3.org/TR/1998/REC-xml-19980210

Received on Thursday, 7 October 1999 17:55:17 UTC