RE: Canonicalization of <SignedInfo> for Reference Validation from Joseph M. Reagle Jr. on 2001-07-06 (w3c-ietf-xmldsig@w3.org from July to September 2001)

From: Joseph M. Reagle Jr. <reagle@w3.org>
Date: Fri, 06 Jul 2001 14:51:16 -0400
To: "Dournaee, Blake" <bdournaee@rsasecurity.com>
Cc: "'John Boyer'" <JBoyer@PureEdge.com>, w3c-ietf-xmldsig@w3.org
Message-Id: <4.3.2.7.2.20010706134538.00b93ec0@localhost>
At 23:00 7/5/2001, Dournaee, Blake wrote:
>I'm guessing that all of this was hashed out some time ago and I probably
>came a little late for this discussion. Any thoughts or pointers to previous
>discussion on this would be most helpful.

These are good questions, actually. In the past this was part of our 
discussions/debate regarding the transforms [1] in 1999 and something John 
called "Document Closure." I don't think we ever extracted a crisp and 
clear consensus from that argument, but fortunately, in the ensuing year 
of work on the spec, I think we've ended up with a clear position 
nonetheless: "When transforms are applied the signer is not 
signing the native (original) document but the resulting (transformed) 
document. (See Only What is Signed is Secure (section 8.1).)"[2]

[1] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/1999OctDec/0372.html
[2] http://www.w3.org/Signature/Drafts/xmldsig-core/Overview.html#sec-Transforms

I believe originally folks had three understandings of how transforms worked:
1. One is signing the resulting document.
2. One is signing a relationship between the source document, the 
transforms, and the resulting set of possible documents -- akin to John's 
document closure.
3. One is signing the original document even given arbitrary transforms.

Option 1 is supported by the spec. Option 2 is a different way of thinking 
of the same thing, but with some inferences an application might make (at 
its own peril) about the relationship between the source and resulting 
document: if the transforms are "safe" the application *might* make some 
inferences about their relationship -- but that's outside the scope of this 
spec and it can be dangerous. Option 3 is a common mistake, and it is wrong.

To illustrate these options via a few examples:

A. SOURCE ENCODED
<Reference URI="source.base64">...
         <Transform Algorithm="&dsig;base64">

1. One is signing the decoding of the base64 encoded source. (Correct!)
2. One is signing a class of documents that result from applying a base64 
decode to the source file. (One-to-one mapping, and the result set only 
includes one document and is similar to option 1).
3. One is signing the base64 document. (Wrong)
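
To make example A concrete, here is a minimal Python sketch of the idea, 
not of any real XML Signature library. The reference_digest helper, the 
sample bytes, and the choice of SHA-1 are my own assumptions for 
illustration; the only point is that the digest is taken over the output 
of the transform chain, not over the bytes that were dereferenced.

```python
import base64
import hashlib


def reference_digest(source_bytes, transforms):
    """Apply each transform in order, then digest the final output.

    Hypothetical helper: it models the processing order only and does
    not implement XML Signature reference processing.
    """
    data = source_bytes
    for transform in transforms:
        data = transform(data)
    return hashlib.sha1(data).hexdigest()


# Illustrative content; source_base64 stands in for "source.base64".
original = b"<doc>hello</doc>"
source_base64 = base64.b64encode(original)

signed = reference_digest(source_base64, [base64.b64decode])
digest_of_decoded = hashlib.sha1(original).hexdigest()
digest_of_encoded = hashlib.sha1(source_base64).hexdigest()

print(signed == digest_of_decoded)  # True: option 1, the decoded result
print(signed == digest_of_encoded)  # False: option 3, the base64 file
```

The digest matches the decoded content and not the base64 file as stored, 
which is what option 1 says.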


B. SOURCE ENCODED
<Reference URI="source.xml">...
         <Transform Algorithm="&dsig;base64">

1. One is signing the encoding of the xml document. (Correct!)
2. One is signing a class of documents that result from applying a base64 
encode to the xml file. (One-to-one mapping, and the result set only 
includes one document and is similar to option 1).
3. One is signing the source.xml . (Wrong, and dangerous: the 
application/user is obligated to see what it signs, and what it is seeing 
is the base64 version. Folks make this same mistake [3] with encryption, 
thinking that if one signs encrypted data, one is signing the source data, 
and they also confuse the issue by lumping in implicit signature semantics. 
The correct response is that you are only signing the plain form when you 
use a decryption transform to make it the final result![4])

[3] http://lists.w3.org/Archives/Public/xml-encryption/2001Jul/0010.html
[4] http://www.w3.org/TR/xmlenc-decrypt


C. SOURCE TRANSFORMED WITH XPATH
<Reference URI="source.xml">...
            <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
              <XPath xmlns:dsig="&dsig;">
                   -select-everything-but-user-entry-data-fields-
             </XPath>

1. One is signing the transformed xml document. (Correct! The user entry 
elements are no longer in the resulting document, so changes to them are 
obviously not signed.)
2. One is signing the resulting document and its relationship (via 
transforms) to the source document. So, if the transforms are well 
specified, standardized, and understood, then while the application signed 
(and should "see") the resulting document, it might also continue to "see" 
the original document in the context of that relationship. For instance, 
consider a multi-party workflow in which a document is canonicalized every 
time some processing and signature validation occurs. However, the 
processors don't want to pass on the canonical form because it is ugly for 
their purposes. Since the relationship between the original document and its 
resulting canonical form is well understood, they might validate the 
signature over the canonical form, but continue to process the original. 
They may wish to do the same thing with the form example above.
3. One is signing the source.xml . (Wrong, again.)
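
Example C can be sketched the same way. This is only an approximation 
under stated assumptions: the element name userEntry, the sample forms, and 
the drop_user_entries function are hypothetical, the filtering is done with 
Python's xml.etree.ElementTree rather than a real XPath transform, and 
there is no canonicalization step. It shows that content removed by the 
transform is not covered by the digest, while content that survives is.

```python
import hashlib
import xml.etree.ElementTree as ET


def drop_user_entries(xml_bytes):
    """Stand-in for the XPath transform: remove every <userEntry>.

    Hypothetical filter; a real implementation would evaluate the
    XPath expression and canonicalize the resulting node-set.
    """
    root = ET.fromstring(xml_bytes)
    # Snapshot the element list so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "userEntry":
                parent.remove(child)
    return ET.tostring(root)


def digest_after_transform(xml_bytes):
    """Digest the transformed document, not the source document."""
    return hashlib.sha1(drop_user_entries(xml_bytes)).hexdigest()


form_alice = b"<form><question>Name?</question><userEntry>Alice</userEntry></form>"
form_bob = b"<form><question>Name?</question><userEntry>Bob</userEntry></form>"
form_other = b"<form><question>Age?</question><userEntry>Alice</userEntry></form>"

# Changing only the user entry leaves the digest unchanged: not signed.
print(digest_after_transform(form_alice) == digest_after_transform(form_bob))

# Changing content that survives the transform changes the digest.
print(digest_after_transform(form_alice) == digest_after_transform(form_other))
```

The first comparison prints True and the second prints False, so a 
signature over this reference says nothing about the user entry values in 
source.xml, which is why option 3 is wrong.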

--
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Friday, 6 July 2001 14:51:25 UTC