Re: Followup on I18N Last Call comments and disposition

     May I take it that Masahiro and Martin's concern, then, refers
primarily to the character normalization of XML markup rather than to the
values of fields?  Is it appropriate to make that explicit?  In any case,
the reference "not be done as part of a signature transform" may need to
refer to digests as well.

          Tom Gindin

"Joseph M. Reagle Jr." <> on 07/07/2000 09:33:51 PM

To:   Tom Gindin/Watson/IBM@IBMUS
cc:, "Masahiro Sekiguchi"
      <>, "Martin J. Duerst" <>,
      "John Boyer" <>, David Solo <>
Subject:  Re: Followup on I18N Last Call comments and disposition


I think your statements are exactly right, and I'm barking up the wrong
in speaking of collisions. However, I'm trying to address the point raised
by Masahiro and Martin:

   Assume that a document contains XML with element names with
   accented characters. Assume that this document is correctly
   normalized. Assume that the signature includes NFC as a transform.
   Now the following attack is possible: An intruder replaces the
   normalized document by a document with some of the element
   names unnormalized. The signature still works. However, an
   XML/DOM processor or an XPath expression may (and in practice
   will) work differently, because the unnormalized element is
   assumed to be different from the normalized one....
   and combine this with a DOM program that extracts the first
   <amount> and pays somebody that much. After the change by
   the intruder, the amount actually paid is $1000 instead of $10.


  **** E.g. in Section 8, at a convenient location (e.g. 8.1), add
        something like: Using character normalization (Normalization
        Form C of UTR #15) as a transform or as part of a transform
        can remove differences that are treated as relevant by most
        if not all XML processors. Character normalization should
        therefore be done at the origin of a document, and only
        checked, but never be done during signature processing.

Perhaps this is a problem is best addressed by the inverse of the
pre-existing rule of  "see what you sign" because:

1. I think I18N's concern is about an XML (DOM) processor operating over
pre-canonicalized XML document after the Signature processor has declared
the signature over its canonilized form as valid. (For instance finding the
first instance of some element where character normalization may change
is thought of the 'first element' but the Signature still validated.)
2. I don't think this concern is unfounded as we've (somewhat/sometimes)
expressed an expectation that processors won't operate over the
canonicalized form of the XML document. The C14N'ized XML is merely a
normalizing step prior to digesting. For instance, what if we had chosen to
design a canonicalization algorithm that did not output XML but a binary
format? Clearly, the XML processor is going to operate over the original
content and I18N's security concern is a valid one.
3. However, our expectations of C14N have changed in that we are using it
for document subsettting as will XML Query probably and the earlier
expectation was not the most secure.
4. Consequently we need to:
A. Ensure that DOM sees only what is Signed. This is our expectation with
XPath/XSLT and this should be no different. (We're getting close to
transform closure" issue where he wants to operate over the original XML
document though ensure that the transforms resulting in the final form
didn't introduce potential weakensses ((like character normalization).
B. State that the C14N transform is like any other transform and
canoniciization algorithms which yield binary results can be dangerous
because the result is not "seen".
C. Ensure that our own Signature Validator sees what was signed when it
validates the Signature. Consequently, I believe the Canonicalization of needs to happen BEFORE Reference Validation of .

Consequently, I've tweaked 3.2.1 Reference Validation

For each Reference in SignedInfo:
/+ 1. Canonicalize the SignedInfo element based on the
CanonicalizationMethod in SignedInfo. +/

AND section 8.1.3 "See" What is Signed (Do we still need the last

| Note: This new recommendation is actually a combination/inverse
| of the earlier recommendations and is still under discussion.

Just as a person or automatable mechanism should only sign what it "sees,"
persons and automated mechanisms that trust the validity of a transformed
document on the basis of a valid signature SHOULD operate over the data
was transformed (including canonicalization) and signed, not the original
pre-transformed data. Some applications might operate over the original
but SHOULD be extremely careful about potential weaknesses introduced
between the original and transformed data. This is a trust decision about
the character and meaning of the transforms that an application needs to
make with caution. Consider a canonicalization algorithm that normalizes
character case (lower to upper) or character composition ('e and accent' to
'accented-e'). An adversary could introduce changes that are normalized and
consequently inconsequential to signature validity but material to a DOM
processor. For instance, by changing the case of a character one might
influence the result of an XPath selection. A serious risk is introduced if
that change is normalized for signature validation but the processor
operates over the original data and returns a different result than

Consequently, while we RECOMMEND all documents operated upon and generated
by signature applications be in [NFC] (otherwise intermediate processors
might unintentionally break the signature) encoding normalizations SHOULD
NOT be done as part of a signature transform.


Received on Sunday, 9 July 2000 22:51:55 UTC