Re: About whitespace in ds:SignedInfo from Donald E. Eastlake 3rd on 2002-01-07 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
Date: Mon, 07 Jan 2002 09:35:06 -0500
To: Steve Mathews <SMathews@conclusive.com>
cc: w3c-ietf-xmldsig@w3.org
Message-Id: <200201071435.JAA0000088555@torque.pothole.com>
Hi,

If you want to sign a character string so that it can't change, you
are welcome to do so. The question is, what is the something you sign?
XML isn't a character string. It is a very complex structure and many
aspects of its external character string representation are, by
definition, so insignificant that conformant XML paresers are not
allowed to communicate then to an application.  If you want to create
binary signatures on binary messages sent through binary channels,
where some of the data just happens to be XML, you can secure all the
bits. If you want to create XML signatures on XML sent through XML
channels, a signature securing all the bits is uselessly
brittle. That's why XML digital signatures, which by definition are in
XML syntax and are handled as XML, need various forms of XML
canonicalization.

So the "thing" you sign for XML has to be a canonicalization of the
XML or it isn't a useful XML signature.

Higher level trust and business aspect mechanisms are out of scope for
the XMLDSIG working group except possibly as a input to requirements
for the base cryptographic facility XMLDSIG provides. If you believe
that it needs to provide a means of "signing every bit" with no
canonicalization, your needs are met because it can be used that way.

Donald

From:  Steve Mathews <SMathews@conclusive.com>
Message-ID:  <5246684B3A2FD511BE2E00D0B7A9280E1A7601@mailgb01.gb.conclusive.com>
To:  "'Donald E. Eastlake 3rd'" <dee3@torque.pothole.com>
Date:  Mon, 7 Jan 2002 13:50:43 -0000 

>At the risk of going around the circuit (yet again?) of what the problem(s)
>is(are), the summary would look something like this:
>
>Once you apply a digital signature to something, that something cannot be
>allowed to change by so much as one bit if the signature is still to be
>meaningful.  Thus any and every possible way in which it might be changed as
>a result of 'automatic' or 'system embedded' means must be prevented.  If
>not it MUST be assumed that the content has been hacked and it MUST be
>regarded as flawed.
>
>Where legal contract is involved, if the information is to be used for
>contract, then the form and format of the whole of what makes up the
>contract must be bound.  It might be possible to argue that some data and
>some form, when combined in some way, were the basis of contract.  But if
>they are not completely bound together under a digital signature that can
>prove that neither form, format or data can have altered, then you are
>likely somewhere between a rock and a hard place.  The fact that a standard
>says, "Hey, you can do it this way," does not mean it will be sanctified by
>law.  The problems come with enforcement.
>
>This latter is a major gap between the need to 'sign' some data to protect
>it in transit or at rest, which are typical IT problems, and the need to
>sign something because it forms a contractual agreement.
>
>Thus, if all you are doing is some improvement to internal controls - you
>are the whole arbiter of what is right and what is wrong - you set up the
>architecture in which this is to work: you can do just what you want.
>
>If what you are doing involves other organizations you can also do whatever
>you jointly and severally agree to.
>
>If you are doing something else then you need to think profoundly about the
>business aspects of what you are doing, and how to make the technology fit
>into that.  It may require you to specify a small subset of a standard, or
>specify a very specific interpretation and implementation to achieve you
>business purpose.  Programming staffs may have to get comfortable with
>restrictions on what they can do and how they can do it.
>
>Kind regards
>
>Steve Mathews
>Conclusive Logic
>
>-----Original Message-----
>From: Donald E. Eastlake 3rd [mailto:dee3@torque.pothole.com]
>Sent: 07 January 2002 05:04
>To: Cherian, Sanjay
>Cc: 'w3c-ietf-xmldsig@w3.org'; 'galvin@eListX.com';
>'david@drummondgroup.com'; Damodaran, Suresh
>Subject: Re: About whitespace in ds:SignedInfo 
>
>
>Hi,
>
>There are several questions here.
>
>On Transforming SignedInfo:
>
>We deliberately made a general transform chain unavailable on
>SignedInfo, at least among the mandatory or recommended to implement
>algorithms, because it is a geneal Artificial Intelligence problem to
>figure out if such transforms are safe. Note that they could
>completely screw around with all your References, Signature Algorithm,
>and themsevles. So they can trivially arrange for the post-transform
>version signature to always succeed or always fail and to look
>generally secure. So, for any application allowing Transforms over
>SignedIfno to be secure, you probably need a complete description of
>all allowed sets of Transforms...
>
>Anyway, to discourage this, we permitted only CanonnicalizationMethod.
>But, of course, it is an algorithm so it can actually do
>anything. (IE, you could define a CanonicalizationMethod that took a
>Transforms as an explicit parameter.)
>
>On WhiteSpace:
>
>As I see it, if intermediaries are going to generally reformat things
>to be pretty, then it is kind of hopeless to try to be secure.
>
>I believe there are actually three types of white space. White space
>inside tags is truly insignificant. It isn't even handed to
>applications. Then there is white space in the content of something
>with element only content that appears between content elements or
>before the first or after the last content element. This is defined by
>the XML spec as "insignificant", although only a validating parser can
>tell that.  Nevertheless, such white space is required to be given to
>the application.  Since the application can do anything with it, that
>it is flagged as "insignificant" has no security meaning. And
>if you have a non-valiating parser or if the element has mixed
>content, then it doesn't even have this meaningless "insignificant"
>flag affixed to it. Finally, there is actual content white space,
>which can also be changed if some intermediate node pretty printed
><a>foo bar</a> as
><a>
>    foo
>    bar
></a>
>for example.
>
>Your suggestion:
>
>If you guys want to define a canonicalization function that deletes
>all pure white space text nodes and make support of it required for
>your application, you can do so.  But I'm not sure of your logic in
>doing so but not dropping leading and trailing white space and/or
>changing internal runs of white space to one space or something else
>canonical. And I don't see how this can be safe and secure. You may
>know the schema for SignedInfo but it is harder to know the schema for
>every parameter of every Transform someone might use.
>
>Donald
>
>From:  "Cherian, Sanjay" <Sanjay_Cherian@stercomm.com>
>Message-ID:
><40AC2C8FB855D411AE0200D0B7458B2B04D636BD@scidalmsg01.csg.stercomm.com>
>Date:  Thu, 20 Dec 2001 10:32:54 -0600
>
>>Hi,
>>
>>This could likely be a problem that has been debated a lot in the past but
>I
>>would appreciate if someone explained
>>(or pointed to a previous explanation on the mailing list or elsewhere)
>what
>>the current thinking is regarding this.
>>
>>The ebXML Messaging Services Specification (
>>http://www.ebxml.org/specs/ebMS.pdf ) uses XMLDSIG to sign portions
>>of the ebXML message (an XML structure based on SOAP) and the ds:Signature
>>element is embedded into the signed
>>ebXML message.
>>
>>The signed ebXML message is subject to modification, passing through a
>>sequence of intermediary SOAP and ebXML
>>processors before reaching the ultimate recipient, which also validates the
>>signature. One such modification that is difficult
>>to predict is change in whitespace (indentation, CRLF characters) in the
>>ebXML message. The ds:SignedInfo element
>>is itself subject to such modification (being part of the ebXML message).
>>
>>A proposed modification to the ebXML Messaging Specification suggests this
>>XSL transform in the ds:Reference element:
>>
>>    <Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116">
>>      <xsl:stylesheet version="1.0"
>>xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>        <xsl:strip-space elements='*'/>           <!-- Strip trivial
>>whitespace in all elements. -->
>>        <xsl:template match='node()|@*'>       <!-- Identity transform. -->
>>          <xsl:copy>
>>            <xsl:apply-templates select='@*'/>
>>            <xsl:apply-templates/>
>>          </xsl:copy>
>>        </xsl:template>
>>      </xsl:stylesheet>
>>    </Transform>
>>
>>the signature has been made robust to changes in trivial whitespace in the
>>ebXML message but outside the embedded
>>ds:SignedInfo elements. However, because the ds:SignedInfo elements are not
>>processed by any XSL transform (but by
>>the algorithm specified in ds:CanonicalizationMethod, for which we are
>using
>>http://www.w3.org/TR/2001/REC-xml-c14n-20010315 ), changes in trivial
>>whitespace in ds:SignedInfo invalidates the
>>signature.
>>
>>To give some context on how ebXML Messaging uses XMLDSIG, I have included
>an
>>sample signature element (taken from
>>an ebXML example in the ebXML specification), later in this email.
>> 
>>The first guess would be that by using a 'smarter' canonicalization
>>algorithm for the ds:SignedInfo element, the ds:SignedInfo
>>element can also be made robust to changes in trivial whitespace.  Rather
>>than use the schema-unaware canonicalization
>>algorithm ( http://www.w3.org/TR/2001/REC-xml-c14n-20010315 ), it would
>seem
>>that a different canonicalization algorithm
>>could be written which, being aware of the XMLDSIG schema, could do a
>better
>>job of eliminating trivial whitespace.
>>
>>This 'smarter' CanonicalizationMethod algorithm might certainly not be
>smart
>>enough to always know whether a particular
>>whitespace character is significant or not. However, if such an algorithm
>>was created that simply removed all whitespace where
>>an element has textual content consisting entirely of whitespace (the kind
>>xsl:strip-space would remove), people that don't have
>>very sophisticated Transform elements (that could get mangled) would be
>able
>>to benefit from it.
>>
>>In other words, a 'smarter' XMLDSIG schema-aware CanonicalizationMethod
>>algorithm could be published with the caveat that
>>changes in whitespace in descendants of ds:SignedInfo would become
>>irrelevant.  In situations where this might be a problem,
>>the earlier schema-unaware canonicalization algorithm could be used
>instead.
>>
>>Just as an example, the stylesheet presented earlier, which is a parameter
>>to the ds:Transform element does not contain any
>>whitespace that needs to be preserved and so a CanonicalizationMethod
>>algorithm that removed all trivial whitespace would not
>>hurt us but actually benefit us a lot.
>>
>>Here is a typical Signature element, as it is used by the ebXML Messaging
>>Specification.
>>
>>		<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
>>			<SignedInfo>
>>				<CanonicalizationMethod
>>Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
>>				<SignatureMethod
>>Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
>>				<Reference URI="">
>>					<Transforms>
>>						<Transform
>>Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
>>						<Transform
>>Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
>>							<XPath>
>>not(ancestor-or-self::()[@SOAP:actor=
>>	
>>&quot;urn:oasis:names:tc:ebxml-msg:actor:nextMSH&quot;]
>>	
>>| ancestor-or-self::()[@SOAP:actor=
>>	
>>&quot;http://schemas.xmlsoap.org/soap/actor/next&quot;])
>>							</XPath>
>>						</Transform>
>>						<Transform
>>Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
>>					</Transforms>
>>					<DigestMethod
>>Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
>>					<DigestValue>...</DigestValue>
>>				</Reference>
>>				<Reference URI="cid://blahblahblah/">
>>					<DigestMethod
>>Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
>>					<DigestValue>...</DigestValue>
>>				</Reference>
>>			</SignedInfo>
>>			<SignatureValue>...</SignatureValue>
>>			<KeyInfo>...</KeyInfo>
>>		</Signature>
>>
>>I was just trying to convey my thoughts on this matter.  Thanks for taking
>>the time to read this email.
>>
>>Regards,
>>
>>Sanjay J. Cherian
>>Sterling Commerce
>>Irving, TX
>>
Received on Monday, 7 January 2002 09:38:00 UTC