Re: Initial draft note for Transform processing from Pratik Datta on 2008-12-05 (public-xmlsec@w3.org from December 2008)

From: Pratik Datta <pratik.datta@oracle.com>
Date: Fri, 05 Dec 2008 15:10:42 -0800
To: Frederick Hirsch <frederick.hirsch@nokia.com>
CC: XMLSec WG Public List <public-xmlsec@w3.org>
Message-ID: <4939B4F2.8040705@oracle.com>
I want to make some small changes to the requirements section, but it is
in the design section where I have a completely different thing in mind.

Why should we have transforms at all? The word "Transform" indicates a
processing step, and I would like XML signatures to be more
"declarative" and leave the processing to the implementation. This is
what I imagine each reference to be:

<Reference>
  <WhatIsSigned>
  </WhatIsSigned>

  <HowItIsCanonicalized>
  </HowItIsCanonicalized>
</Reference>

(Note: I am not really proposing this syntax - I know that if v2.0 is
completely different from v1.1 people won't adopt it. But for now let's
ignore that problem, and assume that we are starting from scratch.)

As the name implies, WhatIsSigned just indicates what is being signed. 
 
<WhatIsSigned
  type="xml"
  URI=" ..."
  includedXPath=".."
  excludedXPath="..."
  reincludedXPath="..."
  envelopedSignature="true/false"/>


Note
* This syntax is equivalent to an XPath transform and an 
EnvelopedSignature transform. The key difference is that there is no 
ordering of the transforms. The Signature is not giving a sequence of 
steps to the SignatureProcessor and asking it to perform them, instead 
it is specifying the intent of the signature.  This makes it easier for 
a Policy processor to determine what is signed. With the transform
approach - suppose there are three transforms t1, t2, and t3 - one has
to actually execute the transforms to determine what was signed, but in
this approach it is readily apparent.

* reincludedXPath comes from the discussion with John Boyer of IBM. In 
that email chain with him, we concluded that all we want is first an 
includedXPath to select nodes, then an excludedXPath to take away some 
nodes from the original selection, and finally a reincludedXPath to put
back some nodes that were taken away.

* This mechanism takes away a lot of variability which makes the 
signatures more secure and robust. For example envelopedSignature is now 
just a true/false attribute, so you cannot have two enveloped signature 
transforms (which to me is completely meaningless). Also you cannot do
tricky things like have XPath -> EnvelopedSig -> XPath, where the second
XPath brings back the enveloped signature, which was removed by the
EnvelopedSig transform. Nor can you go from xml -> binary -> xml.

* The "type" attribute: In a transform chain you do not know if you are 
signing xml or if you are signing binary or something else, unless you 
run through the transforms. For example if you have an external URI 
reference and no transforms, then the data is interpreted as binary. But 
if you have an external URI reference followed by an XPath transform,
then the data is interpreted as XML. This can be very confusing to a
policy processor. Instead it is better if a type attribute clearly
specifies what is being done. For binary data we can have two different
types: type="binaryFromURI" to mean directly fetch binary from an
external URI, and type="binaryFromBase64Nodes" to mean use URI and XPath
to identify text nodes, then base64-decode them to get the binary.
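
As a rough illustration of the included/excluded/reincluded selection
described in the notes above, here is a minimal Python sketch. It is not
proposed syntax or a processing rule: the function name
select_signed_nodes and the sample document are made up for
illustration, it works on whole elements rather than full XPath
node-sets, and it relies on the limited XPath subset supported by
xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def select_signed_nodes(root, included, excluded=None, reincluded=None):
    """Return the selected elements in document order (hypothetical helper)."""
    def matches(path):
        # ElementTree supports only a subset of XPath; enough for a sketch.
        return set(root.findall(path)) if path else set()

    selected = matches(included)
    # Exclusion can only take away nodes that were originally selected.
    taken_away = matches(excluded) & selected
    selected -= taken_away
    # Re-inclusion can only put back nodes that were taken away.
    selected |= matches(reincluded) & taken_away
    return [el for el in root.iter() if el in selected]


doc = ET.fromstring(
    '<order><item id="1"/><item id="2" secret="yes"/>'
    '<item id="3" secret="yes"/></order>'
)
signed = select_signed_nodes(
    doc,
    included=".//item",
    excluded=".//item[@secret='yes']",
    reincluded=".//item[@id='3']",
)
print([el.get("id") for el in signed])  # ['1', '3']
```

Because the three expressions are combined with fixed set operations,
the result does not depend on any ordering chosen by the signer.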

<WhatIsSigned
  type="binaryFromURI"
  URI="..."
  byteRange="0-20,220-270,320-"
/>

<WhatIsSigned
  type="binaryFromBase64Nodes"
  URI="..."
  includedXPath=".."
  excludedXPath="..."
  reincludedXPath="..."

  byteRange="0-20,220-270,320-"
/>

Note I have incorporated Chris Solc's byte range transform as one of the 
attributes.
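
To make the byteRange attribute concrete, here is a small Python sketch
of how a value such as "0-20,220-270,320-" might be interpreted. The
semantics are an assumption on my part, borrowed from HTTP Range
headers: offsets are zero-based and inclusive, and a range with no end
runs to the end of the data. The function names are hypothetical.

```python
def parse_byte_range(spec):
    """Parse "0-20,220-270,320-" into (start, end) pairs; end may be None."""
    ranges = []
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        ranges.append((int(start), int(end) if end else None))
    return ranges


def apply_byte_range(data, spec):
    """Concatenate the selected slices of data, in the order listed."""
    out = bytearray()
    for start, end in parse_byte_range(spec):
        # Assumed semantics: the end offset is inclusive; None means "to the end".
        out += data[start:] if end is None else data[start:end + 1]
    return bytes(out)


print(parse_byte_range("0-20,220-270,320-"))
print(apply_byte_range(b"abcdefghij", "0-1,4-5,8-"))  # b'abefij'
```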


The "type" attribute will be the cornerstone of extensibility. Instead
of adding new transforms, higher level specs would add new types, and
then define new attributes/subelements that are valid for that type.
For example, the WS-Security SWA profile defines these transforms -
AttachmentContentOnly and AttachmentComplete. In the new syntax this
would be represented as a type:

<WhatIsSigned
  type="binarySoapAttachmentContentOnly"
  URI="cid:.."
/>

Similarly, a Widget framework could define a type called "widget" and
define new attributes for it.

"dbRecords" could be yet another type (Konrad's example).
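
The extensibility idea above could look roughly like the following
Python sketch: a registry maps each type to the attributes that are
valid for it, and a policy processor can check a WhatIsSigned
declaration without fetching or executing anything. The registry, the
function names, and the choice to model only attributes (not
subelements) are assumptions for illustration; the type and attribute
names are the ones used in this mail.

```python
# Attributes allowed per type. The entries mirror the examples in this
# mail; the registry itself is a hypothetical illustration.
COMMON_XPATH = {"includedXPath", "excludedXPath", "reincludedXPath"}
TYPE_REGISTRY = {
    "xml": {"URI", "envelopedSignature"} | COMMON_XPATH,
    "binaryFromURI": {"URI", "byteRange"},
    "binaryFromBase64Nodes": {"URI", "byteRange"} | COMMON_XPATH,
}


def register_type(name, allowed_attributes):
    """Let a higher-level spec add a new type instead of a new transform."""
    if name in TYPE_REGISTRY:
        raise ValueError(f"type {name!r} is already defined")
    TYPE_REGISTRY[name] = set(allowed_attributes)


def describe_what_is_signed(attrs):
    """Check a WhatIsSigned attribute dict; nothing is fetched or executed."""
    declared = attrs.get("type")
    if declared not in TYPE_REGISTRY:
        raise ValueError(f"unknown type {declared!r}")
    unexpected = set(attrs) - {"type"} - TYPE_REGISTRY[declared]
    if unexpected:
        raise ValueError(
            f"attributes not valid for {declared!r}: {sorted(unexpected)}"
        )
    return declared


# A higher-level spec (here, the SOAP attachment example) adds its type.
register_type("binarySoapAttachmentContentOnly", {"URI"})
print(describe_what_is_signed(
    {"type": "binarySoapAttachmentContentOnly", "URI": "cid:.."}
))
```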


The XSLT transform has potential security problems. But I see Konrad's
point - if there is a set of well-known XSLT files which the signer and
the verifier both know about, then it is perfectly OK to use them. But
in that case maybe it could be represented as a custom attribute. Let us
say there is a well-known XSLT that transforms the XML to a displayable
format - one could define an attribute convertToDisplayable=true, which
would indicate that the verifier has to run this particular XSLT
transform. However, the policy processor does not need to run it.

One can argue that if you change the order of operations and verify the
SignedInfo first, then what is the harm in running the XSLT? But I think
it is still very risky. Consider this analogy - someone knocks on your
door, you open it, and there is somebody selling a vacuum cleaner. You
ask him to show his ID, and if that matches you listen to the benefits
of that vacuum cleaner, and then decide whether you want to buy it or
not. The act of checking his ID is similar to checking the SignedInfo:
it just means that the signer is who he says he is and so you are
willing to listen to him; it doesn't mean that you trust him completely.
In a Web service scenario, the signature may be just based on the user's
password, and during the message processing you will look up the type of
the user - a sys admin, an employee, a contractor, a registered external
user - and decide how much to trust him.

Only in the case where the signer and the verifier are the same, e.g.
in the case of a long-term signature, can you have absolute trust in all
the transforms.

-----------------------------------

For the <HowItIsCanonicalized> section too, I am thinking of a similar
declarative approach:

<HowItIsCanonicalized
   inclusive="yes"
   ignoreComments="yes"
   noMixedContent="true"/>
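
To show how declarative canonicalization options might drive an
implementation, here is a minimal Python sketch. It uses
xml.etree.ElementTree.canonicalize (Python 3.8+, which implements C14N
2.0) purely as a stand-in, and it models only the ignoreComments
attribute; "inclusive" and "noMixedContent" are not handled here. The
function name and the dict representation of the attributes are
assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def canonicalize_as_declared(xml_text, how):
    """Canonicalize xml_text according to a dict of declared attributes."""
    # Only ignoreComments is modeled; "inclusive" and "noMixedContent"
    # have no counterpart in this stand-in and are ignored.
    ignore_comments = how.get("ignoreComments", "yes") == "yes"
    return ET.canonicalize(xml_text, with_comments=not ignore_comments)


sample = "<a><!-- remark --><b/></a>"
without_comments = canonicalize_as_declared(sample, {"ignoreComments": "yes"})
with_comments = canonicalize_as_declared(sample, {"ignoreComments": "no"})
print(without_comments)
print(with_comments)
```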

-------------------------------------------------

Pratik

Frederick Hirsch wrote:
>
> I created an initial draft for a note outlining requirements and 
> design for XML Signature transform processing simplification [1].
> I also incorporated some material from the mail list, including 
> material from Pratik and John Boyer, to help get us started.
>
> Please review and help fill in the gaps by proposing material to the 
> mail list.
>
> We may wish to consider an additional document focused on 
> Canonicalization v.next.
>
> This should complete ACTION-93.
>
> Thanks
>
> regards, Frederick
>
> Frederick Hirsch
> Nokia
>
> [1] http://www.w3.org/2008/xmlsec/Drafts/transform-note/Overview.html
>
>
Received on Friday, 5 December 2008 23:11:25 UTC