Document signing requirements

Should be easy to determine what is signed.

The current Transform chain mode is very procedural - it is like saying: run this code, and at the end of it you get a byte stream which you can digest. In a web services use case, the server may operate in a paranoid mode - it will not want to run any sequence of transforms that it doesn't understand. It would be nice if, instead of transforms, the signature were more declarative and clearly separated selection from canonicalization. For example, it could list all the URIs, IDs, or included and excluded XPaths of the elements that are signed. Then it could list the canonicalization methods. This makes it easy for the verifier to first inspect the signature to determine what is signed and compare that against a policy. E.g. there might be a WS-SecurityPolicy with an expected list of XPaths. Only if this matches will the verifier do the canonicalization to compute the digests.
To work around this issue, some higher-level specifications put strict rules on the chain of transforms. E.g. ebXML says that there should be exactly two transforms, first XPath and second EnvelopedSig; SAML says there should be only one transform, the EnvelopedSig transform; etc. All this points to an underlying problem: it is hard to know what was signed, and only by controlling the chain of transforms can someone determine what is signed.
Some combinations of transforms make this very hard - e.g. if there is a transform in the middle of the chain which causes XML parsing (this will happen if one transform emits binary and the next one expects a nodeset), the original DOM tree is gone and it is not possible to compare elements in the new DOM tree with the ones in the old one. Also, the WS-Security STRTransform is a transform that mixes up selection and canonicalization. The DecryptTransform is another, which does a canonicalization internally.
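
As a rough illustration of the declarative approach described above, here is a minimal Python sketch of a verifier that inspects what a signature claims to cover before doing any canonicalization or digesting. The DeclaredReference and Policy classes and the matches_policy function are hypothetical names, not part of any existing specification, and XPaths are compared as plain strings for simplicity.

```python
from dataclasses import dataclass

# Algorithm identifier for Exclusive XML Canonicalization 1.0.
EXC_C14N = "http://www.w3.org/2001/10/xml-exc-c14n#"


@dataclass(frozen=True)
class DeclaredReference:
    """Hypothetical declarative description of one signed reference."""
    included_xpaths: frozenset
    excluded_xpaths: frozenset
    canonicalization: str


@dataclass(frozen=True)
class Policy:
    """Hypothetical verifier policy (assumption, for illustration only)."""
    required_xpaths: frozenset
    allowed_exclusions: frozenset
    allowed_canonicalizations: frozenset


def matches_policy(ref, policy):
    """Decide from the declaration alone whether the signature is acceptable.

    No transform is run and no digest is computed here.
    """
    # Everything the policy requires must be declared as signed.
    if not policy.required_xpaths <= ref.included_xpaths:
        return False
    # Paranoid mode: reject any exclusion the policy does not know about.
    if not ref.excluded_xpaths <= policy.allowed_exclusions:
        return False
    return ref.canonicalization in policy.allowed_canonicalizations


policy = Policy(
    required_xpaths=frozenset({"/soap:Envelope/soap:Body"}),
    allowed_exclusions=frozenset(),
    allowed_canonicalizations=frozenset({EXC_C14N}),
)
good = DeclaredReference(
    frozenset({"/soap:Envelope/soap:Body"}), frozenset(), EXC_C14N)
sneaky = DeclaredReference(
    frozenset({"/soap:Envelope/soap:Body"}),
    frozenset({"/soap:Envelope/soap:Body/ns:Amount"}),
    EXC_C14N)
print(matches_policy(good, policy), matches_policy(sneaky, policy))
```

Only when matches_policy returns True would the verifier go on to canonicalize and compute digests. The second reference is rejected because it quietly excludes part of the body from the signature.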

High performance - streamability

We should at least support a basic subset that does not require DOM. There are existing streaming XML signature implementations, but they make some assumptions. It would be better to formalize all of these assumptions and requirements at the spec level, rather than leave them up to each implementation.
I want to distinguish between streamability and one-pass processing. In my mind two-pass is also streamable, so we don't really have to go out of our way to prohibit forward references. Also, we can assume that the entire Signature element (assuming it is a detached or enveloped signature) will be loaded into a Java/C++ object, so the order of the elements inside the Signature element does not affect streamability.
Verification in particular cannot be one-pass - let us say you have a signed 1GB incoming message, which you need to verify first and then upload to a database. So you have to make two passes over this data - a first pass to verify and a second pass to upload to the DB. You cannot combine these two into one pass because the verification result is determined only after reading the last byte.
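
The two-pass pattern described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: it digests the raw bytes of the message instead of a canonicalized XML stream, it does not check the signature value, and the names verify_then_upload, open_stream and upload_chunk are hypothetical.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # read in fixed-size chunks, never the whole message


def stream_digest(stream, chunk_size=CHUNK_SIZE):
    """Pass 1 helper: digest a byte stream without holding it in memory."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def verify_then_upload(open_stream, expected_digest, upload_chunk,
                       chunk_size=CHUNK_SIZE):
    """Verify in a first pass, and upload in a second pass only on success.

    open_stream is a callable returning a fresh readable stream each time,
    so the data can be read twice. Both passes are streaming.
    """
    with open_stream() as first:
        if stream_digest(first, chunk_size) != expected_digest:
            return False  # nothing has been uploaded
    with open_stream() as second:
        for chunk in iter(lambda: second.read(chunk_size), b""):
            upload_chunk(chunk)
    return True


payload = b"<doc>" + b"x" * 200_000 + b"</doc>"
uploaded = []
ok = verify_then_upload(lambda: io.BytesIO(payload),
                        hashlib.sha256(payload).hexdigest(),
                        uploaded.append)
print(ok, sum(len(c) for c in uploaded) == len(payload))
```

The upload cannot start during the first pass, because the digest comparison is only possible once the last chunk has been read.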
The main impediment to streamability is the transform chain, because many of the transforms are defined on nodesets, and a nodeset requires a DOM. The XPath transform is the biggest culprit: there are many XPath expressions which cannot be streamed. We need to decide on a streamable subset of XPath.
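
To make the idea of a streamable subset concrete, here is a small Python sketch that accepts only one deliberately narrow, hypothetical subset: absolute paths made of child steps with plain element names, with no predicates, no "//", no functions and no reverse axes. This subset is an assumption chosen for illustration; which subset to standardize is exactly the open question. The name is_streamable is hypothetical as well.

```python
import re

# An element name, optionally prefixed (a simplified, ASCII-oriented pattern).
_NAME = r"[A-Za-z_][\w.\-]*"
_STEP = rf"/(?:{_NAME}:)?{_NAME}"
_SIMPLE_PATH = re.compile(rf"(?:{_STEP})+")


def is_streamable(xpath):
    """Return True if xpath falls in the hypothetical child-steps-only subset.

    Such a path can be matched on each start-element event by comparing it
    with the stack of currently open element names, so no DOM is needed.
    """
    return _SIMPLE_PATH.fullmatch(xpath.strip()) is not None


for expr in ["/soap:Envelope/soap:Body",
             "//ns:Order",
             "/a/b[last()]",
             "/a/../b"]:
    print(expr, is_streamable(expr))
```

A real implementation would also have to resolve prefixes to namespace URIs rather than compare prefix strings.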
Filtering of XPath namespace nodes is an esoteric thing - in the early interops there used to be a Y4 test vector which had tests around this. Many implementations do not support this, or support it only conditionally. This feature slows down implementations dramatically, and also makes denial of service attacks much easier.

Better canonicalization

Canonicalization problems cause too many signature breakages.
Whitespace in element content being significant surprises a lot of people - maybe we could have a flag - no mixed content.
Line breaks and white space in base64 encoded content cause problems too.
Prefix names being significant is yet another source of issues.
One idea is to use multiple digest values for one reference, one for each kind of canonicalization. E.g. canonicalizing with all spaces removed and all prefixes removed gives digest value YYY, but canonicalizing the original way gives ZZZ. Now verifiers can give a more graded answer rather than a simple yes or no.
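
A minimal Python sketch of the multiple-digest idea follows. It uses the canonicalize function from Python's xml.etree.ElementTree (Python 3.8 or later), which implements C14N 2.0 rather than the original Canonical XML, so it only stands in for "the original way". Its strip_text and rewrite_prefixes options approximate "all spaces removed" and "all prefixes removed". The function names and the three result labels are hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


def digest_pair(xml_text):
    """Return (strict, relaxed) SHA-256 digests for one reference."""
    strict = canonicalize(xml_text)
    # Relaxed form: surrounding whitespace in text is stripped and
    # prefixes are rewritten to generated ones, so neither is significant.
    relaxed = canonicalize(xml_text, strip_text=True, rewrite_prefixes=True)
    return (hashlib.sha256(strict.encode("utf-8")).hexdigest(),
            hashlib.sha256(relaxed.encode("utf-8")).hexdigest())


def graded_verify(xml_text, signed_strict, signed_relaxed):
    """Give a graded answer instead of a simple yes or no."""
    strict, relaxed = digest_pair(xml_text)
    if strict == signed_strict:
        return "exact"
    if relaxed == signed_relaxed:
        return "relaxed"  # differs only in whitespace and/or prefixes
    return "mismatch"


original = '<a:doc xmlns:a="urn:x"><a:item>1</a:item></a:doc>'
reformatted = '<b:doc xmlns:b="urn:x">\n  <b:item>1</b:item>\n</b:doc>'
tampered = '<a:doc xmlns:a="urn:x"><a:item>2</a:item></a:doc>'

signed = digest_pair(original)
for received in (original, reformatted, tampered):
    print(graded_verify(received, *signed))
```

A verifier policy could then decide whether a "relaxed" result is acceptable for a given kind of message.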

Pratik

Frederick Hirsch wrote:

Pratik

Would it be possible for you to provide the XML Security WG proposed requirements text based on your thinking on transforms and the constraints associated with them?

If you were able to do this before the F2F that would be very helpful, though I can understand if you cannot.

Thanks

regards, Frederick

Frederick Hirsch
Nokia