One of the assumptions/requirements was
9. Signing can be performed on arbitrary node sets.
Canonicalization of arbitrary nodesets introduces a lot of
complications. I would like to step back and see if we really require
it. The main requirement that I see is that we need to sign a fragment
of an XML document, and a nodeset lets us define an arbitrary fragment.
But nodesets have the following problems. I would like to see if we can
have an alternative way to identify what was signed without using a
nodeset, or maybe use a very restrictive nodeset.
Problem 1) Nodesets introduce unwanted complexity with namespaces
Nodesets follow the XPath data model, which is slightly different from
the DOM model. One main area of difference is namespace nodes. In DOM,
namespaces are just regular attributes, but in the XPath model they are
a special kind of node. Also, the namespace nodes need to be expanded
out for every element.
e.g. if the original document is like this
<e1 ns1="n1" ns2="n2">
<e2>
<e3/>
</e2>
</e1>
In the XPath model all namespaces are expanded out for every element,
i.e. it becomes like this
<e1 ns1="n1" ns2="n2">
<e2 ns1="n1" ns2="n2">
<e3 ns1="n1" ns2="n2"/>
</e2>
</e1>
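
To make the expansion concrete, here is a minimal Python sketch that
counts namespace declarations the DOM way and namespace nodes the XPath
way for the document above. It assumes that ns1="n1" in the example is
shorthand for a real xmlns:ns1="n1" declaration, and it ignores the
implicit "xml" namespace node that XPath also attaches to every
element. The helper names (declared, in_scope) are made up for this
illustration.

```python
from xml.dom import minidom

# Assumption: the example's ns1="n1" stands for xmlns:ns1="n1".
DOC = """<e1 xmlns:ns1="n1" xmlns:ns2="n2">
  <e2>
    <e3/>
  </e2>
</e1>"""

def declared(elem):
    """Namespace declarations written on this element (DOM view)."""
    out = {}
    attrs = elem.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        if attr.name.startswith("xmlns:"):
            out[attr.name[len("xmlns:"):]] = attr.value
    return out

def in_scope(elem, inherited=None):
    """Yield (element name, in-scope namespaces) pairs (XPath view)."""
    scope = dict(inherited or {})
    scope.update(declared(elem))
    yield elem.tagName, scope
    for child in elem.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            yield from in_scope(child, scope)

root = minidom.parseString(DOC).documentElement
all_elements = [root] + list(root.getElementsByTagName("*"))

# DOM: only the declarations that were actually written.
dom_count = sum(len(declared(e)) for e in all_elements)

# XPath: every element carries a copy of every in-scope namespace.
expanded = list(in_scope(root))
xpath_count = sum(len(scope) for _, scope in expanded)

print(dom_count, xpath_count)
```

For this three-element document the DOM view has 2 declarations while
the XPath view has 6 namespace nodes, one set per element.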
An XPath filter can remove certain namespace nodes, e.g. it can remove
the ns1 node from e2
<e1 ns1="n1" ns2="n2">
<e2 ns2="n2">
<e3 ns1="n1" ns2="n2"/>
</e2>
</e1>
This is very unnatural in XML 1.0 (it could be considered similar to
namespace undeclaration in XML 1.1). In this particular case ns1 is not
used, so its removal will affect inclusive c14n, but not exclusive
c14n. However, a nodeset can also remove namespace nodes that are being
used, which really makes it invalid XML. The canonicalization
algorithms need to worry about this kind of namespace removal, even
though it is completely meaningless.
Problem 2) Namespace nodes degrade performance significantly
Because namespace nodes are expanded for every element, the number of
nodes that the implementation has to deal with increases very
significantly. Let's say there are 10 namespace nodes defined at the
top level, which is a pretty reasonable number for SOAP messages. Then
the number of namespace nodes is 10 x number of elements. If each
namespace node is a Java object, that is a lot of objects and a lot of
unnecessary temporary memory. I know that some implementations avoid
namespace node expansion because of this performance issue.
This nodeset expansion is the basis for one of the denial of service
attacks in the best practices document. In that example I made 100
namespace nodes and 100 elements, which means there are about 10,000
nodes. Then I wrote an XPath expression which counts all the nodes.
Since this XPath is executed for every node, the number of iterations
is 10,000 x 10,000 = 100 million. If the namespace nodes were not
expanded for every node, then there would be only 200 x 200 = 40,000
iterations.
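
Working the numbers through in a few lines of Python (a sketch only;
the function names are made up here, and the model simply assumes the
XPath expression visits every node once for every node). Note that the
exact expanded count includes the 100 elements themselves, so it comes
out slightly above the round 10,000 figure.

```python
def node_count(elements, namespaces, expanded):
    """Total nodes the XPath engine has to consider."""
    if expanded:
        # Each element carries its own copy of every namespace node.
        return elements + elements * namespaces
    # Namespace declarations are counted once, where they are written.
    return elements + namespaces

def iterations(nodes):
    """A count-all-nodes expression evaluated once per node."""
    return nodes * nodes

with_expansion = node_count(100, 100, expanded=True)
without_expansion = node_count(100, 100, expanded=False)

print(with_expansion, iterations(with_expansion))
print(without_expansion, iterations(without_expansion))
```

With expansion this gives 10,100 nodes and roughly 100 million
iterations; without expansion it gives 200 nodes and 40,000 iterations.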
Problem 3) Nodesets make it hard to understand what is signed
In a WS-Security use case, the verifier has a list of things that it
expects to be signed (as defined in a WS-SecurityPolicy), and wants to
make sure that they are really signed. While a nodeset is the most
generic form of representing an XML fragment, it is very hard to
reverse engineer. Most often the requirement is to sign a complete
subtree. A more complex use case excludes some descendant subtrees.
So instead of representing an XML fragment by a nodeset, I would like
it to be represented like this
* List of included elements
* List of excluded elements (optional)
Exclusions override inclusions. (This is somewhat similar to the XPath
Filter 2 Transform, except that it is much simpler.)
This would make it easy to understand what was signed: I could just
compare the included elements with the expected list of included
elements.
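
A rough Python sketch of the include/exclude idea, to show how cheap
the verifier's check becomes. Everything here is an assumption made for
illustration: elements are identified by a hypothetical Id attribute,
an included or excluded element stands for its whole subtree, and the
function names are invented.

```python
import xml.etree.ElementTree as ET

DOC = """<Envelope Id="env">
  <Header Id="hdr"><Security Id="sec"/></Header>
  <Body Id="body"><Order Id="order"/><Note Id="note"/></Body>
</Envelope>"""

def signed_tags(root, included, excluded=frozenset()):
    """Tags of all elements covered by the include/exclude lists."""
    signed = []

    def walk(elem, covered):
        ident = elem.get("Id")
        if ident in excluded:
            return  # exclusions override inclusions for the whole subtree
        covered = covered or ident in included
        if covered:
            signed.append(elem.tag)
        for child in elem:
            walk(child, covered)

    walk(root, False)
    return signed

def covers_expected(included, expected):
    """Verifier check: is everything the policy expects in the list?"""
    return set(expected) <= set(included)

root = ET.fromstring(DOC)
covered = signed_tags(root, included={"body"}, excluded={"note"})
ok = covers_expected(included={"body"}, expected={"body"})
print(covered, ok)
```

The point of the sketch is that covers_expected only compares two short
lists of identifiers; it never has to walk the descendants of the
signed subtree.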
The best practices document talks about a node-by-node comparison that
can be done to determine what is signed, but that is very expensive,
since you have to visit all descendant nodes of a subtree to make this
comparison.
Problem 4) Nodesets imply DOM
A nodeset is not complete information; it always needs a backing DOM.
This makes it very hard for streaming XML implementations
(SAX/StAX/XMLReader), which are much more performant.
In my streaming presentation (http://www.w3.org/2007/xmlsec/ws/slides/12-mishra-oracle/),
I talked about using an alternative representation of "XML
Events".
E.g. the document <e1><e2/></e1>
is two nodes in a nodeset representation - e1, e2
but is 4 events in a streaming representation - begin(e1), begin(e2),
end(e2), end(e1)
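
The event view can be reproduced with any SAX-style parser. Below is a
small sketch using Python's standard xml.sax module; the EventRecorder
class and the begin(...)/end(...) labels are just illustrative names
mirroring the notation above, not part of any proposal.

```python
import xml.sax

class EventRecorder(xml.sax.ContentHandler):
    """Records element begin/end events in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"begin({name})")

    def endElement(self, name):
        self.events.append(f"end({name})")

def xml_events(text):
    """Parse text with SAX and return the list of recorded events."""
    handler = EventRecorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events

events = xml_events("<e1><e2/></e1>")
print(events)
```

No tree is built here: the handler only ever sees the current event,
which is what makes this representation a natural fit for streaming
implementations.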
Summing up, nodesets are complex and slow, and full support of nodesets
is not a requirement. We can still use nodesets, but put some
constraints around them to solve the above problems.
Pratik