- From: Frederick Hirsch <frederick.hirsch@nokia.com>
- Date: Wed, 3 Sep 2008 17:13:45 -0400
- To: "ext Sean Mullan" <Sean.Mullan@Sun.COM>
- Cc: Frederick Hirsch <frederick.hirsch@nokia.com>, Pratik Datta <pratik.datta@oracle.com>, public-xmlsec@w3.org
+1 might be less confusing, possibly simpler.

regards, Frederick

Frederick Hirsch
Nokia
(not as chair)

On Sep 3, 2008, at 4:59 PM, ext Sean Mullan wrote:

>
> Hi Pratik,
>
> Nice writeup, thanks.
>
> One question I would pose is why do we necessarily have to use
> XPath and try to work around it with all sorts of restrictions? Why
> not just come up with something completely different that does what
> we want, and only what we want, for example, the NodeSelection
> Transform.
>
> From an implementor's point of view, I think that there won't be
> any/many available XPath libraries that I will be able to use that
> support streaming, or if there are, they will not be a good fit for
> what I need, as you mention below with the ones that you have
> studied. That concerns me and I think would affect deployment for
> other implementations as well. We have to try to make it easier to
> implement XML Signature. So I would ask if it is essential to use
> XPath? What benefits are we getting by sticking to XPath and not
> just coming up with a new and simpler Transform that cuts out all
> the complexity? Maybe for the XPath expression syntax (although
> personally I always found XPath expressions difficult to decipher
> without diving into the specification)? Or is it mainly to try to
> be compatible with existing implementations? And is that a
> realistic requirement?
>
> Thanks,
> Sean
>
>
> Pratik Datta wrote:
>> This proposal modifies the Transform to address the following
>> requirements:
>>
>> Requirements
>> ------------
>> 1) Check what is signed:
>> Looking at a signature, it should be possible to find out what was
>> signed. This is one of the best practices for verification. A
>> receiver must not blindly verify a signature without first
>> checking if what was supposed to have been included in the
>> signature is really signed.
>> 2) Support Streaming
>> Currently many transforms are based on nodesets, and nodesets
>> imply DOM, and DOM requires the whole document to be loaded in
>> memory, which is bad for performance.
>>
>> Change 1: Distinguish between selection and canonicalization
>> transforms
>> ---------------------------------------------------------------------
>> To support the "check what is signed" requirement, we need to
>> distinguish between transforms that select data to be signed, and
>> transforms that convert that data to bytes.
>>
>> Selection Transforms: XPath Filter, XPath Filter 2.0, Enveloped
>> Signature, Decrypt Transform
>> Canonicalization Transforms: C14n, exc-C14N, base64
>>
>> XSLT transform can be used for anything, e.g. there could be an
>> XSLT transform to remove white spaces, then this particular XSLT
>> transform would fall in the canonicalization bucket.
>> The WS-Security STR Transform does both Selection and
>> Canonicalization. WS Security SWA attachment transforms do
>> selection.
>>
>> Change 2: Limit transformation sequence to selection first,
>> canonicalization second
>> ---------------------------------------------------------------------
>> Currently there is no limitation on the ordering of transforms, so
>> somebody could create a signature with
>> c14n, xpath
>> According to the processing rules, this means that the reference URI
>> is resolved and canonicalized into an octet stream, which is then
>> reparsed into XML, and then xpath is applied to select the
>> nodes; after that another implicit c14n is performed to convert it
>> into an octet stream.
>> This is completely meaningless, and besides XML parsing is an
>> expensive operation.
>> So we would like to define strict rules on the sequence of
>> transforms:
>> * There can be no transforms after c14n (or after WS Security
>>   STRTransform which includes c14n transform)
>> * No transforms after base64 because it produces an octet stream,
>>   which is to be directly digested
>> * Other transforms that emit an octet stream (like the WS Security
>>   SWA Attachment transforms) should also be the last one
>> * XSLT also produces an octet stream, but that needs to be dealt
>>   with differently because it is not canonicalized and cannot be
>>   digested directly - actually I would vote for removing the XSLT
>>   transform completely, because first of all it is insecure - very
>>   easy to have DoS attacks, secondly it is completely unstreamable
>>   (unless we have a very restricted XSLT), thirdly it loses the
>>   original nodeset so makes it impossible to determine what was
>>   really signed.
>> * XPath Filter or XPath Filter 2.0 should be the first transform,
>>   and there should be only one XPath transform.
>> * There can be only one enveloped signature transform
>> * Only one Decrypt transform
>> * Base64 transform should only take a single text node or an
>>   element with a single text node child as input. (This restriction
>>   is to eliminate dependency on the XPath text() function, which is
>>   not streamable as it needs to select any number of text nodes and
>>   concatenate them)
>> These rules eliminate XML parsing during transform processing, and
>> also make it possible to determine what is signed.
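
The ordering rules above can be checked mechanically before any transform is run. Below is a minimal sketch in Python, not part of the proposal itself. The short transform names and the function check_transform_order are hypothetical; a real signature identifies transforms by Algorithm URI. XSLT is left out because the proposal suggests removing it, so the sketch reports it as unknown.

```python
from collections import Counter

# Hypothetical short names for the transforms discussed above; real
# signatures identify transforms by Algorithm URI.
SELECTION = {"xpath", "xpath2", "enveloped", "decrypt"}
OCTET_EMITTING = {"c14n", "exc-c14n", "base64", "str-transform", "swa-attachment"}


def check_transform_order(transforms):
    """Return a list of rule violations for a transform sequence (empty if valid)."""
    problems = []
    counts = Counter(transforms)

    # Anything outside the two buckets (including XSLT) is reported.
    for name in transforms:
        if name not in SELECTION and name not in OCTET_EMITTING:
            problems.append(f"unknown transform: {name}")

    # A transform that emits an octet stream must be the last one.
    for name in transforms[:-1]:
        if name in OCTET_EMITTING:
            problems.append(f"{name} must be the last transform")

    # At most one XPath transform (either filter version), and it comes first.
    xpath_positions = [i for i, n in enumerate(transforms) if n in ("xpath", "xpath2")]
    if len(xpath_positions) > 1:
        problems.append("only one XPath transform is allowed")
    if xpath_positions and xpath_positions[0] != 0:
        problems.append("the XPath transform must be first")

    # Enveloped signature and Decrypt may each appear only once.
    for name in ("enveloped", "decrypt"):
        if counts[name] > 1:
            problems.append(f"only one {name} transform is allowed")

    return problems
```

With this sketch, the "c14n, xpath" sequence from the example above is rejected on two counts, while a sequence such as xpath, enveloped, exc-c14n passes.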
>>
>> Change 3: Use simple XPaths in XPath Transform
>> ----------------------------------------------
>> XPath poses a lot of problems - first of all it is insecure - DoS
>> attacks are possible, secondly XPath inherently requires a DOM,
>> there is only a limited set of XPath that can be streamed,
>> thirdly XPath makes it very hard to know what is signed, fourthly
>> XPath Filter 1.0 expressions are inside out and very difficult to
>> write and understand (although this is fixed in XPath Filter 2.0)
>> XPaths can also be specified in an XPointer URI, but since
>> XPointers were marked OPTIONAL, while the XPath Transform was marked
>> RECOMMENDED, XPointers have never really been used. I propose that
>> we just drop/deprecate them.
>>
>> To solve these XPath problems, I propose a new mechanism to
>> specify the XPath transform, which is essentially a restricted
>> form of the XPath Filter 2.0. It has
>> * an included XPath - identifies subtrees that need to be signed
>>   (optional - a URI can be used instead of this)
>> * an excluded XPath - (optional) identifies subtrees or
>>   attributes that need to be excluded
>> The included XPath is similar to the "intersect" and the excluded
>> XPath is similar to the "subtract" of the XPath Filter 2.0.
>>
>> Restrictions
>> * As mentioned above, if XPath is used, it should be the first
>>   transform (there can be only one XPath transform in the transform
>>   list),
>> * If included is used, the reference URI should be "", i.e. refer
>>   to the complete document
>> * The XPath expression itself is very restricted as mentioned below
>> * Unlike XPath Filter 2.0, there is only one included XPath and one
>>   excluded XPath, and the excluded overrides included.
>>
>> I am open to the syntax, as long as we can have this included and
>> excluded XPaths.
>> One idea is to preserve backwards compatibility,
>> and just add two attributes "included" and "excluded" to the
>> existing XPath transform, like this:
>> <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
>>   <XPath included="..." excluded="...">
>>     ...
>>   </XPath>
>> </Transform>
>> So an older implementation will execute the XPath Filter 1.0
>> Transform, whereas a newer implementation will just process the
>> included and excluded XPaths.
>>
>> This proposal also makes it easy to determine what is signed.
>> There is only one XPath transform, and this XPath has only one
>> included XPath, so it is easy to do static analysis of the
>> signature to determine what elements were signed.
>>
>> Streaming XPath
>> ---------------
>> There are many streaming XPath implementations, and they impose
>> different kinds of constraints on the XPath.
>> I looked at the XSQ implementation which Thomas had pointed out,
>> http://www.cs.umd.edu/projects/xsq/
>> and some others
>> http://www.idealliance.org/papers/xml2001/papers/pdf/05-01-01.pdf
>> http://cs.nyu.edu/~deepak/publications/icde.pdf
>> http://www.stanford.edu/class/cs276b/handouts/presentations/joshislezberg.ppt
>> http://www.idealliance.org/proceedings/xml04/papers/299/RandomAccessXML.pdf
>>
>> They have varying constraints; some common ones are
>> * Only forward axes - like child, descendant, following-sibling
>>   (reverse axes are very difficult)
>> * a lot of limitations on predicates
>> ** no location paths in predicates
>> ** no nested predicates
>> ** functions on nodesets are not allowed, e.g. count(), last() etc
>> ** no conversion of subtrees to strings, e.g. the text() functions
>> Even with these restrictions, the implementations are very complex
>> and require state engines and static optimization.
>>
>> I would like to propose a smaller subset of XPath that has
>> even fewer requirements.
>> For this imagine a streaming XML Parser
>> that is walking through the XML tree, and at any point it has in
>> memory
>> * the current element,
>> * all the attributes of the current element,
>> * all ancestor elements
>> We assume that this parser maintains namespace definitions and
>> also does xml:base combinations as it walks down the tree.
>> Note: text nodes can be extremely long (especially for long base64
>> encoded strings, e.g. MTOM attachments), so it is possible that a
>> text node is split up, and not loaded all in memory.
>>
>> With this model, we impose the following restrictions
>> * Only elements can be selected. (I.e. the location path must
>>   resolve to one or more elements, not attributes or text nodes)
>> * Only descendant and child axes can be used
>> * predicates can only have relational expressions involving
>>   attributes. The predicate can only be at the last location step,
>>   and it cannot use any functions.
>> So only simple expressions like this are allowed
>> /soap:Envelope/soap:Header[@actor = "ac"]
>>
>> These restrictions are such that the XPath expression can be
>> evaluated with only the element, its attributes and its ancestor
>> elements. So as a streaming parser is walking down the document,
>> it can evaluate the included and excluded XPath expressions for
>> every node, and determine whether a node is to be included or not.
>>
>> Reference Processing
>> ====================
>> These proposed changes allow the signature to be statically
>> analyzed without running through the transforms. A signature
>> processing API/Library should provide a method to statically
>> analyze the reference and return what was signed. After that the
>> caller of this library can determine if it wants to go ahead with
>> signature verification.
>>
>> Streaming verification
>> ----------------------
>> These changes also allow signatures to be processed in a streaming
>> manner.
>> Let us assume that we have already done an initial pass
>> over the document to get the signature, keys, tokens etc. (In the
>> WSSecurity use case, all of these are present in the SOAP header,
>> so this first pass is just going over a small fraction of the
>> document, not the entire document).
>>
>> Now we set up a "canonicalization and digesting engine" for each
>> reference. This engine expects streaming XML events, and
>> canonicalizes and digests them to maintain a running digest. Then
>> we do one pass over the whole document, and for each node,
>> evaluate all the XPaths/URIs for each reference. If the node is
>> part of a reference we pass that event to the corresponding
>> canonicalization and digesting engine.
>>
>> After this pass, we retrieve the digests from each engine, and
>> check if the digests match.
>>
>> Summary
>> -------
>> The proposal puts a lot of restrictions on the Transforms, to
>> make it possible to check what was signed, and to perform
>> signing/verification operations in a stream.
>>
>> Pratik
>
>
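
The single-pass model described in the quoted proposal (current element, its attributes, and its ancestors in memory; included and excluded paths evaluated per element) can be sketched as follows. This is only an illustration under stated assumptions, not Pratik's design. The names SimplePath and digest_selected are hypothetical. Paths are given as pre-split element names rather than parsed XPath syntax, only the child axis is handled, and hashing the element names stands in for a real canonicalization and digesting engine. A full implementation would run one such engine per reference.

```python
import hashlib
import io
import xml.etree.ElementTree as ET


class SimplePath:
    """Absolute child-axis path with an optional attribute test on the last step."""

    def __init__(self, steps, attr=None):
        self.steps = list(steps)  # element names, root first
        self.attr = attr          # (attribute name, value) or None

    def matches(self, stack):
        """stack: list of (tag, attributes) from the root to the current element."""
        if [tag for tag, _ in stack] != self.steps:
            return False
        if self.attr is None:
            return True
        name, value = self.attr
        return stack[-1][1].get(name) == value


def digest_selected(xml_bytes, included, excluded=None):
    """One pass over the document; digest elements in the included subtree,
    skipping any excluded subtree. Returns (selected tags, hex digest)."""
    digest = hashlib.sha256()
    stack = []        # (tag, attributes) of the current element and its ancestors
    inc_depth = None  # depth at which the included subtree was entered
    exc_depth = None  # depth at which the excluded subtree was entered
    selected = []
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            stack.append((elem.tag, dict(elem.attrib)))
            depth = len(stack)
            if inc_depth is None and included.matches(stack):
                inc_depth = depth
            if exc_depth is None and excluded is not None and excluded.matches(stack):
                exc_depth = depth
            # Excluded overrides included.
            if inc_depth is not None and exc_depth is None:
                selected.append(elem.tag)
                # Placeholder for a real canonicalization engine.
                digest.update(elem.tag.encode("utf-8"))
        else:
            depth = len(stack)
            if exc_depth == depth:
                exc_depth = None
            if inc_depth == depth:
                inc_depth = None
            stack.pop()
            elem.clear()  # drop the element's content once it has been processed
    return selected, digest.hexdigest()
```

For example, with included = SimplePath(["Envelope", "Header"], attr=("actor", "ac")) and excluded = SimplePath(["Envelope", "Header", "Sig"]), only the Header element and its non-Sig descendants reach the digest. Namespaced documents would need the names in ElementTree's "{uri}local" form.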
Received on Wednesday, 3 September 2008 21:14:44 UTC