- From: Aleksey Sanin <aleksey@aleksey.com>
- Date: Wed, 10 Apr 2002 21:31:03 -0700
- To: merlin <merlin@baltimore.ie>
- Cc: reagle@w3.org, w3c-ietf-xmldsig@w3.org
- Message-ID: <3CB51187.9010609@aleksey.com>
Hi, Merlin! I already agreed with you that "subtrees" solution seems to work faster. You are absolutelly right that I did no optimization at all (btw, it is an interesting question "what is it better to optimize: XPath Filter or the XPath engine used in the filter?"). And in my tests excluding childs in "subtrees" case takes much more time than calculating single XPath expression (the XPath expressions were similar and excluding N childs took N times more than selecting the node). If you still have documents and expressions you've used for tests I would like to try to evaluate these expressions using the test program I wrote. It'll be interesting to compare the results of XML/XPath engine I use (LibXML2) with your results. As an example when you do not want to sign subtrees, consider the following element <it:Item xmlns:it='http://example.org/it' Id='1231' price='12.23' > <it:Name>something</it:Name> <it:Description>Something really nice and usefull</it:Description> <it:Color>Blue</it:Color> ... </it:Item It's possible that the only information we really want to sign is the item Id and price. In "subtrees" case we need to select <it:Item /> first and next exclude all its childs. In the other case, everything is done in one step (and it's clear that it will be faster). From my point of view, it's very difficult to prove one point of view or another using any examples. I have feeling that real performance will significanty depend on a few different factors we don't know about (for example, XPath engine optimization). In my case, from the implementation point of view, providing "non-subtrees" option will cost nothing (assuming that "subtrees" option is implemented). On the other hand, this can give a simple solution to the case when "subtrees only" standard requires non-intuitive, tricky, multi-steps hack. Also to prevent "poorly-performing signatures" we can make a clear distinction by using different requirement levels (for example, MUST for "subtrees" and MAY for "non-subtrees") and probably put a note about possible performance drawbacks in using "non-subtrees" filter (remember, we assumed that before using signatures John Smith will read the standard :) ). Aleksey Sanin merlin wrote: >Hi Aleksey, > >We could go with a solution supporting both modes of operation; I'd >suggest, in that case, Filter=interect and Filter=intersect-subtrees. >However, before we go there, I would like to know what is the real- >world use case that you are trying to solve. > >You seem to be suggesting that selecting just an element in isolation >is a valid use case. Consider: > > <cc:CreditCard xmlns:cc='http://example.org/cc' Type='amex'> > <cc:Number>...</cc:Number><cc:Expires>...</cc:Expires> > <cc:AuthCodeToBeFilledIn>...</cc:AuthCodeToBeFilledIn> > </cc:CreditCard> > >I can see that it might be useful to sign this credit card, less the >AuthCodeToBeFilledIn element. I cannot see the use case for signing >something like <cc:CreditCard />, which is the case that you have >repeatedly brought up. > >I don't have much freedom to play with this (hence the current time), >so you will have to forgive my arbitrary statistics, but I quickly >ran a benchmark of using an intersect followed by a subtract filter >to select an element, less certain children (e.g., the above), versus >a pure XPath node-set intersect filter where I used a single XPath >expression to achieve this same effect (using a predicate). The pair of >subtree filters was, again, significantly (2x) faster than the node-set >XPath option. Now, it would seem to me that the only real advantage >of a node-set XPath option would be to perform this type of operation. >Indeed, the very act of making it available would suggest that this is >one of its purposes; but it is not the ideal choice for doing this. If, >alternatively, the purpose of the node-set XPath is that people will be >'familiar' with it, then their familiarity will breed poorly-performing >signatures. > >So, as far as I can see (and it seems that this is confirmed by your >tests), our subtree formulation speeds up all the meaningful use cases; >and I'd imagine that you didn't even optimize your code for common subtree >operations. If you can present real use cases where subtrees don't do >the job, then I'll gladly advocate a more flexible/complex approach; >I simply have yet to encounter the problem that you are solving. > >Merlin > >r/aleksey@aleksey.com/2002.04.10/15:24:23 > >>There were no replies to my suggestion [1] about an option to process >>nodes instead of subtrees in XPath Filter 2.0 Transform so I am rasing this >>question again. I understand the performance concerns (and I do see small >>performance decrease in my tests).However, I believe this will add >>flexibility >>and provide better way to select "complicated" parts of XML document. >> >>Aleksey Sanin. >> >> >>[1] >>http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2002AprJun/0040.html >> >>Joseph Reagle wrote: >> >>>3. XPath Filter 2.0 Transform >>> >>>Thanks to this week's edits by Merlin (and discussion between Merlin, John, >>>Aleksey, and Christian) I think we've converged on a design and text that >>>we're happy with. If no one objects to the present text [2] (speak now!) >>>I'll stage it for publication as a W3C Last Call Working Draft (and first >>>"official" W3C WD) at the end of this month. >>> >> > > >----------------------------------------------------------------------------- >The information contained in this message is confidential and is intended >for the addressee(s) only. If you have received this message in error or >there are any problems please notify the originator immediately. The >unauthorised use, disclosure, copying or alteration of this message is >strictly forbidden. Baltimore Technologies plc will not be liable for >direct, special, indirect or consequential damages arising from alteration >of the contents of this message by a third party or as a result of any >virus being passed on. > >This footnote confirms that this email message has been swept for Content >Security threats, including computer viruses. >http://www.baltimore.com >
Received on Thursday, 11 April 2002 00:31:58 UTC