- From: Aleksey Sanin <aleksey@aleksey.com>
- Date: Tue, 19 Mar 2002 16:05:59 -0800
- To: merlin <merlin@baltimore.ie>
- CC: Christian Geuer-Pollmann <geuer-pollmann@nue.et-inf.uni-siegen.de>, John Boyer <JBoyer@PureEdge.com>, reagle@w3.org, w3c-ietf-xmldsig@w3.org
- Message-ID: <3C97D267.20804@aleksey.com>
My 2 cents: I do like the Merlin's proposal with intersect/subtract types of XPath. The current specification is not clear and when I did implementation I made this mistake and realised that I did it only after reading the long thread a week ago. I also have to say that in my particular case (probably it's not only my case) the result of XPath expression evaluation is actually a *set* of nodes and doing set operations is a natural and fast. Aleksey. merlin wrote: >Hi Christian, > >I don't believe that your proposal does anything (matching the >requirements of this particular document) that tree-based set >intersect/subtract does not do, it is no faster (it is basically >the same thing for the required operations), and it introduces new >and (imho) confusing terminology (given that we're working with node >"sets.") Admittedly, it allows operations such as exclude an element in >the middle of a tree, but include its children ("H" in your example), >but I don't think that these features are compelling. People can use >the old XPath if they need an operation as esoteric as that. > >I would side with John that the operations we need are including subtrees >and excluding subtrees, and that they need to be as fast as possible. I >would disagree with John that "include from whole document" and "exclude >from whole document" operations are the way to solve this. They are too >limited and are, in (my) practice, no faster than the fully general set >operational equivalents that I espouse (for the relevant cases). > >I believe that John's contention is that he cannot rely on all implementors >to perform set operations with sufficient speed. I don't think that this >is a reason to choose a technically inferior solution. In fact, I think >that if we support both set operations and "whole document" operations >then we are essentially giving implementors license to do the set >operations poorly. > >I think that we should define only the set operations, and should >clearly document the potential optimizations and that these operations >SHOULD be optimized by an implementation (e.g., performance SHOULD >be broadly linear in the size of the selected subtrees, not the input >document.) Implementors must then provide explicit justification for >NOT optimizing their code, so John will get a reasonable assurance >of speed and I'll get my set ops! > >Apologies, John, if I misrepresent you. > >Yes Joseph, I think a teleconference might be interesting! Not Friday. > >Merlin > >r/geuer-pollmann@nue.et-inf.uni-siegen.de/2002.03.20/00:23:40 > >>Hi Joseph, >>ho John, >> >>maybe a picture says more than thousand words ;-)) (I renamed the include >>to include-but-search to indicate that it has to be traversed). >> >>Given the tree in the picture. I arbitrarily selected some nodes (single >>nodes on the right side or contingous areas on the left) which are to be >>selected by this xpath transform. Black nodes are to be included in the >>output, white nodes are to be omitted. >> >>The root node A is labeled "exclude-but-search". This means that it is not >>to be outputted, but some descendants (probably) are. >> >>The black node B on the left side is labeled "include-but-search" because >>itself is to be outputted and it's descendants must be traversed to copy >>them to the result node set. >> >>The unselected child C is labeled "exclude" because itself and _all_ it's >>descendants are omitted/not outputted. So we don't need the -but-search >>suffix because we won't find any selected nodes inside - this is an >>optimization. >> >>The unselected subtree D is in an area where no nodes are selected, but if >>we label it 'exclude', the algorithm won't traverse it and we can save >>time. But without that label, it'd work, though. >> >>The selected-subtree E in the middle is "include-but-search" because itself >>and all it's descendants are to be outputted. >> >>On the right side, we have this annoying exclude-but-search, >>include-but-search, exclude-but-search alternation because here are no nice >>subtrees which are the main focus of this transform but a 'real' nodeset >>which could also be handled by our old XPath transform... >> >>We have 7 contingous areas or nodes and NEED 7 labels (XPath evaluations) >>for this, while we can optimize the speed with 1 additional label (the >>"exclude" for node D). >> >>Is my idea now a little bit clearer now? >> >> >>Best regards, >>Christian >> > > >----------------------------------------------------------------------------- >Baltimore Technologies plc will not be liable for direct, special, indirect >or consequential damages arising from alteration of the contents of this >message by a third party or as a result of any virus being passed on. > >This footnote confirms that this email message has been swept by >Baltimore MIMEsweeper for Content Security threats, including >computer viruses. > http://www.baltimore.com >
Received on Tuesday, 19 March 2002 19:06:37 UTC