- From: Gregor Karlinger <gregor.karlinger@iaik.at>
- Date: Tue, 11 Jun 2002 10:28:35 +0200
- To: <reagle@w3.org>
- Cc: "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
- Message-ID: <038801c21121$fb498e90$51981b81@iaik.at>
Hi,
I vote for #2 for the following reasons:
1. more intuitive, eliminates the strange UNION behaviour of #1
2. makes it possible to arrange several set operations within a
single transform (more intuitive than a sequence of transforms,
less overhead)
3. easier to make optimizations such as Merlin has worked out
(thanks Merlin!)
Regards, Gregor
> -----Original Message-----
> From: w3c-ietf-xmldsig-request@w3.org
> [mailto:w3c-ietf-xmldsig-request@w3.org] On Behalf Of Joseph Reagle
> Sent: Monday, June 10, 2002 11:17 PM
> To: merlin; Christian Geuer-Pollmann; John Boyer
> Cc: w3c-ietf-xmldsig@w3.org
> Subject: POLL: Fwd: Re: XPath filter 2.0
>
>
>
>
> I'd like to move forward to a Last Call for this
> specification ASAP. Last
> week Merlin outlined some options [1], one of which being a
> new proposal
> that he gave some more details about [2] in response to Christian.
>
> So to conclude the thread I'd like to know what people prefer:
> 1. Move forward with what we have presently [3].
> 2. Specify Merlin's proposal (The timing of this will be
> dependent on properly representing the proposal (e.g., Merlin's time
> <smile/>) and then reviewing and iterating on it a few times
> to make sure
> we have it clear.)
>
> Please respond before Wednesday. I'm favoring option #1
> unless there's a
> couple of folks advocating/willing-to-write #2 without a
> strong dissent.
>
> [1]
> http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2002AprJun/0288
> [2]
> http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2002AprJun/0291
> [3] http://www.w3.org/TR/2002/WD-xmldsig-filter2-20020425/
>
> ---------- Forwarded Message ----------
>
> Subject: Re: XPath filter 2.0
> Date: Thu, 06 Jun 2002 12:22:47 +0100
> From: merlin <merlin@baltimore.ie>
> To: Christian Geuer-Pollmann <geuer-pollmann@nue.et-inf.uni-siegen.de>
> Cc: w3c-ietf-xmldsig@w3.org
>
> r/geuer-pollmann@nue.et-inf.uni-siegen.de/2002.06.06/10:29:55
>
> >> * Implementations SHOULD note that an efficient realization
> >> of this transform will not compute a node set at each
> >> point in this transform, but instead make a list of
> >> the operations and the XPath-selected nodes, and then
> >> iterate through the input document once, in document
> >> order, constructing a filtering node set N based upon
> >> the operations and selected nodes.
> >> .......
> >> . Efficiency is as with the current spec. Basically this
> >> fixes union.
> >
> >Does this mean that based on the operations, you make some
> tree-labeling
> >;-)) and then you make one tree-traversal to output the
> selected nodes?
>
> Yes, but this is the exact algorithm that implementations
> of the _current_ XPath Filter 2.0 transform _should_ use
> if a sequence of the current XPath Filter 2.0 transforms
> precedes c14n. This formulation of the filter simply makes
> it easier to express in terms of SHOULD language.
>
> >Sounds cool, in that case, the efficiency would be much
> better then the
> >current results of v2.0. What kind of algorithm do you use for
> >
> > - make a list of operations and selected nodes,
> > - decide based on this data which nodes are the result.
>
> If you're asking for a normative algorithm for the general
> case, then I would have to think for a while. Restricting
> myself somewhat:
>
> First, let's characterize a sequence of filters:
> Filter ::= (INTERSECT | SUBTRACT | UNION)+
>
> You will observe that adjacent SUBTRACT and INTERSECT
> operations can be idempotently reordered, and that
> adjacent operations of the same type can be computationally
> merged. However, I would expect that type of thing to be
> done in the XPath expressions so I will ignore this.
>
> Restricting myself to the following production, which captures
> all _common_ (in my opinion) use cases:
> SimpleFilter ::= A:UNION* (B:INTERSECT | C:SUBTRACT)* D:UNION*
>
> Iterate over the document.
> (define (include-node) ;; whether or not to include a node
> (cond ;; returns the first match
> ((encountered-any D) t)
> ;; if you've encountered a node in a trailing union
> ((encountered-any C) nil)
> ;; not if you've encountered a subtraction node
> ((null B) t)
> ;; if there are no intersect operations
> ((encountered-all B) t)
> ;; if you've encountered a node in each intersect node set
> (t nil))) ;; not otherwise
>
> Note that encountered-any returns nil if its parameter is an
> empty list of node sets, but encountered-all is true in this
> case. We can therefore express this concisely:
> (define (include-node)
> (or (encountered-any D)
> (and (not (encountered-any C)) (encountered-all B))))
>
> You can implement encountered-foo and therefore include-node
> strictly in terms of node labeling, a stack and iteration.
> Obviously you must also consider the input node set.
>
> Going from this to a fully general solution is fairly
> straightforward. Observe that UNIONs are subject to ALL
> subsequent INTERSECT and SUBTRACT operations, but no preceding
> ones, and that the entire filter is equivalent to:
> UNION/ Filter
>
> I can then say that a node is included if I have encountered
> a node in ANY ( UNION operation AND NOT ANY SUBSEQUENT SUBTRACT
> operation AND ALL SUBSEQUENT INTERSECT operations ).
>
> Work done to compute this is proportional to the number of
> filters but only done at a labeled node.
>
> Merlin
>
> >Christian
> >
> >
> >
> >
> >--On Donnerstag, 6. Juni 2002 01:09 +0100 merlin
> <merlin@baltimore.ie>
> >
> >wrote:
> >> Hi,
> >>
> >> Quick summary of options:
> >>
> >> 1. Current Spec
> >> . This is intuitive (in my opinion) because it is based on a
> >> linear sequence of set operations.
> >> . Typical (IMHO) use cases require 2 XPath evaluations.
> >> However, increasingly complex filtering requirements incur
> >> increasing cost; an arbitrarily complex expression requires
> >> an arbitrarily large number of simple XPath expressions.
> >> However, the standard XPath filter may be more useful for
> >> these anyway.
> >> . Operation can, in most cases, be commingled with c14n for
> >> efficiency, but:
> >> . The union operator is really ugly and unintuitive.
> >>
> >> 2. Christian's Spec
> >> . *I* do not believe this is as intuitive; it involves labeling
> >> nodes and then traversing the document, proceeding based
> >> on node labels (e.g., omit-but-traverse).
> >> . Typical use cases require 2 XPath evaluations. Increasingly
> >> complex filtering requirements can be solved in a fixed
> >> number (2/3) of increasingly complex XPath expressions.
> >> . Operation can be commingled with c14n for effiency.
> >>
> >> 3. Or, we can take a variant of the current spec. I won't
> >> detail it horrendously, but basically:
> >>
> >> . The XPath Filter 2.1 takes, as a parameter, a sequence
> >> of operations, each of which is characterized as a
> >> set operation (intersect, subtract, union) and an
> >> XPath expression.
> >> . Operation over an input node set is as follows:
> >> * Construct a node set N consisting of all the
> >> nodes in the input document.
> >> * Iterate through each of the operations.
> >> # Evaluate the XPath expression; the result is X.
> >> # Expand all identified nodes to include their
> >> subtrees; the result is Y.
> >> # Assign N = N op Y
> >> * Use the resulting node set N as a filter to select
> >> which nodes from the input node set will remain in the
> >> output node set, just as the XPath 1.0 filter. This is
> >> tantamount to intersection with the input node set.
> >> * Implementations SHOULD note that an efficient realization
> >> of this transform will not compute a node set at each
> >> point in this transform, but instead make a list of
> >> the operations and the XPath-selected nodes, and then
> >> iterate through the input document once, in document
> >> order, constructing a filtering node set N based upon
> >> the operations and selected nodes.
> >> * Implementations SHOULD note that iterating through the
> >> document and constructing a filtering node set N can
> >> be efficiently commingled with the canonicalization
> >> transform if canonicalization is performed immediately
> >> after this transform.
> >> . With this formulation, intersection and subtraction
> >> are IDENTICAL to the existing spec, with the only
> >> change being that you can put them in one transform
> >> or many.
> >> . Union is, however, much improved (in my opinion). You
> >> can only use it to include nodes that would be
> >> removed by a previous operation in the same transform.
> >> As a result, the output node set will only include
> >> nodes from the input node set.
> >> . Efficiency is as with the current spec. Basically this
> >> fixes union.
> >>
> >> I write this a while ago; thought I'd send it rather
> >> than delete it. It's probably wasteful to propose yet
> >> another option.
> >>
> >> Merlin
>
> -------------------------------------------------------
>
> --
>
> Joseph Reagle Jr. http://www.w3.org/People/Reagle/
> W3C Policy Analyst mailto:reagle@w3.org
> IETF/W3C XML-Signature Co-Chair http://www.w3.org/Signature/
> W3C XML Encryption Chair http://www.w3.org/Encryption/2001/
>
>
Received on Tuesday, 11 June 2002 04:27:52 UTC