Re: Streamable XPath Profile Review from Frederick.Hirsch@nokia.com on 2010-08-10 (public-xmlsec@w3.org from August 2010)

From: <Frederick.Hirsch@nokia.com>
Date: Tue, 10 Aug 2010 15:45:46 +0200
To: <Meiko.Jensen@ruhr-uni-bochum.de>
CC: <Frederick.Hirsch@nokia.com>, <pratik.datta@oracle.com>, <public-xmlsec@w3.org>
Message-ID: <DC23EADC-5DCA-454B-BDF8-D38945575429@nokia.com>
Meiko

Could you please elaborate on "However, this would imply allowing more complex predicates for non-final XPath steps."

I'm not sure I understand what would be more complex (as opposed to more verbose)

> Regarding this point: I once suggested the use of XPath expressions like
> //*[local-name()="X" and namespace-uri()="Y"] to get around the
> namespace prefix issues. Maybe we should explicitly allow---and
> recommend---such constructs in the profile as well? However, this would
> imply allowing more complex predicates for non-final XPath steps.


I agree that streaming is a major feature for 2.0 and a driver for 2.0.

Are there lessons to be learned from FastXPath, especially with regards to adoption, that we can apply to our work?

regards, Frederick

Frederick Hirsch
Nokia



On Aug 10, 2010, at 8:47 AM, ext Meiko Jensen wrote:

> Hi Pratik,
> 
> see inline
> (this message still relates to Action-614 )
> 
> Pratik Datta wrote:
>> 1. The document by now is structured as a kind of "limitation of XPath
>> 1.0". This means it is very hard to read (or understand) it without
>> having the XPath 1.0 spec next to it. I'm not sure this is the ideal way
>> to shape the spec, though being a "profile" for XPath.
>> 
>> Pratik: Are you saying that the spec is pointing out limitations of XPath? That is definitely not the intention. It is rather a "diff" of the XPath 1.0 grammar and the proposed grammar. Instead of a diff we can put in the new grammar - that might make it easier to read.
>> 
> That's what I wanted to say: the document lists a diff between XPath 1.0
> and Streaming. Instead, I'd propose writing the resulting grammar
> itself, explaining its differences from full XPath 1.0 in plain text
> where appropriate.
> 
>> 4a. Allowing the descendant axis implies that it is possible to have
>> XPaths resulting in more than one context node at each step. For
>> instance, evaluating "//A//B" in the XML document
>> "<A><B><A><A><B/></A></A></B></A>" has up to three different context
>> nodes during its evaluation, and ends up with two nodes being part of
>> the result, one of them being result of three different evaluation
>> paths. We've been talking about this before. It is valid to have such
>> selections, it just makes evaluation become more complex (potential for
>> Denial of Service here!!), since you have to maintain more than one
>> context node per streaming event.
>> 
>> ------
>> Pratik: I am not following this very well. What do you mean by a "context node during evaluation". We have seen use cases for something like this "//A". i.e. sign all <A> elements wherever they are in the document. So we do need to support the descendant axis. Maybe we can allow only one descendant axis?
> Imagine an XPath of "//A//B" evaluated against the document given above.
> On streaming processing you encounter the first <A>, so first step of
> the XPath is found. The next <B> marks the first "hit" of the XPath.
> However, since this XPath used the descendant axis, the subsequent other
> occurrences of <A> and <B> elements produce additional evaluation
> contexts and hits. For instance, right after parsing the third <A>,
> there are four different XPath evaluation contexts open concurrently,
> three looking for a <B> (one for each previous occurrence of <A>), and
> one still looking for an <A> (the initial XPath, stil searching for
> matches to the step "//A"). If you omit such "wildcard" axes, the number
> of concurrent evaluation contexts is reduced to 1, for all cases. /A/B
> is either a hit (if document node is called A) or not, but not both at
> once. Of course, you still can find more than one "XPath hit" here (as
> in "<A><B/><B/></A>"), but the evaluation context always has a single
> context node (first the document root, then the single A, then the first
> B, then the single A, then the second B, then the single A, then the
> document root, then you're done).
>> 
>> 
>> 
>> 
>> 4b. Similar to 4a, having "|" allowed implies possibility of more than
>> one context node.
>> 
>> ------
>> Pratik: Again I am not understanding this. But if we do not support this, then our selection cannot have multiple disjoint subtrees, and we would need a separate reference per subtree. To me that is very limiting,
>> 
> Well, it would still be possible to have multiple disjoint subtrees, as
> long as they are selected using the same XPath (see <B> in the example
> above). Besides that you are right, for other cases we'd need another
> Reference for additional subtrees. We just have to decide whether we
> want this or not. Just to clarify my position: I have no objections
> against allowing descendant and the "or" operator. Both are streamable,
> and the additional complexity is minimal. I just wanted to point out
> that there is a less complex subset that has about the same capabilities
> (but may require more than one Reference to be used).
>> 
>> 4c. the profile explicitly excludes the use of the "text()" and "node()"
>> operators in order not to run into complex text node collection issues,
>> which are very difficult in streaming evaluators. However, XPath 1.0 has
>> some loopholes that still run into selecting texts. For instance, the
>> XPath 1.0 function of "string()" without explicit argument implies to
>> convert the context node to string, which means concatenating all its
>> text node descendants. Hence, we have implicit text() selection here. In
>> the same line, I'm not yet convinced whether other conversion functions
>> of XPath 1.0 (any "node-set to X conversion") might also result in
>> implicit access to data that is not available in the streaming model.
>> This would require an in-depth compatibility analysis between the
>> Streamable XPath profile and XPath 1.0.
>> 
>> ------
>> Pratik: There is  one limitation that we have places that eliminates many of these problems.  Predicates can have expressions referring to attributes. E.g. you cannot do something like string(/foo/bar), you can only do string(@title). But yes, as you pointed out , we still allow string() with no arguments. I didn't realize that his led to text node collection. We would need to disallow this.  However we should still allow local-name() and namespace-uri()  with empty arguments, because they don't result in text node collection.
>> 
> Regarding this point: I once suggested the use of XPath expressions like
> //*[local-name()="X" and namespace-uri()="Y"] to get around the
> namespace prefix issues. Maybe we should explicitly allow---and
> recommend---such constructs in the profile as well? However, this would
> imply allowing more complex predicates for non-final XPath steps.
>> 4e. The "following" and "following-sibling" axes in fact are streamable,
>> yet I don't see a good reason to keep them in this profile since they
>> would make evaluation get way more complex again.
>> 
>> ------
>> Pratik: I guess they are streamable, because they are forward only. We could add them.
>> Regarding complexity, I think it is ok to be complex. XPath 1.0 is complex and 2.0 is even more. What we are proposing is much less complex than either. I am assuming most Signature implementation will just be DOM based and not use streaming at all.  Streaming is only for very performance sensitive usages.
>> 
> Agreed, however, to my consideration the streaming capability has been
> one of the major goals behind the design of XML Signature 2.0, and I can
> imagine a broad adaptation community for this in the Web Services world
> (where performance really matters). Hence, we should not treat this as
> some side-effect of 2.0, but merely as a major factor. In that light, we
> have to decide about a general strategy for the streaming support: do we
> want to have it most feature-complete hence most complex to implement,
> or have it essentials-only but easy to implement (and maybe thus better
> in terms of performance)?
> 
> Again, I do not insist in having "following" in, but I have also no
> objections to keep it out. Maybe we have to get back to this issue once
> we've made a decision regarding the "relative XPath" issue (where the
> XPath is relative to the <Signature> element or whatever). In such
> cases, a "following-sibling" axis may be useful (e.g. in "sign the
> "following" node  of the <Signature> element that is named soap:Body").
>> 
>> 4f. The given grammar explicitly disallows predicates in non-final XPath
>> steps. This contradicts with example no. 5 that says
>> "/book/chapter[2]/title[1]" would be allowed. In fact, "chapter[2]" is
>> an abbreviation of "chapter[position()=2]", hence being a regular
>> predicate. In this case I'd strongly recommend to allow position
>> indicators to be used in non-final XPath steps. This is one of the major
>> requirements for fending signature wrapping attacks: the XPath
>> expression must pin down the signed element's location precisely, hence
>> would definitely need position indicators at every step. I'd even go so
>> far as to recommend having position indicators being REQUIRED for every
>> step.
>> 
>> 
>> -----
>> Pratik:  Good catch. Earlier we were not allowing predicates in non final XPath steps. I made some changes to allow that, but I guess I forgot to change the grammar.
>> 
>> But I don't agree with pinning down the position.  Suppose somebody says  /soap:Envelope/soap:Body[1].   There should be only one <soap:Body> in a <soap:Envelope> element. However a rogue client might capture a signed message with one body, and then add a second body to the message -the signature will only be on the first one. The server which is expecting only one soap:body might look at the last one - so it will end up processing  work of the unsigned one.  To prevent this we should recommend using an XPath like this /soap:Envelope/soap:Body , with this if a rogue client attaches a second soap:Body , the signature will break, because the Xpath will select both the soap:Body ies.
>> 
> You are right, in this scenario the use of /soap:Envelope/soap:Body will
> render that attack infeasible. However, if the application logic is
> written in a way that it always processes the last soap:Body this means
> that the application logic already contains an error, and that this
> error is not inline with the XML Signature. In such a scenario, the
> correct approach woulf have been to use an XPath expression like
> /soap:Envelope[1]/soap:Body[position()=last()] in order to exactly
> reproduce the behaviour of the application logic. (To anticipate: I know
> that we kicked the last() function). What I wanted to say is that
> pinning down the exact position is better than being "generous" and
> hoping that the attack will become infeasible by other means (such as
> you described).
>> 
>> 
>> 
>> 4g. The specified subset of XPath in fact is very close to the FastXPath
>> subset we've been proposing in 2009
>> (http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5175871&abstractAccess=no&userType=inst).
>> One of the interesting differences, though, is that FastXPath also
>> prohibits the use of the "or" operator within predicates. The rationale
>> behind that was to stick to the "policy of reduction": during
>> evaluation, every test in every step of the XPath either keeps a
>> (single) context node or switches to the empty result set (and
>> immediately cancels further evaluation). The "or" operator violates this
>> policy: if the first condition evaluates to "false" (=> result-set gets
>> empty), it is possible that the second condition might become "true",
>> putting the context node back in to the result-set. Again, this causes
>> additional parsing complexity (Denial of Service via huge amount of "or"
>> conditions). However, this was a design decision for FastXPath that I'd
>> recommend to consider for the Streamable XPath Profile, but I do not
>> insist in its adoption.
>> 
>> 
>> ----
>> Pratik:  I do not understand this denial of service here. In the algorithm that I have in my mind, Streaming XPath evaluator is a filter. The entire XML document is streamed through this filter one event at a time, and the filter says Yes or No and passes it on to canonicalization.  No XML event is processed more than once. What do you mean by "keeping a context node"?
>> 
> That DoS was a theoretical threat which I wouldn't see as a real-world
> attack vector. It was just a hint at the complexity issues you might run
> into, especially if the "or" predicates would have been allowed to
> contain full XPath expressions themselves.
> 
> Again, I have no objections against "or", I just wanted to reason why
> FastXPath excluded it.
>> 5b. in "1. Introduction" list item "2.b." says that "/book/chapter[3]"
>> is not expressable in XMLDSig's XPath Filter.
>> "ancestor-or-self::chapter[ . = /book/chapter[3] ]" should do, yet I
>> agree with the intention behind this sentence that it is way too complex.
>> 
>> ------
>> Pratik: I don't think this expression means the same thing. As far as I understand the equality operator compares the string-value of the node sets.  So  if /book/chapter[4]'s string value was exactly the same as /book/chapter[3], then ancestor-or-self::chapter[ . = /book/chapter[3] ]" will sign both of those chapters. In fact this will also match /book/appendix/chapter[1] if its content is the same.
>> 
> You're right, I thought node-set comparison with "=" would be
> node-by-node identity comparison. Once again a weird design decision of
> XPath....
> 
> 
> best regards
> 
> Meiko
> 
> 
> --
> Dipl.-Inf. Meiko Jensen
> Chair for Network and Data Security
> Horst Görtz Institute for IT-Security
> Ruhr University Bochum, Germany
> _____________________________
> Universitätsstr. 150, Geb. IC 4/150
> D-44780 Bochum, Germany
> Phone: +49 (0) 234 / 32-26796
> Telefax: +49 (0) 234 / 32-14347
> http:// www.nds.rub.de
> 
>
Received on Tuesday, 10 August 2010 13:46:32 UTC