Re: Streamable XPath Profile Review from Meiko Jensen on 2010-08-10 (public-xmlsec@w3.org from August 2010)

From: Meiko Jensen <Meiko.Jensen@ruhr-uni-bochum.de>
Date: 10 Aug 2010 16:13:30 +0200
To: Frederick.Hirsch@nokia.com
Cc: pratik.datta@oracle.com, public-xmlsec@w3.org
Message-ID: <4C615E8A.4010301@ruhr-uni-bochum.de>
Frederick,

see below

Frederick.Hirsch@nokia.com schrieb:
> Meiko
>
> Could you please elaborate on "However, this would imply allowing more complex predicates for non-final XPath steps."
>   
As far as I understood Pratik correctly, the initial version of the
Streaming XPath Profile did not support any predicates in non-final
XPath steps. By now we agreed to soften this by allowing at least the
position predicate (e.g. "[12]", which is equivalent to
"[position()=12]"). However, still, there are no complex predicates
allowed, as position information is the ONLY supported predicate here
(as far as I understood).

With this suggestion of mine I just wanted to stress that in order to
support the [local-name()="X"] approach we would have to soften the
limitation once more, allowing also the local-name() and namespace-uri()
functions, which makes writing a conformant streaming XPath parser way
more complex again (in FastXPath, we just sticked to parsing the digits
out of [12] and ignored the expanded version).
> I'm not sure I understand what would be more complex (as opposed to more verbose)
>   
More complex is the XPath expression parser, and the fact that a user
can mix position(), local-name(), and namespace-uri() in combinations
with "and" and "or" to create weird XPath expressions (that the
Streamable XPath parser still has to support).
>   
>> Regarding this point: I once suggested the use of XPath expressions like
>> //*[local-name()="X" and namespace-uri()="Y"] to get around the
>> namespace prefix issues. Maybe we should explicitly allow---and
>> recommend---such constructs in the profile as well? However, this would
>> imply allowing more complex predicates for non-final XPath steps.
>>     
>
>
> I agree that streaming is a major feature for 2.0 and a driver for 2.0.
>
> Are there lessons to be learned from FastXPath, especially with regards to adoption, that we can apply to our work?
>   
I think most of what was essential about FastXPath is already covered in
the actual version of the Streamable XPath profile. The other points I
outlined in my review or put to discussion where approriate.

best regards

Meiko
> regards, Frederick
>
> Frederick Hirsch
> Nokia
>
>
>
> On Aug 10, 2010, at 8:47 AM, ext Meiko Jensen wrote:
>
>   
>> Hi Pratik,
>>
>> see inline
>> (this message still relates to Action-614 )
>>
>> Pratik Datta wrote:
>>     
>>> 1. The document by now is structured as a kind of "limitation of XPath
>>> 1.0". This means it is very hard to read (or understand) it without
>>> having the XPath 1.0 spec next to it. I'm not sure this is the ideal way
>>> to shape the spec, though being a "profile" for XPath.
>>>
>>> Pratik: Are you saying that the spec is pointing out limitations of XPath? That is definitely not the intention. It is rather a "diff" of the XPath 1.0 grammar and the proposed grammar. Instead of a diff we can put in the new grammar - that might make it easier to read.
>>>
>>>       
>> That's what I wanted to say: the document lists a diff between XPath 1.0
>> and Streaming. Instead, I'd propose writing the resulting grammar
>> itself, explaining its differences from full XPath 1.0 in plain text
>> where appropriate.
>>
>>     
>>> 4a. Allowing the descendant axis implies that it is possible to have
>>> XPaths resulting in more than one context node at each step. For
>>> instance, evaluating "//A//B" in the XML document
>>> "<A><B><A><A><B/></A></A></B></A>" has up to three different context
>>> nodes during its evaluation, and ends up with two nodes being part of
>>> the result, one of them being result of three different evaluation
>>> paths. We've been talking about this before. It is valid to have such
>>> selections, it just makes evaluation become more complex (potential for
>>> Denial of Service here!!), since you have to maintain more than one
>>> context node per streaming event.
>>>
>>> ------
>>> Pratik: I am not following this very well. What do you mean by a "context node during evaluation". We have seen use cases for something like this "//A". i.e. sign all <A> elements wherever they are in the document. So we do need to support the descendant axis. Maybe we can allow only one descendant axis?
>>>       
>> Imagine an XPath of "//A//B" evaluated against the document given above.
>> On streaming processing you encounter the first <A>, so first step of
>> the XPath is found. The next <B> marks the first "hit" of the XPath.
>> However, since this XPath used the descendant axis, the subsequent other
>> occurrences of <A> and <B> elements produce additional evaluation
>> contexts and hits. For instance, right after parsing the third <A>,
>> there are four different XPath evaluation contexts open concurrently,
>> three looking for a <B> (one for each previous occurrence of <A>), and
>> one still looking for an <A> (the initial XPath, stil searching for
>> matches to the step "//A"). If you omit such "wildcard" axes, the number
>> of concurrent evaluation contexts is reduced to 1, for all cases. /A/B
>> is either a hit (if document node is called A) or not, but not both at
>> once. Of course, you still can find more than one "XPath hit" here (as
>> in "<A><B/><B/></A>"), but the evaluation context always has a single
>> context node (first the document root, then the single A, then the first
>> B, then the single A, then the second B, then the single A, then the
>> document root, then you're done).
>>     
>>>
>>>
>>> 4b. Similar to 4a, having "|" allowed implies possibility of more than
>>> one context node.
>>>
>>> ------
>>> Pratik: Again I am not understanding this. But if we do not support this, then our selection cannot have multiple disjoint subtrees, and we would need a separate reference per subtree. To me that is very limiting,
>>>
>>>       
>> Well, it would still be possible to have multiple disjoint subtrees, as
>> long as they are selected using the same XPath (see <B> in the example
>> above). Besides that you are right, for other cases we'd need another
>> Reference for additional subtrees. We just have to decide whether we
>> want this or not. Just to clarify my position: I have no objections
>> against allowing descendant and the "or" operator. Both are streamable,
>> and the additional complexity is minimal. I just wanted to point out
>> that there is a less complex subset that has about the same capabilities
>> (but may require more than one Reference to be used).
>>     
>>> 4c. the profile explicitly excludes the use of the "text()" and "node()"
>>> operators in order not to run into complex text node collection issues,
>>> which are very difficult in streaming evaluators. However, XPath 1.0 has
>>> some loopholes that still run into selecting texts. For instance, the
>>> XPath 1.0 function of "string()" without explicit argument implies to
>>> convert the context node to string, which means concatenating all its
>>> text node descendants. Hence, we have implicit text() selection here. In
>>> the same line, I'm not yet convinced whether other conversion functions
>>> of XPath 1.0 (any "node-set to X conversion") might also result in
>>> implicit access to data that is not available in the streaming model.
>>> This would require an in-depth compatibility analysis between the
>>> Streamable XPath profile and XPath 1.0.
>>>
>>> ------
>>> Pratik: There is  one limitation that we have places that eliminates many of these problems.  Predicates can have expressions referring to attributes. E.g. you cannot do something like string(/foo/bar), you can only do string(@title). But yes, as you pointed out , we still allow string() with no arguments. I didn't realize that his led to text node collection. We would need to disallow this.  However we should still allow local-name() and namespace-uri()  with empty arguments, because they don't result in text node collection.
>>>
>>>       
>> Regarding this point: I once suggested the use of XPath expressions like
>> //*[local-name()="X" and namespace-uri()="Y"] to get around the
>> namespace prefix issues. Maybe we should explicitly allow---and
>> recommend---such constructs in the profile as well? However, this would
>> imply allowing more complex predicates for non-final XPath steps.
>>     
>>> 4e. The "following" and "following-sibling" axes in fact are streamable,
>>> yet I don't see a good reason to keep them in this profile since they
>>> would make evaluation get way more complex again.
>>>
>>> ------
>>> Pratik: I guess they are streamable, because they are forward only. We could add them.
>>> Regarding complexity, I think it is ok to be complex. XPath 1.0 is complex and 2.0 is even more. What we are proposing is much less complex than either. I am assuming most Signature implementation will just be DOM based and not use streaming at all.  Streaming is only for very performance sensitive usages.
>>>
>>>       
>> Agreed, however, to my consideration the streaming capability has been
>> one of the major goals behind the design of XML Signature 2.0, and I can
>> imagine a broad adaptation community for this in the Web Services world
>> (where performance really matters). Hence, we should not treat this as
>> some side-effect of 2.0, but merely as a major factor. In that light, we
>> have to decide about a general strategy for the streaming support: do we
>> want to have it most feature-complete hence most complex to implement,
>> or have it essentials-only but easy to implement (and maybe thus better
>> in terms of performance)?
>>
>> Again, I do not insist in having "following" in, but I have also no
>> objections to keep it out. Maybe we have to get back to this issue once
>> we've made a decision regarding the "relative XPath" issue (where the
>> XPath is relative to the <Signature> element or whatever). In such
>> cases, a "following-sibling" axis may be useful (e.g. in "sign the
>> "following" node  of the <Signature> element that is named soap:Body").
>>     
>>> 4f. The given grammar explicitly disallows predicates in non-final XPath
>>> steps. This contradicts with example no. 5 that says
>>> "/book/chapter[2]/title[1]" would be allowed. In fact, "chapter[2]" is
>>> an abbreviation of "chapter[position()=2]", hence being a regular
>>> predicate. In this case I'd strongly recommend to allow position
>>> indicators to be used in non-final XPath steps. This is one of the major
>>> requirements for fending signature wrapping attacks: the XPath
>>> expression must pin down the signed element's location precisely, hence
>>> would definitely need position indicators at every step. I'd even go so
>>> far as to recommend having position indicators being REQUIRED for every
>>> step.
>>>
>>>
>>> -----
>>> Pratik:  Good catch. Earlier we were not allowing predicates in non final XPath steps. I made some changes to allow that, but I guess I forgot to change the grammar.
>>>
>>> But I don't agree with pinning down the position.  Suppose somebody says  /soap:Envelope/soap:Body[1].   There should be only one <soap:Body> in a <soap:Envelope> element. However a rogue client might capture a signed message with one body, and then add a second body to the message -the signature will only be on the first one. The server which is expecting only one soap:body might look at the last one - so it will end up processing  work of the unsigned one.  To prevent this we should recommend using an XPath like this /soap:Envelope/soap:Body , with this if a rogue client attaches a second soap:Body , the signature will break, because the Xpath will select both the soap:Body ies.
>>>
>>>       
>> You are right, in this scenario the use of /soap:Envelope/soap:Body will
>> render that attack infeasible. However, if the application logic is
>> written in a way that it always processes the last soap:Body this means
>> that the application logic already contains an error, and that this
>> error is not inline with the XML Signature. In such a scenario, the
>> correct approach woulf have been to use an XPath expression like
>> /soap:Envelope[1]/soap:Body[position()=last()] in order to exactly
>> reproduce the behaviour of the application logic. (To anticipate: I know
>> that we kicked the last() function). What I wanted to say is that
>> pinning down the exact position is better than being "generous" and
>> hoping that the attack will become infeasible by other means (such as
>> you described).
>>     
>>>
>>> 4g. The specified subset of XPath in fact is very close to the FastXPath
>>> subset we've been proposing in 2009
>>> (http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5175871&abstractAccess=no&userType=inst).
>>> One of the interesting differences, though, is that FastXPath also
>>> prohibits the use of the "or" operator within predicates. The rationale
>>> behind that was to stick to the "policy of reduction": during
>>> evaluation, every test in every step of the XPath either keeps a
>>> (single) context node or switches to the empty result set (and
>>> immediately cancels further evaluation). The "or" operator violates this
>>> policy: if the first condition evaluates to "false" (=> result-set gets
>>> empty), it is possible that the second condition might become "true",
>>> putting the context node back in to the result-set. Again, this causes
>>> additional parsing complexity (Denial of Service via huge amount of "or"
>>> conditions). However, this was a design decision for FastXPath that I'd
>>> recommend to consider for the Streamable XPath Profile, but I do not
>>> insist in its adoption.
>>>
>>>
>>> ----
>>> Pratik:  I do not understand this denial of service here. In the algorithm that I have in my mind, Streaming XPath evaluator is a filter. The entire XML document is streamed through this filter one event at a time, and the filter says Yes or No and passes it on to canonicalization.  No XML event is processed more than once. What do you mean by "keeping a context node"?
>>>
>>>       
>> That DoS was a theoretical threat which I wouldn't see as a real-world
>> attack vector. It was just a hint at the complexity issues you might run
>> into, especially if the "or" predicates would have been allowed to
>> contain full XPath expressions themselves.
>>
>> Again, I have no objections against "or", I just wanted to reason why
>> FastXPath excluded it.
>>     
>>> 5b. in "1. Introduction" list item "2.b." says that "/book/chapter[3]"
>>> is not expressable in XMLDSig's XPath Filter.
>>> "ancestor-or-self::chapter[ . = /book/chapter[3] ]" should do, yet I
>>> agree with the intention behind this sentence that it is way too complex.
>>>
>>> ------
>>> Pratik: I don't think this expression means the same thing. As far as I understand the equality operator compares the string-value of the node sets.  So  if /book/chapter[4]'s string value was exactly the same as /book/chapter[3], then ancestor-or-self::chapter[ . = /book/chapter[3] ]" will sign both of those chapters. In fact this will also match /book/appendix/chapter[1] if its content is the same.
>>>
>>>       
>> You're right, I thought node-set comparison with "=" would be
>> node-by-node identity comparison. Once again a weird design decision of
>> XPath....
>>
>>
>> best regards
>>
>> Meiko
>>
>>
>> --
>> Dipl.-Inf. Meiko Jensen
>> Chair for Network and Data Security
>> Horst Görtz Institute for IT-Security
>> Ruhr University Bochum, Germany
>> _____________________________
>> Universitätsstr. 150, Geb. IC 4/150
>> D-44780 Bochum, Germany
>> Phone: +49 (0) 234 / 32-26796
>> Telefax: +49 (0) 234 / 32-14347
>> http:// www.nds.rub.de
>>
>>
>>     
>
>   

-- 
Dipl.-Inf. Meiko Jensen
Chair for Network and Data Security 
Horst Görtz Institute for IT-Security 
Ruhr University Bochum, Germany
_____________________________
Universitätsstr. 150, Geb. IC 4/150
D-44780 Bochum, Germany
Phone: +49 (0) 234 / 32-26796
Telefax: +49 (0) 234 / 32-14347
http:// www.nds.rub.de
Received on Tuesday, 10 August 2010 14:14:04 UTC