Re: Streaming XPath - additional material from our research background

Meiko,
There are definitely XPaths that are not streamable.  Here are two such 
examples, and they are not covered in the Barton paper.

-------------------------------------
The subset in the Barton paper does not include any functions. Some of 
the the functions are not streamable especially the pos() function.

For example suppose you have a document like this with a 1000 <b> elements.

<a>
  <b/>
  <b/>
  <b/>
  ...
  <b/>
</a>

And you want to evaluate the expression that wants to find out <b> that 
is at the 100th  position counting backwards.
i.e.
 /a/b[position() = last() - 100]

How can you evaluate this in streaming parser where you can go only one 
pass? You do not know how many <b> elements there are, so you must reach 
the end, and then go back.  Even if Xpath has some kind of lookahead 
cache, that cache cannot hold 900 <b> elements. (Or if it can, then you 
can change the problem to have zillion <b> elements)
-----------------------------

Another example  which is more specific to XML Signature.
We want the whole XML signature operation to be in one pass. I.e. Xpath 
selection, canonicalization, digesting  together have to be in one pass.

So Suppose you have the XPath  
/a/b[c]

so basically you want to sign the entire /a/b/  element, but only those 
/a/b elements which have a c child.

suppose your doc is like this
<a>
 <b>
     some very long content
     <c/>
  </b>
</a>


So you can definitely find out  /a/b[c]  in streaming,  but after you 
have found that out, you have to go back to the beginning of <b> and 
start the rest of the signing process - and that will break the 
streaming of the signature as a whole.

Pratik




On 10/29/2009 4:34 AM, Meiko Jensen wrote:
> Hi,
>
> as I lately noticed that the WG deals with similar problems as we do
> within our latest research (i.e. streamable subset of XPath in the
> context of XML Signatures), I'd like to point your attention to some of
> our findings for consideration and discussion.
>
> Though Barton et al. ( http://cs.nyu.edu/~deepak/publications/icde.pdf )
> have shown that in theory every XPath expression can be converted into
> an equivalent XPath that does not contain any backward axes (thus
> allowing stream-based evaluation in general), the topic of a streamable
> subset of XPath is of crucial importance. Apart from the pure
> performance gains by using a stream-based XML Signature validation (and
> maybe also application), one should also be aware of the other use that
> such a subset could have -- in terms of fending the XML Signature
> Wrapping attack. As we have shown lately (
> http://www.nds.rub.de/media/nds/downloads/mjensen/ICWS09.pdf ), this
> particular attack threat can be tackled using position-aware referencing
> schemes in XML Signatures, which obviously can be done e.g. using
> XPath-based transformations.
>
> We thus defined a strong subset of XPath ourselves (called FastXPath),
> which to our consideration provides both: it performs way better than
> full XPath (see evaluation in the paper) and additionally was shown to
> be way more resistant to the XML Signature Wrapping threat.
>
> Thus, if you are interested in determining on how our work relates to
> the ongoing discussion on streamable XPath, please feel free to contact me.
>
> Best regards from Bochum, Germany
>
> Meiko
>
>   

Received on Friday, 30 October 2009 20:06:23 UTC