RE: FW: Last Call for XML Signature 2.0, Canonical XML 2.0 and XML Signature Streaming Profile of XPath 1.0

Responses to the last comment

 

>>In general I don't think it is good idea to create yet another XPath subset. Proliferation of XPath subsetting prevents using standalone XPath libraries when implementing various subsets of the language. If streaming is necessary then effort should be derived from XSLT 3.0 which provides streaming facilities.

 

 

We discussed this with the XSLT group at TPAC 2010.

 

The XPath subset defined in XSLT 2.1/3.0 is not directly reusable, because there streamability is defined for the combination of XPath and XSLT constructs. For example, take an input XML document containing a list of books and their prices. Suppose an XSLT transforms this document by selecting only the book prices with an XPath and outputting them as a list of prices using the <xslt:for-each> construct. This XSLT is streamable. However, if the XSLT uses an <xslt:sort> inside the <xslt:for-each> to sort the book prices before outputting them, this second XSLT is not streamable. In both cases the XPath is exactly the same; the only difference between the two XSLTs is the <xslt:sort> in the second, which makes it non-streamable.
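The difference can be illustrated outside XSLT with a rough Python analogy. This is only a sketch: the sample catalog and the function names are assumptions made for illustration, and the code does not model XSLT's actual streamability rules. Both functions use the same selection, but the second has to buffer every price before it can emit the first one.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input: a list of books and their prices.
CATALOG = (b"<catalog>"
           b"<book><title>A</title><price>30</price></book>"
           b"<book><title>B</title><price>10</price></book>"
           b"<book><title>C</title><price>20</price></book>"
           b"</catalog>")

def stream_prices(source):
    """Yield each price as soon as its element closes (document order)."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "price":
            yield float(elem.text)
        elif elem.tag == "book":
            elem.clear()  # discard the children of the finished book

def sorted_prices(source):
    """Same selection, but sorting needs all prices in memory first."""
    return sorted(stream_prices(source))

print(list(stream_prices(io.BytesIO(CATALOG))))
print(sorted_prices(io.BytesIO(CATALOG)))
```

The first function can hand each price to a consumer immediately, while the second cannot produce any output until the whole input has been read.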

 

XSLT has a complex way of determining whether a given XSLT is guaranteed streamable: it puts the XSLT and XPath constructs into an "expression tree" and then constructs a "data flow graph" from it. Streamability is determined by analysis of this data flow graph.

 

 

The XPath subset that we defined in XML Signature is standalone, i.e. given the XPath expression alone, one can tell whether it is streamable. Also, streamability is defined in terms of grammar and not a data flow graph, as we feel this makes it easier for the end user to understand how to write a streamable XPath. However, this also means that there can be two XPaths that are semantically the same but syntactically different, only one of which is considered streamable. We feel that this is less of a concern than an XPath subset that is too hard to explain.
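A toy illustration of a purely syntactic check follows. It is not the grammar of the actual profile: the only rule it applies is an assumption taken from this message, namely that the parent and ancestor axes are outside the subset. The function name and the sample expressions are hypothetical, and the regular expression would also misfire on ".." inside string literals.

```python
import re

# Assumption for illustration: reject only the parent axis (including its
# abbreviation "..") and the ancestor axis; a real checker would follow
# the full grammar of the profile instead.
UNSUPPORTED_AXIS = re.compile(r"\.\.|\b(?:parent|ancestor)::")

def is_streamable_by_syntax(xpath: str) -> bool:
    """Decide from the expression text alone, without any input document."""
    return UNSUPPORTED_AXIS.search(xpath) is None

# Both expressions select the same price elements, but only the first
# passes the syntactic check.
print(is_streamable_by_syntax("/catalog/book/price"))
print(is_streamable_by_syntax("/catalog/book/price/../price"))
```

The second expression steps up to the parent and back down again, so it selects the same nodes as the first, yet a grammar-based rule rejects it without trying to prove that equivalence.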

 

In XML Signature, the goal of streaming is that the signature digest should be computable without loading the entire document into memory. Here the XPath selects the parts of the document to be signed, which may be most or all of the document. Suppose the input document is 1GB and the XPath expression selects 95% of it; the signature processor should then be able to compute the digest without requiring 1GB or even 950MB of memory. It should require only a small amount of memory. This means that all the operations needed for signature digest computation, i.e. XPath selection, canonicalization, and digesting, have to be performed in streaming mode.
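The digesting step on its own is easy to keep within bounded memory, because hash functions accept their input incrementally. The sketch below assumes a 64 KB chunk size and hypothetical helper names; it uses SHA-1 only because that is the digest named later in this message.

```python
import hashlib
import io

def read_chunks(stream, size=64 * 1024):
    """Yield fixed-size chunks so only one chunk is in memory at a time."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk

def digest_stream(chunks):
    """Feed the chunks to a running digest and return the hex result."""
    running = hashlib.sha1()
    for chunk in chunks:
        running.update(chunk)
    return running.hexdigest()

data = b"<price>30</price>" * 1000
# The incremental digest matches the digest of the whole input at once.
print(digest_stream(read_chunks(io.BytesIO(data))) == hashlib.sha1(data).hexdigest())
```

Selection and canonicalization are the harder parts, since they must also produce their output as a stream for this bound to hold end to end.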

 

 

The XML Signature processor works as follows. It is assumed that all the XPaths are known in advance (if they are not, one can do an initial pass to collect them). The signature processor first compiles each XPath into a state engine. It then streams through the document and checks whether each node is included by the XPath, by checking whether the state engine accepts that node. If the node is included, the "xml stream event" is passed to the canonicalizer, which does things like removing spaces and sorting attributes, and then passes it to a running digester, which computes a SHA1 digest.
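The pipeline can be sketched with a SAX parser. This is a minimal sketch under strong assumptions, not the working group's implementation: the "state engine" handles only absolute child-step paths such as /catalog/book/price, the "canonicalizer" only sorts attributes and escapes text (it is not Canonical XML), and the class and function names are hypothetical.

```python
import hashlib
import xml.sax
from xml.sax.saxutils import escape, quoteattr

class SigningHandler(xml.sax.ContentHandler):
    """Select subtrees by path, serialize them simply, digest as we go."""

    def __init__(self, xpath):
        super().__init__()
        # "Compile" the path: here just the list of element names to match.
        self.steps = xpath.strip("/").split("/")
        self.stack = []             # names of the currently open elements
        self.selected_depth = None  # depth where the selected subtree began
        self.digest = hashlib.sha1()

    def _emit(self, text):
        self.digest.update(text.encode("utf-8"))

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.selected_depth is None and self.stack == self.steps:
            self.selected_depth = len(self.stack)
        if self.selected_depth is not None:
            # Toy canonicalization: attributes in sorted order.
            parts = [name] + [f"{k}={quoteattr(attrs[k])}"
                              for k in sorted(attrs.keys())]
            self._emit("<" + " ".join(parts) + ">")

    def characters(self, content):
        if self.selected_depth is not None:
            self._emit(escape(content))

    def endElement(self, name):
        if self.selected_depth is not None:
            self._emit(f"</{name}>")
            if len(self.stack) == self.selected_depth:
                self.selected_depth = None  # left the selected subtree
        self.stack.pop()

def digest_selected(document: bytes, xpath: str) -> str:
    handler = SigningHandler(xpath)
    xml.sax.parseString(document, handler)
    return handler.digest.hexdigest()

doc = (b'<catalog><book id="1"><price cur="USD" b="x">30</price></book>'
       b'<book><price>10</price></book></catalog>')
print(digest_selected(doc, "/catalog/book/price"))
```

The handler keeps only the stack of open element names and the running digest, so its memory use depends on the nesting depth of the document rather than on its size.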

 

 

Because of the difference in processing model, the subsets defined by XSLT and XML Signature are different. For example, XSLT's XPath subset supports the parent and ancestor axes, which Signature's doesn't, and Signature's XPath subset supports the following and following-sibling axes, which XSLT's doesn't.

 

Another aspect of XML Signature's XPath subset is reducing the attack surface. When XML Signature is used for message security, it is the first line of defence, i.e. a completely untrusted message comes in and the signature processor has to execute the XPaths in this untrusted message. A rogue message should not cause the signature processor to go into infinite loops, execute system functions, etc. That is why all the XPaths included in the subset can be evaluated in one pass without any buffering or backtracking.

 

 

We would like to promote reuse of this XPath subset. XML Signature's use of XPath is not unique. WS-Transfer's use of XPath aligns with XML Signature's, because in WS-Transfer the XPath selects the part of the document that needs to be transferred. I imagine there will be other use cases with the same requirements as XML Signature. But in our investigation we didn't find any documented XPath subset that fits this requirement.

 

Pratik

-----Original Message-----
From: Grosso, Paul [mailto:pgrosso@ptc.com] 
Sent: Monday, June 06, 2011 12:08 PM
To: public-xmlsec@w3.org
Subject: FW: FW: Last Call for XML Signature 2.0, Canonical XML 2.0 and XML Signature Streaming Profile of XPath 1.0

 

Forwarding from XML Core to XML Signature WG.

 

paul

 

-----Original Message-----

From: Jirka Kosek [mailto:jirka@kosek.cz]

Sent: Tuesday, 2011 May 31 4:03

To: Grosso, Paul

Cc: public-xml-core-wg@w3.org

Subject: Re: FW: Last Call for XML Signature 2.0, Canonical XML 2.0 and XML Signature Streaming Profile of XPath 1.0

 

On 27.4.2011 15:37, Grosso, Paul wrote:

> The XML Core WG has been asked to review these specs before the end of 

> May.  Jirka and Norm have actions to do so and report back to the WG.

 

Hi,

 

I spent very limited time on this and haven't had time to review the RELAX NG schemas at all. Below are a few issues I have found. I'm also attaching an HTML rendering.

 

                        Jirka

 

1 XML Signature Syntax and Processing Version 2.0

--------------------------------------------------

[http://www.w3.org/2008/xmlsec/Drafts/xmldsig-core-20/]

* Specification uses term "XML namespace URI" instead of "namespace name"

  Although this probably doesn't create confusion, such informal term

  shouldn't appear in W3C spec. Either proper term "namespace name"

  should be used (see [http://www.w3.org/TR/xml-names/#dt-NSName]) or at

  least "XML namespace URI" should be put into Appendix A - Definitions

  and be properly defined here as a synonym of "namespace name".

* Insufficiently defined context for XPath evaluation in § "10.6.1 Selection of XML Documents or Fragments"

  XPath 1.0 specification defines the following properties for context

    - a node (the context node)

    - a pair of non-zero positive integers (the context position and the context size)

    - a set of variable bindings

    - a function library

    - the set of namespace declarations in scope for the expression

  Only the context node is defined in this specification, other

  properties should be defined as well.

* Typo in § "11.3 Namespace Context and Portable Signatures"

  In addition, the Canonical XML and Canonical XML with Comments

  algorithms import all XML namespace attributes (such as *xml:lang*) from

  the...

 

  This shouldn't be `xml:lang', but a namespace declaration attribute like `xmlns:foo'.

 

  Also using entity references in examples as content of namespace

  declarations looks quite confusing.

* § "B.7.2 Base64"

  The transformation as described assumes that it operates on a text node --

  otherwise it will always return an empty string. I'm not sure whether

  this is a correct assumption. Omitting operation 1) will fix this

  problem.

 

2 XML Signature Streaming Profile of XPath 1.0

-----------------------------------------------

[http://www.w3.org/2008/xmlsec/Drafts/xmldsig-xpath/]

In general I don't think it is good idea to create yet another XPath subset. Proliferation of XPath subsetting prevents using standalone XPath libraries when implementing various subsets of the language. If streaming is necessary then effort should be derived from XSLT 3.0 which provides streaming facilities.

 

 

 

--

------------------------------------------------------------------

  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz

------------------------------------------------------------------

       Professional XML consulting and training services

  DocBook customization, custom XSLT/XSL-FO document processing

------------------------------------------------------------------

OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member

------------------------------------------------------------------

 

Received on Wednesday, 8 June 2011 00:34:42 UTC