Re: Clarifying XPath Filtering Transform text (pertains to Action-350, etc.) from Ed Simon on 2009-09-09 (public-xmlsec@w3.org from September 2009)

From: Ed Simon <edsimon@xmlsec.com>
Date: Wed, 09 Sep 2009 19:30:59 -0400
To: Frederick Hirsch <frederick.hirsch@nokia.com>
Cc: Scott Cantor <cantor.2@osu.edu>, 'XMLSec WG Public List' <public-xmlsec@w3.org>
Message-Id: <1252539059.3215.7.camel@XMLSEC-BIZ.phub.net.cable.rogers.com>
Last week, Scott highlighted that the XPath Filter 2 Transform does
allow output of any XPath-legitimate node set (the XPath Transform, in
contrast, will only result in productions matching elements). This gets
me back to the question I posed back in July in this post:

http://lists.w3.org/Archives/Public/public-xmlsec/2009Jul/0065.html

Focusing specifically on the XPath Filter 2 Transform

http://www.w3.org/TR/xmldsig-filter2/

I believe we still need to clarify what happens, or should happen,
with the following results of the XPath Filter 2 Transform (adapted
from my post linked above). Specifically, what is the prescribed
treatment of each of these node sets, returned by an XPath Filter 2
Transform, in order to produce a hashable octet stream?

* a node set containing an attribute node;

* a node set containing a text node; and

* a node set containing all the above plus an element node.
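
To make the first two cases concrete, here is a minimal Python sketch of
one possible reading, assuming Scott's earlier answer ("in all cases, you
apply c14n") and the Canonical XML serialization rules for individual
attribute and text nodes. The function names c14n_attribute, c14n_text and
octets are hypothetical and come from no toolkit; the mixed case with an
element node is deliberately left out because it depends on namespace
handling that is exactly what needs clarifying.

```python
def c14n_text(value: str) -> str:
    """Serialize a lone text node using Canonical XML's text escaping."""
    return (value.replace("&", "&amp;")   # ampersand first, to avoid double escaping
                 .replace("<", "&lt;")
                 .replace(">", "&gt;")
                 .replace("\r", "&#xD;"))


def c14n_attribute(qname: str, value: str) -> str:
    """Serialize a lone attribute node: a space, the QName, and the quoted value."""
    escaped = (value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace('"', "&quot;")
                    .replace("\t", "&#x9;")
                    .replace("\n", "&#xA;")
                    .replace("\r", "&#xD;"))
    return f' {qname}="{escaped}"'


def octets(fragment: str) -> bytes:
    """Canonical XML output is UTF-8; these are the bytes that would be hashed."""
    return fragment.encode("utf-8")


if __name__ == "__main__":
    print(octets(c14n_attribute("signMe", "true")))
    print(octets(c14n_text("a < b & c")))
```

Under this reading neither result is a well-formed XML document, which
is the point of the question: the hashed octets are just the serialized
nodes.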

Ed


On Thu, 2009-07-30 at 16:53 -0400, Ed Simon wrote:
> My comments are inline too,
> Ed
> 
> 
> On Wed, 2009-07-29 at 10:54 -0400, Frederick Hirsch wrote:
> > Ed
> > 
> > comment inline, suggestion that we may need additional revision to  
> > 6.6.3 if we go with your proposed change.
> > 
> > regards, Frederick
> > 
> > Frederick Hirsch
> > Nokia
> > 
> > 
> > 
> > On Jul 28, 2009, at 8:31 PM, ext Ed Simon wrote:
> > 
> > > OK, so I'm looking at the C14N test cases in
> > >
> > > http://www.w3.org/TR/2008/NOTE-xmldsig2ed-tests-20080610/
> > >
> > > and I see there are some that introduce non-well-formed XML node-sets as
> > > input to canonicalization. However, I don't see any examples in the C14N
> > > 1.1 specification that deal with non-well-formed input node-sets and
> > > much of the text seems to imply input node-sets represent well-formed
> > > docs. (Let me know if you see something to the contrary.)
> > >
> > > So, I would suggest the following:
> > >
> > > 1) That the line in the W3C Canonicalization 1.1 specification that
> > > states
> > >
> > > "The first parameter of input to the XML canonicalization method is
> > > either an XPath node-set or an octet stream containing a well-formed
> > > XML document."
> > >
> > > be changed to
> > >
> > > "The first parameter of input to the XML canonicalization method is
> > > either an octet stream or an XPath node-set. If it is an octet stream,
> > > the octet stream must contain a well-formed XML document."
> > >
> > > 2) Add some examples of canonicalizing non-well-formed input node sets
> > > to the C14N specification (they can be taken from the interop test
> > > cases).
> > >
> > > 3) A minor typo: look for any instances of "node set" in the C14N
> > > specification that should be "node-set".
> > >
> > 
> > 
> > We have to decide whether to update C14N 1.1 or simply move toward C14N 2.0.
> > 
> OK.
> 
> > >
> > > Now, re the description of XPath Filtering in the XML Signature
> > > specification. What one specifies in the <dsig:XPath> element is not
> > > an XPath expression but an XPath Predicate Expression that, when
> > > evaluated, is prefixed with "(//. | //@* | //namespace::*)" to form
> > > the actual XPath expression that will be evaluated.
> > >
> > > As such, I propose the first sentence in section 6.6.3 which states
> > >
> > > "The normative specification for XPath expression evaluation is
> > > [XPath]. The XPath expression to be evaluated appears as the character
> > > content of a transform parameter child element named XPath."
> > >
> > > be changed to
> > >
> > > "The normative specification for XPath expression evaluation is
> > > [XPath]. [XPath] defines "predicate expressions" (have link) which
> > > provide a boolean qualifier to node-set specifications. In the XML
> > > Signature XPath Filtering transform, the node-set specification is
> > > defined as "(//. | //@* | //namespace::*)" and the predicate expression
> > > for that node-set specification is specified through the content of the
> > > <XPath> element.
> > >
> > > For example, this XPath Filtering element
> > >
> > > <XPath>
> > > @signMe='true'
> > > </XPath>
> > >
> > > will result in the following XPath expression being evaluated against
> > > the input node-set:
> > >
> > > (//. | //@* | //namespace::*)[@signMe='true']
> > >
> > > ".
> > >
> > >
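
To make the proposed wording concrete, here is a minimal Python sketch of
how the effective expression in the example above would be assembled. The
names NODE_SET_SPEC and effective_xpath are made up for illustration and
belong to no toolkit; the sketch only builds the string and does not
evaluate it against a document.

```python
# The fixed node-set specification from the proposed text.
NODE_SET_SPEC = "(//. | //@* | //namespace::*)"


def effective_xpath(xpath_element_content: str) -> str:
    """Wrap the character content of <XPath> as a predicate on the node-set spec.

    Surrounding whitespace is stripped only for readability; XPath allows
    whitespace inside the predicate brackets.
    """
    predicate = xpath_element_content.strip()
    return f"{NODE_SET_SPEC}[{predicate}]"


if __name__ == "__main__":
    print(effective_xpath("\n@signMe='true'\n"))
```

For the <XPath> element shown above, this prints
(//. | //@* | //namespace::*)[@signMe='true'].
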
> > 
> > I'm not sure this change is absolutely necessary, since the text in
> > 6.6.3 seems to discuss this in the following paragraph, though I
> > think what you propose could result in a more readable and
> > understandable section:
> > 
> > [[
> > The input required by this transform is an XPath node-set or an
> > octet-stream. Note that if the actual input is an XPath node-set
> > resulting from a null URI or shortname XPointer dereference, then
> > comment nodes will have been omitted. If the actual input is an octet
> > stream, then the application MUST convert the octet stream to an XPath
> > node-set suitable for use by Canonical XML with Comments. (A subsequent
> > application of the REQUIRED Canonical XML algorithm would strip away
> > these comments.) In other words, the input node-set should be
> > equivalent to the one that would be created by the following process:
> >  • Initialize an XPath evaluation context by setting the initial node
> >    equal to the input XML document's root node, and set the context
> >    position and size to 1.
> >  • Evaluate the XPath expression (//. | //@* | //namespace::*)
> > 
> > ]]
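
As a rough illustration of the node-set that the quoted process yields,
the following Python sketch walks a parsed document with the standard
library's xml.dom.minidom and lists every node the expression
(//. | //@* | //namespace::*) would select. This is an approximation under
stated assumptions: minidom has no XPath namespace axis, so namespace
nodes are reconstructed here from the xmlns declarations in scope, and
namespace and attribute nodes are sorted only to make the output stable.
The function name full_node_set and the (kind, label) tuples are
hypothetical.

```python
from xml.dom import Node, minidom

XML_NS = "http://www.w3.org/XML/1998/namespace"


def full_node_set(xml_text):
    """List (kind, label) pairs approximating (//. | //@* | //namespace::*)."""
    doc = minidom.parseString(xml_text)
    found = []

    def visit(node, in_scope):
        kind = node.nodeType
        if kind == Node.DOCUMENT_NODE:
            found.append(("root", "/"))
        elif kind == Node.ELEMENT_NODE:
            found.append(("element", node.tagName))
            in_scope = dict(in_scope)  # children inherit a copy
            attributes = []
            for name, value in node.attributes.items():
                if name == "xmlns" or name.startswith("xmlns:"):
                    prefix = name[6:]  # "" means the default namespace
                    if value:
                        in_scope[prefix] = value
                    else:
                        in_scope.pop(prefix, None)  # xmlns="" undeclares
                else:
                    attributes.append(name)
            # XPath document order: namespace nodes, then attribute nodes.
            for prefix in sorted(in_scope):
                found.append(("namespace", f"{prefix}={in_scope[prefix]}"))
            for name in sorted(attributes):
                found.append(("attribute", name))
        elif kind in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            found.append(("text", node.data))
        elif kind == Node.COMMENT_NODE:
            found.append(("comment", node.data))
        elif kind == Node.PROCESSING_INSTRUCTION_NODE:
            found.append(("processing-instruction", node.target))
        else:
            return  # e.g. a DOCTYPE, which is not part of the XPath data model
        for child in node.childNodes:
            visit(child, in_scope)

    # Every element has the "xml" prefix in scope.
    visit(doc, {"xml": XML_NS})
    return found


if __name__ == "__main__":
    for entry in full_node_set('<a xmlns:p="urn:p" id="1"><!--c--><p:b>hi</p:b></a>'):
        print(entry)
```

Comment nodes are kept, matching the "Canonical XML with Comments"
wording in the quoted paragraph.
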
> > 
> > I think what you have suggested is much clearer. If we adopt it, do we
> > need to simplify/revise this second paragraph and the rest of the
> > section as well? Can you please look at the whole section and
> > indicate what other changes you would make (or did you plan to remove
> > the existing material and replace with your proposal?)
> > 
> The problem is that the text, in its original form, which I suggested
> changing, is incorrect. Yes, the text in the spec after it is relatively
> correct. The changed text I propose attempts to provide a technically
> correct introduction to what is detailed later.
> 
> I think we do need to make the subsequent text clearer wrt namespace
> handling and maybe provide some more examples. It would also seem to me
> that toolkits should provide a way to make it easy for application
> developers to see exactly what octet stream is hashed. Btw, a question
> for implementors: does your toolkit provide a way to get the octet
> stream that is ultimately hashed?
> 
> 
> 
> > 
> > > Right now it seems to me that a complete, unambiguous understanding of
> > > the core specifications requires looking at the interoperability docs
> > > (which I almost never do as I'm not an implementor). I think the above
> > > changes would help clarify the specs.
> > >
> > > Again, let me know if my understanding is amiss.
> > >
> > > Ed
> > >
> > >
> > >
> > > On Tue, 2009-07-28 at 18:19 -0400, Scott Cantor wrote:
> > >> Ed Simon wrote on 2009-07-28:
> > > >>> The English language, alas, makes it a little ambiguous as to
> > > >>> whether this means an XPath node-set has to be a well-formed XML
> > > >>> document or only octet streams have to be well-formed XML
> > > >>> documents. My understanding is that it is generally taken that the
> > > >>> input has to be a well-formed XML document whether it is an XPath
> > > >>> node-set or an octet stream. (If so, we should clarify that in the
> > > >>> Canonicalization specification.)
> > >>
> > > >> Pretty (as in 100%) sure that's NOT the case. C14N is such a pain
> > > >> because it's NOT assumed to be anything but a totally arbitrary
> > > >> node set. In the octet stream case, it's a well-formed document,
> > > >> but not otherwise.
> > >>
> > > >>> I also believe it would be sensible to support XPath expressions
> > > >>> that return generic XPath node sets. I'm guessing most
> > > >>> implementations do this but I'd like to hear how. For example, what
> > > >>> is the prescribed treatment of the following examples of node sets
> > > >>> returned by an XPath Filtering transform in order to produce a
> > > >>> hashable octet stream?:
> > >>
> > > >> In all cases, you apply c14n. I think you got lost because you took
> > > >> a wrong turn on the input rules there.
> > >>
> > >> -- Scott
> > >>
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > 
> >
Received on Wednesday, 9 September 2009 23:31:45 UTC