RE: Clarifying XPath Filtering Transform text (pertains to Action-350, etc.)

From: Ed Simon <edsimon@xmlsec.com>
Date: Tue, 28 Jul 2009 20:31:22 -0400
To: Scott Cantor <cantor.2@osu.edu>
Cc: 'XMLSec WG Public List' <public-xmlsec@w3.org>
Message-Id: <1248827482.3601.68.camel@XMLSEC-BIZ.phub.net.cable.rogers.com>
OK, so I'm looking at the C14N test cases in 


and I see there are some that introduce non-well-formed XML node-sets as
input to canonicalization. However, I don't see any examples in the C14N
1.1 specification that deal with non-well-formed input node-sets and
much of the text seems to imply input node-sets represent well-formed
docs. (Let me know if you see something to the contrary.)

So, I would suggest the following:

1) That the line in the W3C Canonicalization 1.1 specification that
states

"The first parameter of input to the XML canonicalization method is
either an XPath node-set or an octet stream containing a well-formed XML
document."

be changed to

"The first parameter of input to the XML canonicalization method is
either an octet stream or an XPath node-set. If it is an octet stream,
the octet stream must contain a well-formed XML document."

2) Add some examples of canonicalizing non-well-formed input node-sets
to the C14N specification (they can be taken from the interop test
cases).

3) A minor typo: look for any instances of "node set" in the C14N
specification that should be "node-set".
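
To make the rewording proposed in point 1 concrete, here is a minimal
Python sketch of the input rule as I read it: an octet stream must parse
as a well-formed XML document, while a node-set is accepted as-is. The
function name check_c14n_input and the returned labels are hypothetical,
and the standard-library parser is used only as a well-formedness check;
this is not taken from any C14N implementation.

```python
import xml.etree.ElementTree as ET


def check_c14n_input(first_param):
    """Classify the first C14N parameter under the proposed wording.

    Hypothetical helper: bytes are treated as an octet stream and must be
    a well-formed XML document; anything else is treated as an XPath
    node-set, on which no well-formedness requirement is placed.
    """
    if isinstance(first_param, (bytes, bytearray)):
        try:
            # The parser raises ParseError on non-well-formed input.
            ET.fromstring(bytes(first_param))
        except ET.ParseError as exc:
            raise ValueError("octet stream is not well-formed XML") from exc
        return "octet-stream"
    # Node-set case: arbitrary collection of nodes, no check applied.
    return "node-set"


print(check_c14n_input(b"<doc><item/></doc>"))
print(check_c14n_input([]))
```

The point of the sketch is only that the well-formedness test sits on
the octet-stream branch and nowhere else.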

Now, re the description of XPath Filtering in the XML Signature
specification. What one specifies in the <dsig:XPath> element is not a
complete XPath expression but an XPath predicate expression that, when
evaluated, is prefixed with "(//. | //@* | //namespace::*)" to form the
actual XPath expression that will be evaluated.

As such, I propose the first sentence in section 6.6.3 which states

"The normative specification for XPath expression evaluation is [XPath].
The XPath expression to be evaluated appears as the character content of
a transform parameter child element named XPath."

be changed to

"The normative specification for XPath expression evaluation is [XPath].
[XPath] defines "predicate expressions" (have link) which provide a
boolean qualifier to node-set specifications. In the XML Signature XPath
Filtering transform, the node-set specification is defined as "(//.
| //@* | //namespace::*)" and the predicate expression for that node-set
specification is specified through the content of the <XPath> element.

For example, this XPath Filtering element 


will result in the following XPath expression being evaluated against
the input node-set:

(//. | //@* | //namespace::*)[@signMe='true']
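
The concatenation described above can be sketched in a few lines of
Python, assuming the reading proposed in this message (the <XPath>
content is treated as a predicate on a fixed node-set specification).
The names NODE_SET_SPEC and build_filter_expression are hypothetical,
and a real implementation need not build a string at all.

```python
# Fixed node-set specification quoted in the text above.
NODE_SET_SPEC = "(//. | //@* | //namespace::*)"


def build_filter_expression(predicate: str) -> str:
    """Attach the <XPath> element content as a predicate on NODE_SET_SPEC."""
    return f"{NODE_SET_SPEC}[{predicate.strip()}]"


# Element content from the example above.
print(build_filter_expression("@signMe='true'"))
```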


Right now it seems to me that a complete, unambiguous understanding of
the core specifications requires looking at the interoperability docs
(which I almost never do as I'm not an implementor). I think the above
changes would help clarify the specs.

Again, let me know if my understanding is amiss.


On Tue, 2009-07-28 at 18:19 -0400, Scott Cantor wrote:
> Ed Simon wrote on 2009-07-28:
> > The English language, alas, makes it a little ambiguous as to whether
> > this means an XPath node-set has to be a well-formed XML document or
> > only octet streams have to be well-formed XML documents. My
> > understanding is that it is generally taken that the input has to be a
> > well-formed XML document whether it is an XPath node-set or an octet
> > stream. (If so, we should clarify that in the Canonicalization
> > specification.)
> Pretty (as in 100%) sure that's NOT the case. C14N is such a pain because
> it's NOT assumed to be anything but a totally arbitrary node set. In the
> octet stream case, it's a well-formed document, but not otherwise.
> > I also believe it would be sensible to support XPath expressions that
> > return generic XPath node sets. I'm guessing most implementations do
> > this but I'd like to hear how. For example, what is the prescribed
> > treatment of the following examples of node sets returned by an XPath
> > Filtering transform in order to produce a hashable octet stream?:
> In all cases, you apply c14n. I think you got lost because you took a wrong
> turn on the input rules there.
> -- Scott
