RE: XSL WG comments on XML Signatures from John Boyer on 2000-03-15 (w3c-ietf-xmldsig@w3.org from January to March 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Tue, 14 Mar 2000 17:19:39 -0800
To: "Jonathan Marsh" <jmarsh@microsoft.com>, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Cc: <w3c-xsl-wg@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKGEOGCBAA.jboyer@PureEdge.com>
The primary use in both XSLT and XPointer is to locate nodes relative to a
context node, and thus the LocationPath assumes special importance.  As a
constrained version of Expr, it always returns a nodeset given a context
node.  It seems to me your problem also could benefit from LocationPaths.

<John>
The usual scenario for the XPath transform is to parse the input and obtain
a node-set for serialization.  Therefore it does benefit from LocationPaths
because they can be used within an Expr.
</John>


Yes, Expr could be defined in your spec as the entry point.
In XSLT, certain attributes are defined as Expr and others as LocationPath.
It
should be clarified in the spec exactly which entry point you intend.  I
assumed LocationPath.

<John>
It does not need clarification because we are doing what the spec says.
XSLT needed clarification because they were not always doing what the XPath
specification said.  Your assumption of LocationPath does not constitute it
being 'odd' that I have used the default.
</John>

XPath states: "Expression evaluation occurs with respect to a context. ...
The context consists of: a node (the context node); a pair of non-zero
positive integers (the context position and the context size) ..."

Thus the context position and context size may not be set to zero, and it is
a reasonable assertion that the context node may not be omitted (a null
context node doesn't seem like a context node to me).  I think this goes
beyond "odd".

<John>
So, now it's not simply odd, it's beyond odd!  That's interesting.  Let me
get this right.  It's not odd for XSLT to hi-jack the defined language start
symbol in XPath, but it is *beyond* odd to use an empty node-set as a
starting context (given that the variable bindings and function library
provide a way to create a node-set when needed).

I'm sorry but this sounds like a lot of knit-picking when the reality is
that you simply read the document the way I read it at first and assumed
that everything started with LocationPath, and the authors of XPath simply
did not design it that way (and they seemed to have reason for this given
that XSLT uses Expr as a basis in many places).

As for the starting context of the XPath transform, disallowing an empty
node-set in the starting context is clearly a minor oversight on the part of
the authors of XPath.  It is necessary for XPath implementations to be able
to handle empty node-sets.  Check for example the result of the boolean()
function applied to a node-set.  It is true iff the node-set is non-empty.
Therefore, empty node-sets must exist.  What do you suppose the context
node, position and size should be for empty node-sets?  null, 0 and 0.
</John>

> 1) Everything I did in specifying the XPath transform is a
> kind of extension
> that is permitted by the XPath recommendation.  So, for
> example, I created
> the functions parse() and serialize() because the transform needed
> additional *function*ality, so rather than just making up
> whatever I needed,
> I specified it in terms of a function library addition, which
> is permitted
> by XPath.

I'll grant you that these are legal in an Expr, but not that this will be a
familiar use of XPath to users.  XPath states "The primary purpose of XPath
is to address parts of an XML [XML] document. In support of this primary
purpose, it also provides basic facilities for manipulation of strings,
numbers and booleans."  It seems to me that you have inverted this in your
model.

<John>
I haven't inverted anything.  The XPath specification states that the
starting language symbol is Expr.  I am using the default.  How is that an
inversion???  Furthermore, the primary purpose of the XPath transform is to
address parts of the document.  It is only put together in this particular
way to allow a little more flexibility in use.
</John>

This is of course a philosophical debate, and as such cannot be
"won", but it is our responsibility to at least comment that looks like a
potential source of confusion and inconsistency with existing XPath useage.

> 2) In my original design, I did as you suggested by putting the parsed
> version of the input as the context node.  However, there
> were some nagging
> little problems where people wanted to start with a fragment
> of XML, then
> transform.  Unfortunately, we don't have XML processors that
> work on XML
> fragments.

Do you mean multiple top-level elements?

<John>
Yes, you may have multiple elements as the result of a previous operation,
or you may simply want to add encoding information and a byte order mark to
the front of some UTF-16 markup that would not otherwise parse.
</John>

> So, by making a function parse(), it seemed easy
> to prepend an
> XML declaration (and a byte order mark, if needed) using
> string functions
> available in XPath, then parse the result and use the output
> node-set in
> further location steps.

I'm not completely familiar with your scenario, but doesn't this allow the
user to prepend any old byte order mark and declaration, even if they don't
accurately describe the XML data?  If they aren't prepended correctly, a
parse error could result, right?  It appears that the encoding and BOM are
available as variables, so is it the case that an author must prepend these
variables only (and always) in order for the parse to be successful?  Why
can't this be automated?

<John>
Yes, they can produce an error if they do it wrong, but it isn't like anyone
is ever going to successfully create a signature from an incorrectly
specified transform.

Further, no it can't be automated.  The BOM and encoding variables refer to
the encoding of the XPath expression, not the transform input.
</John>

> 3) In my original design, I assumed that the output would be
> automatically
> serialized.  Then it occured to me that a similar argument to
> that above
> could be applied to say that perhaps some minor fix-up of the
> output would
> be needed and could be done using XPath's string functions.

Scenarios?  If the need for parse() is eliminated, is there still a need for
this?

<John>
Symmetric doesn't mean equal.  parse() allows modification *before* input.
serialize() allows us to wrap changes around it, so we can make changes
*after* the output.
</John>

> So, given that I already felt that the cleanest way, from a
> specification
> standpoint, of adding functionality to XPath was to add
> functions, I now
> also found that I had a use for them.
>
> There are a few additional comments in line below marked by
> <john></john>,
> but hopefully you now see that the current design is works, offers
> additional functionality, and is specified in a way that most closely
> matches the enhancement methods suggested by XPath.
>
> Thanks,
> John Boyer
> Software Development Manager
> PureEdge Solutions, Inc. (formerly UWI.Com)
> jboyer@PureEdge.com
>
>
>
> -----Original Message-----
> From: w3c-ietf-xmldsig-request@w3.org
> [mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Jonathan
> Marsh (by
> way of "Joseph M. Reagle Jr." <reagle@w3.org>)
> Sent: Tuesday, March 14, 2000 11:43 AM
> To: IETF/W3C XML-DSig WG
> Cc: w3c-xsl-wg@w3.org
> Subject: XSL WG comments on XML Signatures
>
>
> The XSL WG took a look at the
> http://www.w3.org/TR/xmldsig-core/ draft,
> especially at the XPath and XSLT transformation sections, and had some
> serious concerns.
>
> The section on XPath Filtering provides a capability that has
> often been
> requested - the ability to take shape a nodeset returned by
> the XPath query
> into an XML result.  We feel the model you provide will be a valuable
> contribution.  However, as consituted, the current
> formulation does not use
> XPath in a way consistent with the XPath recommendation.  We
> recommend a
> substantial redesign of this section.
>
> <john>Obviously, this has been refuted.</john>
>
> Notes on particular issues follow.
>
>
> [6.3.3 XPath Filtering] "The XPath transform output is the result of
> applying an XPath expression to an input string."
>
> Conceptually this is odd.  XPaths is designed to be applied to XML
> documents, not to strings.  Serious problems arise from this later on.
>
> <john>The string contains an XML document.  If it does not,
> then functions
> can be used to fix it up before passing it to parse()</john>
>
> "The XPath expression appears in a parameter element named XPath."
>
> No examples of this are shown; I assume the syntax is:
>
>   <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
>     <XPath>expression/goes/here</XPath>
>   </Transform>
>
> <john>Yes, this is consistent with how all parameters are
> handled, so there
> did not seem to be a need for me to put an example when none
> of the other
> sections contain one. The </john>

I would have appreciated one.  And others in the other sections too.

<John>
Send a letter to the editors.  While you're at it, send a letter to the guys
who wrote the XML spec.
In our case, though, many of our examples are going to be placed in the XML
Signature Scenarios/FAQ document.
</John>

> But the XPath element does not appear to be in the DTD.  An
> example would be
> useful.
>
> <john>True. The editors changed the transforms such that the
> additional
> parameter was needed (the xpath expression used to be the
> direct content of
> the transform), but they have not yet modified the DTD.</john>
>
> "The primary purpose of this transform is to omit information
> from the input
> document that must be allowed to vary after the signature is
> affixed to the
> input document."
>
> Despite this claim, the mechanism uses an XPath to describe
> the information
> that is to be retained, instead of the information that is to
> be omitted.
> Either the mechanism or this description of what is going on should be
> adjusted for consistency.
>
> <john>The result of the XPath is indeed the retained
> material, however, I do
> not see any need for a consistency adjustment.  It is absurd
> to say that the
> omitted information will be the result of the XPath
> expression, since then
> it would not be omitted.
> The purpose of using XPath is that it has a 'not' function
> that allows us to
> precisely define that which should be omitted from the input
> document, and
> the result of those omissions is what we will retain.
> </john>

I'm suggesting that not() is fairly unwieldy if it is the common case.
Instead the filtering mechanism could build "not" into it.  Any nodes
identified by the XPath would be omitted from the result (along with their
children, of course).  This would seem to have a number of advantages:

<John>
All of my transforms can be reduced to using a single call of not(), since
not x and not y and not z = not (x or y or z).
</John>

- simplifies XPaths since you don't need tons of not()s all over the place
- conceptually, this lives up to the name "filtering"
- the default is that the entire tree will pass through -- an empty XPath
query would mean no change
- simplifies serialization algorithm
- improved perf when a generic XPath processor is used (the nodeset of
affected items is much smaller)

But on the other hand - are there situations where you want to filter out an
element, but keep it's children in some way?

<John>
Yes.  We want a scenario where we can extract any subset of the nodes of an
XML document.  My own use often excludes a node and its descendants, but we
are not constraining this transform to just my uses.
</John>

> [6.6.3.1 Evaluation Context Initialization] "A context node,
> initialized to
> null."
>
> Without a context node, the XPath cannot be applied against
> an XML tree.  We
> suggest that an XPath transform parses the document in all
> cases (not just
> when the parse() function is called), and the context node be
> set to the
> root of the parsed XML document.  The context size and
> position can then be
> initialized to 1, consistent with XPointer and XSLT.
>
> <john>It is not necessary to be consisten with XPointer and
> XSLT.  It is
> only necessary that we use XPath in ways permitted by XPath.  I have
> previously shown that this is the case, contrary to your
> claim above</john>

Besides the arguments presented above, I do think consistency with XSLT and
XPointer is desirable from the users perspective.

<John>
What is the inconsistency?  So far, the trivial observation that I start
with an empty node-set, which XPath must support, is the only thing that's
different, other than the fact that XSLT stepped out of the bounds of XPath
by using LocationPath when Expr is the defined root of the language.
</John>

> "(Typically, $input is passed directly to parse(), but if
> $input does not
> contain a well-formed XML document, XPath functions such as
> concat() can be
> used before passing the result to parse())."
>
> The need for this functionality is unclear, but seems to be a
> motivating
> factor in XPath abuse throughout this section.  If indeed
> this need must be
> fulfilled, it should be accomplished by a separate mechanism prior to
> application of XPath to the parsed (thus guaranteed
> well-formed) document.
>
> <john>Why should it be fulfilled by a separate mechanism when
> XPath has
> sufficient string functions to do many fix ups.  If more
> types of fixup are
> required, then we could add additional functions to the XPath
> transform
> library.  That is the appropriate way to extend XPath</john>
>
> "An empty set of namespace declarations. (Note: It is
> possible to address a
> node by its qualified name, even though the evaluation
> context has not been
> initialized with a declaration of the namespace. The XPath
> language provides
> the functions namespace-uri() and local-name() for this purpose)."
>
> It appears quite easy in this syntax (being XML) to allow the user to
> declare namespaces for use within XPaths.  XSLT and XLink
> both provide this
> capability.  Using namespace-uri() and local-name() hinders
> readability and
> impacts performance significantly.  This workaround should
> only be used as a
> last resort, and even then many feel that this mechanism is
> too unwieldy.
> We strongly recommend that a syntax for passing
> author-declared namespace
> bindnigs to the XPath evaluation context be developed.
>
> <john>This is more work than necessary given that we would
> then have to
> define how to resolve conflicts between the author-specified namespace
> declarations versus those appearing in the document</john>

Why?  Pass in all namespaces in scope - only the ones referenced in the
XPath will be needed during evaluation.  And the XML Namespaces mechanism
could be used to ensure there are no conflicts.  E.g.

  <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
    <XPath xmlns:foo="uri:foo">//foo:bar</XPath>
  </Transform>

This comment is based on my experience that not providing this functionality
results in severe backlash from the user community.  And most of this
experience comes from uses where providing namespace bindings is extremely
difficult (e.g. URIs).  Users have not been forgiving about these mitigating
circumstances.  In your case, this seems trivially easy, and I encourage you
to learn from my mistakes.

<John>
Fair enough.  If it is a real issue, then we can add a namespace binding in
precisely the way currently specified in XPointer.
</John>

> [6.6.3.3 Function Library Additions] "Function: node-set
> parse (stringInput,
> boolean LexOrder)"
>
> This function will not work as intended.  The XPath BNF
> prevents functions
> from being used as a location step - they can only appear
> within predicates.
> Thus parse()/x (which appears to be fundamental to your design) is an
> illegal use of a function.  We recommend that the parsed XML
> be provided to
> an XPath processor through the context node instead, with any
> necessary
> parsing controls specified on the XPath element (for example)
> and applied
> prior to XPath execution.
>
> <john>This has been refuted above</john>
>
> "Function:string serialize(node-set); This function converts
> a node-set into
> a string by generating the representative text for each node in the
> node-set."
>
> Under what circumstances would the serialize function NOT be
> called on a
> node-set return?  Since it appears that the vast majority (if not the
> entirety) of XPath Filtering operations will need to call
> this function,
> this capability should probably be built in instead of
> requiring the author
> to call it explicitly.
>
> <john>This has been refuted above</john>
>
> Are the serialization constraints consistent with
> canonicalization?  Is it
> inappropriate simply to say that the output is canonicalized
> instead of
> defining the exact representation here?
>
> <john>Canonicalization is more work.  If you want to
> canonicalize, use the
> c14n transform</john>

Why then constrain the serialization process so much?

<John>
We needed a way to say "Write the node set out as a string" because digest
algorithms work on octet streams (or bit streams or whatever), they don't
work on node-sets.
The material specified is almost the minimum necessary to say "write it
out".  We cover every type of node that is in XPath, and we tell how to
write it, including the issue of dealing with attribute order.  Other than
that, we don't make any changes.  The most offensive change, for example, is
that c14n creates attributes where there were none before, making non-local
changes to the document.

We are not constraining serialization 'so much', we are simply being
complete in specifying how to write a node-set.
</John>

It seems to me that this feature is introducing a kind of
mini-canonicalization and the relationship to full canonicalization is not
clear.

<John>
It is our goal to specify how to write a node-set, not to write a book to
tell you the difference between xpath transform serialization and c14n.  If
you want, you can discover the relation by reading the c14n spec.
</John>

> [6.6.3.4 XPath Transform Output] "serialize(parse($input,
> "true")/descendant-or-self::node()[
>      not(self::SignatureValue and parent::Signature[@id="S1"]) and
>      not(self::KeyInfo and parent::Signature[@id="S1"]) and
>      not(self::DigestValue and ancestor::*[3 and @id="S1"])]"
>
> If this is the intended usage scenario (omitting
> descendants), perhaps a
> mechanism based on XSLT match patterns (a subset of XPath) should be
> pursued.  Combined with an omission semantic instead of a retention
> semantic, the above might be simplified to:
>
>   Signature[@id="S1"]/SignatureValue | Signature[@id="S1"]/KeyInfo |
> *[3][@id="S1"]//DigestValue
>
> <john>Yes the above is an intended usage scenario.  One
> should also note
> that XSLT transforms may not be retained due to difficulties
> that will be
> harder to resolve than they were in XPath.  So, the suggestion to use
> something from XSLT may not be possible.  However, can you
> explain what the
> above expression does?  How exactly is it indicating that we
> want the whole
> document *except for* the SignatureValue in S1, the KeyInfo
> in S1 and the
> DigestValue in S1?
>
> Also, the above looks vaguely like an XPath expression.
> Perhaps you just
> have a different way of writing the same thing.
> </john>

Yes, XSL Patterns are derived from XPath expressions.  I guess match
patterns are a bad idea.  But I'm not convinced the "not" semantics
shouldn't be built in.  For instance:

  Signature[@id="S1"]/SignatureValue |
  Signature[@id="S1"]/KeyInfo |
  Signature[@id="S1"]/SignedInfo/DigestValue

This LocationPath identifies nodes which (along with their children) would
be trimmed from the tree.  I believe this would simplify the description of
the serialization mechanism, as well as provide a "positive" way for the
user to describe what is to be cut.

<John>
The design tenet I used was that I would not be adding to the XPath syntax
in this specification.  I would apply XPath as stated, extending it only in
ways allowed by the XPath recommendation.  If we extend the expression
syntax, then that guarantees that no current implementation could be used.

So, no.  Use the not() function.  That's what it is there for, and I'm quite
sure it can be made efficient because I currently have implementations in
XFDL of a similar concept known as signature filters which use omission
logic, and they are extremely efficient.

John Boyer
Software Development Manager
PureEdge Solutions, Inc. (formerly UWI.Com)
jboyer@PureEdge.com

</John>

> [6.6.4 XSLT Transform] "Identifier:
> http://www.w3.org/TR/1999/REC-xslt-19991116"
>
> <john>I don't care for this transform, so issues about it
> should be taken up
> with the editor</john>
>
> Why is this identifier used instead of the XSLT namespace?  All XSLT
> stylesheets contain version info already.
>
>
> "The Transform element contains a single parameter child
> element called
> XSLT, whose content MUST conform to the XSL Transforms [XSLT] language
> syntax. The processing rules for the XSLT transform are
> stated in the XSLT
> specification [XSLT]."
>
> This seems quite underspecified after the XPath Filtering
> section.  For
> instance, are similar parsing controls needed?  If not, why are they
> necessary in the XPath case?  Are similar serialization
> constraints needed?
> If not, why are they necessary in the XPath case?  Are certain output
> methods required (they are optional in XSLT).
>
>
> [4.3.3.1 The Transforms Element] "<!ELEMENT Transform (#PCDATA)>"
>
> The definition of the "Transform" element is #PCDATA.  This
> will not allow
> an XSLT stylesheet to be included.  The XML Schema defines it
> as "element
> only" but does not define the content.  This definition would
> not allow a
> naked XPath if that is your intent.
>
> In addition, the section mentions that the sequence of
> transformations can
> be XPath, XSLT, or some custom Java algorithm.  It seems rather
> underspecified how this sequence of transformations interact
> (e.g. XSLT
> and XPath operate on nodes and the Java operates on ???).
>
> <john>The output of each transform is the input to the next.
> There are no
> other interactions.  It's all strings of data.</john>
>
> We look forward to working with you to resolve these issues
> in a way that
> meets your needs and is consistent, implementable, and interoperable.
>
> - Jonathan Marsh
>   Microsoft
>
> (With contributions from Alex Milowski.)
>
Received on Tuesday, 14 March 2000 20:17:38 UTC