Re: Poll on Exclusive Canonicalization from Donald E. Eastlake 3rd on 2001-06-20 (w3c-ietf-xmldsig@w3.org from April to June 2001)

From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
Date: Wed, 20 Jun 2001 09:14:08 -0400
To: "John Boyer" <JBoyer@PureEdge.com>
cc: "Donald E. Eastlake 3rd" <lde008@dma.isg.mot.com>, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Message-Id: <200106201314.JAA0000066243@torque.pothole.com>
Hi John,

From:  "John Boyer" <JBoyer@PureEdge.com>
Date:  Mon, 18 Jun 2001 15:45:39 -0700
Message-ID:  <7874BFCCD289A645B5CE3935769F0B520C33F2@tigger.PureEdge.com>
To:  "Donald E. Eastlake 3rd" <lde008@dma.isg.mot.com>,
            "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>

>Hi Don,
>
><Don>
>>So, namespaces used by XPath expressions in XPath elements would need to
>>be declared.  Doesn't seem like a big problem to me.
>
>Do you have any suggestions here? Would an IncludeNS element, as content
>of exclusive canonicalization algorithm elements, with an attribute
>whose value is a list of prefixes (NMTOKENS) to be considered used, even
>though those prefixes do not appear to be used, do the trick?  So you
>might have
>  <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#excludeC14N">
>    <IncludeNS Prefixes="foo bar etc"/>
>  </Transform>     
></Don>
>
><john>
>It seems that, at a minimum, the IncludeNS parameter would be a good
>improvement.  I don't see how to avoid referring directly to namespace
>prefixes given that they are directly used in such things as XPath
>expressions.  

The IncludeNS parameter seems like a natural minor extension to my
specification. It really just directs the exclusive canonicalization
to pretend that the listed Prefixes are "used" within the XML being
canonicalized even if it can't "see" that use. And it fits with the
NMTOKENS attribute type.

Based on your other comments, maybe the parameter should be called
AdditionalNS because IncludeNS may give the false impression that it
lists all the namespaces to include.
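
As a rough illustration of the "pretend these prefixes are used" rule
(a sketch only, not text from the proposal), the Python below decides
which namespace declarations to write on a single element. The Elem
class, the function names, and the rule of skipping declarations that
an output ancestor has already written are all assumptions made for
the example.

```python
from dataclasses import dataclass, field


@dataclass
class Elem:
    """Hypothetical minimal element model used only for this sketch."""
    qname: str                                    # e.g. "doc:para" or "para"
    attrs: dict = field(default_factory=dict)     # attribute qname -> value
    nsdecls: dict = field(default_factory=dict)   # prefix -> URI declared here


def prefix_of(qname):
    """Return the prefix of a qualified name, or "" if it has none."""
    return qname.split(":", 1)[0] if ":" in qname else ""


def visibly_used(elem):
    """Prefixes the canonicalizer can 'see': element and attribute names."""
    used = {prefix_of(elem.qname)}
    used |= {prefix_of(name) for name in elem.attrs if ":" in name}
    return used


def emitted_ns(elem, in_scope, already_out, include_ns=()):
    """Namespace declarations to write on elem.

    in_scope:    prefix -> URI inherited from ancestors
    already_out: prefix -> URI already written by an output ancestor
    include_ns:  extra prefixes to treat as used (the IncludeNS list)
    """
    scope = {**in_scope, **elem.nsdecls}
    wanted = visibly_used(elem) | set(include_ns)
    return {p: scope[p] for p in sorted(wanted)
            if p in scope and already_out.get(p) != scope[p]}


# The prefix "foo" is only used inside an attribute value, so it is not visible.
inherited = {"doc": "urn:doc", "foo": "urn:foo", "bar": "urn:bar"}
apex = Elem("doc:para", attrs={"select": "foo:x"})
without_list = emitted_ns(apex, inherited, {})
with_list = emitted_ns(apex, inherited, {}, include_ns=["foo"])
```

Here without_list keeps only the "doc" declaration, while with_list
also keeps "foo"; the unused "bar" is dropped in both cases.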

>Matters worsen, however, if the XPath (or whatever opaque concoction
>appears in an Object) refers to the namespace URI and not the namespace
>prefix.  E.g., what if there were an XPath transform that essentially
>said include each node if the result of namespace-uri() is equal to
>namespace-uri(here()), or some similar expression that queried the
>namespace context from which you are trying to make exclusions.

I'm not claiming that any canonicalization, certainly not the one I
specified, is a 100% solution. If people can do arbitrary
calculations, it can theoretically get very complicated, especially if
people can call functions to dynamically compose and parse additional
XML at run time.  But this seems like it would be a very rare case, one
for which the effort of defining a custom canonicalization would be
reasonable. URIs are a lot messier than prefix tokens, and trying to
add an additional mechanism for them just doesn't seem worth it.

I think my basic proposal is a 90%+ solution because it could be used
directly in many cases and, given its availability, more cases could
be trivially changed so that it filled their needs.  With IncludeNS
added, or whatever, I think it's a 98%+ solution for the same reasons.

In the absence of the specification of an exclusive canonicalization,
every protocol is forced to expend the effort to devise and maintain
its own. And having done so, they are unlikely to ever switch to a
standard exclusive canonicalization.  On the other hand, if an
exclusive canonicalization exists, then the more prominent it is and
the higher the level of implementation requirement, the more protocols
will be naturally designed so that exclusive canonicalization fits
their need.

>In general, I found this problem quite vexatious in the past because it
>would be far easier if we could simply tell whether a namespace node
>exists by virtue of inheritance versus redeclaration in the source
>document.  If we could do that, then we wouldn't need to worry about
>whether something is being used and could have instead simply said that
>c14n has a mode whereby it only writes declared namespace nodes.
></john>

Yes, it is unfortunate that the XPath data model is so destructive of
namespace declaration location information. But, even if it were not
and we had a c14n mode such as you describe, you would not have a 100%
solution. People could still write XML where the canonicalized node
did not operate properly without namespace declarations that appear in
ancestors. Of course, you could just tell them they had to also
declare them within the XML being canonicalized. But in this case, it
would always work for them to declare them at the apex of the
canonicalized XML, and declaring them there would be exactly the same
as using my current proposal and listing their prefixes in an
IncludeNS parameter.

>>Use of exclusionary c14n outside of the limited context of signing
>>SignedInfo seems problematic from an implementer's standpoint.  What is
>>the plan here?
>
>I don't understand your question. As I proposed them, these seem to me
>to be well defined algorithms that can operate anywhere inclusive
>canonicalization can be used.  What's problematic?
>
><john>
>It just seemed that figuring out how to specify what namespaces to keep
>would be harder in the general case, regardless of how it was done.
>Consider the IncludeNS parameter for an exclusionary c14n of an
>externally referenced document.  Aside from the difficulty of
>determining which namespaces it actually uses (e.g. the prefixes used in
>XPath expressions), it seems you'd have to go get the document just to
>find out what namespaces are used in its start tags so you could set up
>the IncludeNS parameter, which would need to be done before the core
>processing.

If you have remote XML, you have to read it as XML anyway. (If you are
treating it as a lump of opaque binary data, you don't have to worry
about canonicalization at all.) If you are canonicalizing a whole
document, so there is no question of it being surrounded by other XML,
you would use the current inclusive canonicalization. So it's only
where you are canonicalizing an external subdocument that this
question arises.  In which case, as I say, you have to have read it in
as XML to some sort of node structure at signature generation time
anyway.  You don't have to use exclusive canonicalization unless the
context of the subdocument can change before verification of the
signature.  If it can, you don't have to use IncludeNS with the
exclusive canonicalization unless there are namespaces the XML
requires which are declared in an ancestor and which can not be "seen"
to be used as an ordinary prefix.  IncludeNS is just a list of
prefixes to consider to have been used.

>Then, there's the case of a Reference within the same document, but with
>an XPath transform that includes a forest consisting of an arbitrary
>number of possibly pruned subtrees of the document.  This seems more
>generalized than applying the method to SignedInfo, which is one subtree
>with no pruning.  Maybe it isn't so bad, but I just haven't thought
>through cases like what happens if the so-called apex node is not in the
>node-set or what happens if the intervening node that declares a
>namespace is not in the node-set.  Maybe these cases just go through
>with the same "Then, you shot yourself in the foot" response.  

Well, it's not really that bad, I think.  Dropping an intervening node
that declares a namespace causes no unusual problem. Namespace
declarations are smeared to all descendants by the XPath data model
(unless blocked by a redeclaration of the prefix). So the declaration
will appear in a descendant of the dropped node. If it doesn't appear
at an output apex node, exclusive and inclusive canonicalization are
identical.
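
The inheritance behaviour described above can be sketched in a few
lines of Python. This is only an illustration under simplifying
assumptions: the (name, declarations) path representation and the
function name are made up for the example, and xmlns="" undeclaration
of the default namespace is not modeled.

```python
def in_scope_namespaces(path):
    """Namespace nodes the XPath data model attaches to the last node.

    path: list of (element name, {prefix: URI declared on that element})
          from the document root down to the node of interest.
    """
    scope = {}
    for _name, nsdecls in path:
        # A redeclaration of a prefix replaces the inherited binding.
        scope.update(nsdecls)
    return scope


path = [
    ("root", {"a": "urn:a"}),
    ("middle", {"b": "urn:b"}),   # intervening node that may be dropped
    ("leaf", {}),
]
leaf_scope = in_scope_namespaces(path)

# Same tree, but the leaf redeclares prefix "a".
redeclared_scope = in_scope_namespaces(path[:2] + [("leaf", {"a": "urn:other"})])
```

The leaf carries its own namespace node for "b" even though the
declaration was written on "middle", so leaving "middle" out of the
node-set does not lose the binding.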

I'm not sure what you are referring to when you say "the apex node is
not in the node-set" as the special actions in my exclusive
canonicalization at an "apex" node refer to an apex in the serialized
output, not an apex in the input XPath node set.

>Finally, I guess I'm a little concerned about whether the proposed
>method of deparenting the apex node is "supposed" to work or if it works
>by happenstance in the implementations tried so far.  Has anything been
>done to see whether this behavior is something we can count on going
>forward?

You mean the suggestion that DOM level two removeChild could be used
in some implementations? I don't know. It was there as a hint and
could be removed.  I specified exclusive canonicalization in terms of
the XPath data model used in inclusive canonicalization.  It does not
require any DOM stuff.

>Cheers,
>John Boyer
>Senior Product Architect, Software Development
>Internet Commerce System (ICS) Team
>PureEdge Solutions Inc. 
>Trusted Digital Relationships
>v: 250-708-8047  f: 250-708-8010
>1-888-517-2675   http://www.PureEdge.com
></john>

Thanks,
Donald
Received on Wednesday, 20 June 2001 09:15:08 UTC