RE: Schema Centric Canonicalization algorithm



Thanks so much for taking the time to correspond. Here are some further thoughts on your thoughts.


Regarding the comment about exc-c14n and the default XML namespace prefix: I am glad to hear that this was rectified in the CR; we will update SCC14N accordingly in our next available draft.


Regarding canonicalizing esoteric node sets, such as nodes set with just a single attribute: I am at a loss to suggest as to how best to attempt to remedy c14n and exc-14n to correct for this; I assume you're much more adept in that area than I. However, were those my specifications to edit I would pretty strongly feel that that careful, specific attention should at least be drawn therein to this issue, since in these situations information is clearly lost during the canonicalization process (that is, the namespace identities) and this directly leads to a situation where contrary to the nature of signatures one can sign one node set yet successfully verify against another totally different one. This to me seems a security issue clearly worthy of note, even if we at this point can't be particularly sanguine about the times in which someone might actually attempt this. IMHO, it's not a matter of simply 'operating best', but a genuine issue of security (albeit pragmatically probably a minor one) that should be documented.


Regarding the motivation for namespace prefix desensitization: one big motivation for this (an immediate one for UDDI) comes concretely from the desire to use XML as a communications protocol to servers which may chose to store XML in other than its literal form, yet preserve signatures through a round-trip. For example, they may shred incoming XML into a relational structure, completely discarding the original XML after having done so. In such situations, one can usually design reasonable relational tables that straightforwardly mirror the semantic content of the data being communicated, and write shredding and reconstitution code to handle the translation. However, the additional burden imposed here if one is further required to preserve actual namespace prefixes used is large and significant: ns declarations can exist on each and every element, perhaps with many and several declarations for the same prefix, scoped according to the elemental hierarchy. Recording this in the relational store is relatively hard and difficult, and is often in tension relational-schema-wise with the attempt to model the semantics of the data in question. And most often (indeed, I do not know of a case otherwise) it is at its heart semantically pointless, since per the XML ns spec no semantics are attached to the actual prefix used, if only we can reliably locate and find them all. Thus, SCC14N provides a means to reliably do this, or reliably know when one is in a situation where one cannot.


Assuming the one can in fact reliably locate all uses of a namespace prefix during a rewriting process, I am not aware of any application (none at all) that would object to such a rewriting being done. That is, I am not aware of any that intrinsically rely on the use of specific prefixes. Do you know of any?


Regarding [normalized value]: If I understand correctly, there seems to be no actual issue here for us with respect to SCC14N. Moreover, I am greatly confused as to the confusion. Perhaps XSV has a bug, but there is no language in XML Schema Structures that I can find that writes to [normalized value] for default values or otherwise. Furthermore, Element Default Value (3.3.5) <>  (for example) clearly indicates that the default appears in the [schema normalized value], not [normalized value]. I remain at a loss as to where the original misunderstanding arose.


Regarding CDATA: I would agree that the information models of Infoset, XPath (ie: node set), and (apparently) DOM are different, that at times this causes difficulties for some, and that this is regrettable. However, the particular CDATA issue noted doesn't arise in the node set to Infoset conversion that we must in SCC14N undertake (due to the XML DSIG transformation requirements), and SCC14N being internally completely Infoset based also has no internal such conversion problems. That dealing with the DOM brings these issues to the fore is regrettable, but in the end personally I think simply reflects on the pragmatic utility of the DOM API to applications.


Regarding Canonicalizing URIs: Your text and reference are intriguing. We'll take another look as to whether it seems worthwhile for us to reconsider our present no-action thoughts.


I look forward to our further correspondence on this subject.





-----Original Message-----
From: Joseph Reagle [] 
Sent: Tuesday, March 05, 2002 3:22 PM
To: Bob Atkinson;;
Subject: Re: Schema Centric Canonicalization algorithm



I've had a chance to quickly review of the whole of the document and these 

comments augment/qualify those that I made earlier [1]. (And these are 

relevant to some tweaks we could make to exc-c14n, so I welcome xmldsig 

comments too!)



Received on Tuesday, 12 March 2002 18:27:45 UTC