- From: <noah_mendelsohn@us.ibm.com>
- Date: Fri, 27 Dec 2002 14:17:27 -0500
- To: Steven Taschuk <staschuk@telusplanet.net>
- Cc: www-xml-schema-comments@w3.org
Steven Taschuk writes:

> > > Trolling through the archives, I find a suggestion that
> > > canonicalization is useful in the context of signed
> > > XML [...]
> >
> > Hard to comment without seeing the note in question. [...]
>
> Fair enough. I refer to "XML Schema and the necessity for
> canonical representations", <dee3@us.ibm.com>, 1999-05-21:
> <http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999AprJun/0060.html>
>
> I gather that that note was written fairly early in the
> process, to argue for the need for canonical
> representations in the first place. Digital signatures
> are just one example of an application for which
> canonicalization issues are important; others certainly
> exist, and I have no particular stake in signatures
> specifically.

Don was not a member of the workgroup at the time he wrote (or ever,
as far as I know). I believe he was providing input reflective of one
set of concerns, namely that DSIG is among the situations in which
certain types of c14n transformations might be useful. I think it's
fair to say that these were NOT the reasons that actually led to the
inclusion of canonical forms in the schema rec. My opinion is that the
reasons were closer to the ones expressed in my earlier note. I think
we felt that it would be for the security community to gather
requirements as to what should and should not be signed for various
applications of DSIG, and we certainly did not go through such a
requirements process.

> XML Schema implies a model of what XML documents
> consist of;

Really? I think you would find a lot of disagreement, at least from
some members of the schema team. Schema provides a definition of the
assessment relation, which allows you to determine if an element is
valid per an element declaration or complex type. Schema's model of
the document is the input infoset, which is character based.
Schema makes available certain additional information as a byproduct
of assessment, including indication of whether attributes were
defaulted and if so with what value, etc. This additional information
is placed into the so-called PSVI. Schema does not directly include
"values" from the datatypes value space in the PSVI. (Though I agree
that such values are in all cases determinable, and I think it would
have been coherent to include them in the PSVI.) I would say that
Schema is careful NOT to give you a new model of the document, though
it surely creates building blocks from which such models could be
derived. Would such a model include defaulted attributes? Maybe.
Equivalence of 100 and 1E+2? Maybe. Other groups such as Query have
looked into building data models based on the information available
from schema assessment, but the schema WG certainly did not define
such a model, in my opinion anyway.

> I feel it is desirable to be able to write such a
> canonicalizer for the equivalence relation under which
> documents are equivalent if they differ only in ways
> not reflected in that model. Among other things, this
> includes the use of alternative lexical representations
> for the same value.

Well, I think it is possible to define and implement many such
canonicalizers. The question is: which ones should be standardized,
and by whom?

> Now, how should such a canonicalizer canonicalize
> representations of user-defined simple types? A naïve
> implementation would apply algorithms appropriate for
> the built-in types from which they are derived -- if
> this approach were sound, it would have the merit of
> being applicable to any simple type whatsoever
> (provided schema information were available). My
> onTheHour example, however, shows that this approach
> can generate "canonical" documents that are not
> schema-valid.

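To make the concern in the quoted paragraph concrete, here is a minimal sketch in Python. The definition of onTheHour is not reproduced in this message, so the sketch uses a hypothetical type instead: `wholeDouble`, assumed to be a restriction of xs:double with the pattern facet `\d+`. The helper `canonical_double` is also an assumption, a simplified stand-in for the canonical lexical form of xs:double that handles finite values only and does not round to double precision. It ties in with the 100 versus 1E+2 example above.

```python
import re
from decimal import Decimal

def canonical_double(lexical: str) -> str:
    """Simplified canonical lexical form for a finite xs:double.

    Assumption: one digit before the decimal point, at least one after,
    then "E" and an exponent with no plus sign or leading zeros.
    INF, NaN, negative zero and double-precision rounding are ignored.
    """
    value = Decimal(lexical.strip())
    if value == 0:
        return "0.0E0"
    sign, digits, exponent = value.normalize().as_tuple()
    mantissa = "".join(str(d) for d in digits)
    sci_exponent = exponent + len(mantissa) - 1
    fraction = mantissa[1:] or "0"
    return f"{'-' if sign else ''}{mantissa[0]}.{fraction}E{sci_exponent}"

# Hypothetical user-defined type: xs:double restricted by pattern \d+.
# Pattern facets constrain the lexical form and are implicitly anchored.
WHOLE_DOUBLE_PATTERN = re.compile(r"\d+")

def is_valid_whole_double(lexical: str) -> bool:
    """Check a lexical form against the hypothetical pattern facet."""
    return WHOLE_DOUBLE_PATTERN.fullmatch(lexical) is not None

original = "100"
canonical = canonical_double(original)  # naive: use the base type's rule
print(original, is_valid_whole_double(original))
print(canonical, is_valid_whole_double(canonical))
```

Under these assumptions the original lexical form "100" satisfies the pattern, while the form produced by the base type's canonicalization rule does not, so the "canonical" document would fail validation against the derived type.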
I think that I have acknowledged that the WG is well aware of the
concern that certain user-defined types have no canonical form, and
that there are applications of XML schema (such as DSig), for which
this state of affairs is a compromise. Someone else in the WG will
have to remind me of exactly where we stand in considering this, but I
believe it was or is being reviewed as a possible concern for either
1.1 or 2.0 (should we decide to build such versions).

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Friday, 27 December 2002 14:21:21 UTC