RE: Schema Centric Canonicalization algorithm from John Boyer on 2002-03-07 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: John Boyer <JBoyer@PureEdge.com>
Date: Thu, 7 Mar 2002 13:58:33 -0800
To: "Bob Atkinson" <bobatk@Exchange.Microsoft.com>, "Joseph Reagle" <reagle@w3.org>, "Donald Eastlake" <dee3@torque.pothole.com>, "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, "Noah Mendelsohn" <Noah_Mendelsohn@lotus.com>, "David Beech" <David.Beech@oracle.com>, "Murray Maloney" <murray@muzmo.com>, "DeMartini, Thomas" <Thomas.DeMartini@CONTENTGUARD.COM>, "Wang, Xin" <Xin.Wang@CONTENTGUARD.COM>, "XML Signature" <w3c-ietf-xmldsig@w3.org>
Cc: "Maryann Hondo" <mhondo@us.ibm.com>, "Aissi, Selim" <selim.aissi@intel.com>, <uddi-wg@yahoogroups.com>, <uddi-security@yahoogroups.com>, "Allen Brown" <allenbr@microsoft.com>, "David Turner" <dturner@microsoft.com>, "Brian LaMacchia" <bal@microsoft.com>, "Barb Fox" <bfox@Exchange.Microsoft.com>, "M. Paramasivam" <parama@microsoft.com>, "Martha Nalebuff" <marthana@microsoft.com>, "James Utzschneider" <jamesu@microsoft.com>, "Chris Kaler" <ckaler@microsoft.com>, "Giovanni Della-Libera" <giodl@microsoft.com>, "Philip DesAutels" <philipda@microsoft.com>
Message-ID: <7874BFCCD289A645B5CE3935769F0B52328411@tigger.PureEdge.com>

I may have time to fully review your spec later, but in skimming through
it, one thing caught my eye immediately.
 
In the limitations, you state "Canonical XML contains a security hole
having to do with how it processes certain esoteric node-sets."  I would
think you should reword this in some way.  Canonical XML does not have
any security holes; it is a method for serializing any subset of the
nodes of an XML document.  *IFF* you choose to use the resulting
document subset in a security context, then there may be a security
problem.  
 
However, both in the example you give and in an infinitude of other
scenarios, the alleged security hole comes neither from C14N nor the
surrounding security context (e.g. DSig), but rather from an
interpretation of the markup from which the document subset is drawn.
Your example about isolated attribute nodes has always been well-known
to me and is in the same class as the following:
 
Suppose you have a document 
 
<person>
    <home>  <addr>...</addr> </home>
    <business>  <addr>...</addr> </business>
</person>
 
and you select a node-set consisting of all <addr> elements and their
content descendants, attributes, etc.  The c14n of the node-set would be
something like:
 
<addr>...</addr><addr>...</addr>
 
What's wrong with this?  Well, we serialized the addr elements but we
have lost some of their meaning because the XPath expression used to
obtain the node-set failed to include nodes pertinent to the semantics
of the <addr> elements.  Your lone attribute examples are the same to
me.  If there is a node, attribute or otherwise, whose interpretation is
critically dependent on some other XML node, be it ancestor or sibling
or otherwise, then the signature author introduces a security problem
within his application by failing to write the XPath expression such
that the required nodes are included.
 
It makes no sense to write an XPath expression that excludes required
nodes and then point out that there is a security hole in the serializer
because it does not serialize the nodes that were excluded.
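
To make the <addr> example concrete, here is a minimal sketch in Python.
It is not an implementation of Canonical XML: it only approximates
"serialize the selected subset" with the standard library's ElementTree,
and the address values, the REORDERED document and the helper name
serialize_subset are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical address values; the original example elides them as "...".
DOC = """<person>
    <home>  <addr>1 Main St</addr> </home>
    <business>  <addr>9 Dock Rd</addr> </business>
</person>"""

# Same <addr> elements in the same order, but under different parents,
# so "1 Main St" is now the business address.
REORDERED = """<person>
    <business>  <addr>1 Main St</addr> </business>
    <home>  <addr>9 Dock Rd</addr> </home>
</person>"""


def serialize_subset(xml_text, path):
    """Serialize only the elements matched by path, in document order.

    A rough stand-in for canonicalizing a document subset; it is not C14N.
    """
    root = ET.fromstring(xml_text)
    parts = []
    for elem in root.findall(path):
        elem.tail = None  # drop the whitespace that follows the element
        parts.append(ET.tostring(elem, encoding="unicode"))
    return "".join(parts)


# Selecting only the <addr> elements loses the home/business distinction:
print(serialize_subset(DOC, ".//addr"))
print(serialize_subset(DOC, ".//addr") == serialize_subset(REORDERED, ".//addr"))

# Selecting the parents as well keeps the two documents distinguishable:
print(serialize_subset(DOC, "./*") == serialize_subset(REORDERED, "./*"))
```

The two documents mean different things, yet the <addr>-only selection
yields identical bytes for both, so a signature over that subset could
not tell them apart.  Widening the selection to include the <home> and
<business> parents fixes this without any change to the serializer.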
 
Cheers, 
John Boyer

-----Original Message-----
From: Bob Atkinson [mailto:bobatk@Exchange.Microsoft.com]
Sent: Wednesday, February 27, 2002 3:41 PM
To: Joseph Reagle; John Boyer; Donald Eastlake; Henry S. Thompson; Noah
Mendelsohn; David Beech; Murray Maloney; DeMartini, Thomas; Wang, Xin
Cc: Maryann Hondo; Aissi, Selim; uddi-wg@yahoogroups.com;
uddi-security@yahoogroups.com; Allen Brown; David Turner; Brian
LaMacchia; Barb Fox; M. Paramasivam; Martha Nalebuff; James
Utzschneider; Chris Kaler; Giovanni Della-Libera; Philip DesAutels
Subject: Schema Centric Canonicalization algorithm



Gentlefolk,

 

As part of our work in defining the forthcoming v3 of the Universal
Description, Discovery, and Integration (UDDI <http://www.uddi.org/>)
specification, we in the UDDI Working Group have found the need to
define a new XML canonicalization algorithm for use in XML digital
signatures <http://www.w3.org/TR/xmldsig-core/>. This algorithm, the
Schema Centric Canonicalization algorithm
<http://www.uddi.org/pubs/SchemaCentricCanonicalization.htm>, attempts
to work hand-in-glove with
the specification and semantics of XML Schema in order to produce a
canonicalized result whose bit string is sensitive only to the inherent
information in an infoset that affects its processing by
schema-assessment. Among many things, for example, sensitivity to
specific namespace prefixes is removed. The resulting algorithm is
particularly useful in maintaining digital signatures in applications
(such as ours) where XML is used as a communications protocol in front
of rich back-end data stores that must persist the data into a
relational or other non-XML store. As such, it is quite likely that this
algorithm will be of general interest to a considerable number of
applications, not just UDDI.
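
As a rough illustration of what insensitivity to namespace prefixes
means, the sketch below compares two documents that differ only in the
prefix bound to the same namespace URI.  It does not implement Schema
Centric Canonicalization; it assumes Python 3.8 or later and uses the
standard library's C14N 2.0 serializer with its prefix-rewriting option
as a stand-in.  The element name and the urn:example namespace are
hypothetical.

```python
import xml.etree.ElementTree as ET

# Two documents with the same namespace URI but different prefixes.
DOC_A = '<a:item xmlns:a="urn:example">42</a:item>'
DOC_B = '<b:item xmlns:b="urn:example">42</b:item>'

# Without prefix rewriting, the canonical forms keep the original
# prefixes, so their byte strings differ.
plain_a = ET.canonicalize(DOC_A)
plain_b = ET.canonicalize(DOC_B)
print(plain_a == plain_b)

# With prefix rewriting, prefixes are replaced by generated ones, so the
# two documents canonicalize to the same string.
rewritten_a = ET.canonicalize(DOC_A, rewrite_prefixes=True)
rewritten_b = ET.canonicalize(DOC_B, rewrite_prefixes=True)
print(rewritten_a == rewritten_b)
```

A digest computed over the prefix-rewritten form would survive a round
trip through a store that does not preserve the original prefixes, which
is the kind of robustness the paragraph above describes.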

 

A draft of the specification of the Schema Centric Canonicalization
algorithm can be found at:

 

http://www.uddi.org/pubs/SchemaCentricCanonicalization.htm

 

At the present time, it is expected that the algorithm will be finalized
and published as part of UDDI v3 when that is completed later this year.
Between now and then, in order to increase the likelihood that the
algorithm will be as useful to ourselves and others as is possible, we
would be grateful to receive any comments, criticisms, or other feedback
that you or others might have. Please direct your comments to me,
Maryann Hondo, and Selim Aissi.

 

Thank you.

 

                Bob Atkinson, Microsoft

                member UDDI Working Group

Received on Thursday, 7 March 2002 16:59:14 UTC