- From: Frederick Hirsch <frederick.hirsch@nokia.com>
- Date: Mon, 2 Feb 2009 17:46:13 -0500
- To: XMLSec WG Public List <public-xmlsec@w3.org>
- Cc: Frederick Hirsch <frederick.hirsch@nokia.com>
Canonicalization notes entered by Konrad into chat during 14 January F2F, related to ACTION-175

http://preview.tinyurl.com/C14n-Intro

Some general stuff about C14n:

2.5 Canonicalization

"Canonicalizing XML is hard!" (Tim Bray)

To be able to digest XML we need a binary representation or serialization, because only a series of bytes (aka octets) can be signed. Certain aspects of XML's serial representation are left open, and a canonical and reproducible representation is hence required. The goal of canonicalization is to remove any information that is considered certainly insignificant and to define an unambiguous representation for aspects that can be represented in various ways. Such negligible details range from character encoding, line breaks, order of attributes, whitespace in tags and between attributes, and unused namespaces to value normalizations based on a DTD or Schema. Higher forms of canonicalization include the more primitive ones.

The following forms of XML canonicalization can currently be found in standards, drafts and other sources. They are presented here by their level of sophistication and ordered from simple to complex:

* Minimal Canonicalization (MC14n)
* Canonical XML Version 1.0 (C14n)
* Canonical XML Version 1.1 (C14n11), fixing issues analyzed by us and the XMLCORE working group (WG)
* Exclusive XML Canonicalization Version 1.0 (Exc-C14n)
* Schema Centric XML Canonicalization Version 1.0 (ScC14n)

http://tinyurl.com/Why-C14n-is-inefficient :

Namespace Nodes - A namespace node N is ignored if the nearest ancestor element of the node's parent element [O] that is in the node-set and has a namespace node in the node-set with the same local name and value as N. Otherwise, process the namespace [. . . ]

replacing this text with:

Namespace Nodes - To process a namespace node [N], find the first output ancestor element [A] of the node's owning element [O] in reverse document order having an output namespace node [Na] with the same local name as [N] (declaring the same prefix), where [A] and [Na] are in the node-set. If [N] and [Na] have the same value, [N] is ignored; otherwise, process the namespace [. . . ]

simple spec changes to c14n would help w/ namespace handling (ns handling is the big problem)
consider adding some constraints on how nodes are connected in the input to C14N, that could help simplify things too
there are always some types of nodesets that require that you keep all the namespace prefixes. Can't just use a simple stack model b/c of these edge cases
this spec change targets the problems w/ canonicalizing namespace nodes

https://online.tu-graz.ac.at/tug_online/voe_main2.getVollText?pDocumentNr=90836 #page=60 suggests that maybe there could be a C14N v1.2 that is smarter w/ handling namespace nodes

Exc-C14n suffers from not inheriting xml:base, xml:space, and other inheritable attributes. Exc-C14n however is good at processing namespace nodes. C14n is bad at namespace processing.

klanz2: whitespace handling should be dropped in the general case. try to establish some principles on how information should be dropped when doing C14N

https://online.tu-graz.ac.at/tug_online/voe_main2.getVollText?pDocumentNr=90836 #page=101

Be liberal in what you require but conservative in what you do

Translated to XMLDSIG this means: Refer only to what is necessary, and canonicalize as much as possible by default! Saying something is application dependent or expensive is a mere excuse of engineers not trying hard to figure out how to make it robust and efficient.

Principles: designers of user agents such as browsers or XMLDSIG applications have to act as proxies for their end users.
OASIS-DSS allows them to do this centrally in office environments, but the same should apply to decentralized application developers as well:

* Signers should be conservative in what they consider to be the information they want to have secured.
* Intermediaries are invited to process signatures with whatever tools they find appropriate. Be conservative in what you have to touch for processing; especially, do not touch signed documents, and use opaque containers (subsection 3.2.3 on page 57). If yet available <xml> ... </xml> (subsection 4.1.1 on page 79).
* Intermediaries and verifiers: do not touch what was meant to be signed, and hence has been signed, or the signature breaks.
* Verifiers: only what is signed (i.e. DigestInput) should be shown as signed or processed as signed.

Balancing the trade-off between robustness, efficiency and simplicity cannot mean simply resigning and hiding behind a "Do not touch signed documents at all" principle. This will hinder the spreading, processing and passing on of signed content, yes, signed information entities that can be trusted, across the Internet.

@best practices:

It is good practice to use Exc-C14n only for connected node-sets and to declare all used prefixes in the InclusiveNamespacePrefixList. In general it is good practice to use Exc-C14n whenever possible, especially if applications use namespace prefixes only to qualify elements and attributes whose owning element is also in the document subset. Although document subsets (node-sets) containing attributes but not their owning elements have questionable semantics and hence should be avoided, they are nonetheless allowed in XPath and accepted by Exc-C14n. Such node-sets are, however, not suitable for Exc-C14n with respect to the definition of visibly utilized namespace declarations. Adding #default will ensure the correct interpretation of QNames without prefix.
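
For illustration only, the revised namespace-node rule quoted earlier in these notes could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the thesis's or any implementation's actual code: the `Element` class, its fields and the function `namespace_node_is_ignored` are hypothetical names, the node-set is modelled as a flag on each element and on each namespace node, and "first output ancestor in reverse document order" is read as walking up the parent chain. The elided remainder of the rule ("process the namespace [. . . ]") and any special handling of the default namespace are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Element:
    """Hypothetical, simplified element; not a real XPath data model."""
    name: str
    parent: Optional["Element"] = None
    in_node_set: bool = True
    # prefix -> (namespace URI, is this namespace node in the node-set?)
    ns_nodes: Dict[str, Tuple[str, bool]] = field(default_factory=dict)


def namespace_node_is_ignored(owner: Element, prefix: str) -> bool:
    """Decide whether namespace node [N] (prefix on owner [O]) is ignored.

    Walk up from [O] to the first ancestor [A] that is in the node-set and
    has a namespace node [Na] for the same prefix that is also in the
    node-set. [N] is ignored only if [N] and [Na] have the same value.
    """
    uri, _ = owner.ns_nodes[prefix]
    ancestor = owner.parent
    while ancestor is not None:
        entry = ancestor.ns_nodes.get(prefix)
        if ancestor.in_node_set and entry is not None and entry[1]:
            # First output ancestor declaring this prefix decides the outcome.
            return entry[0] == uri
        ancestor = ancestor.parent
    # Assumption of this sketch: with no such ancestor, [N] is processed.
    return False


# Small example tree: "skipped" is outside the node-set, so it is passed over.
root = Element("root", ns_nodes={"a": ("urn:one", True)})
skipped = Element("skipped", parent=root, in_node_set=False,
                  ns_nodes={"a": ("urn:two", True)})
leaf = Element("leaf", parent=skipped, ns_nodes={"a": ("urn:one", True)})
other = Element("other", parent=root, ns_nodes={"a": ("urn:two", True)})
```

In this example `leaf` redeclares the binding already output on `root`, so its namespace node would be ignored, while `other` rebinds the prefix to a different value and would have to be processed. The lookup stops at the first output ancestor that declares the prefix, whatever its value.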
from https://online.tu-graz.ac.at/tug_online/voe_main2.getVollText?pDocumentNr=90836 #page=60

regards, Frederick

Frederick Hirsch
Nokia
Received on Monday, 2 February 2009 22:46:55 UTC