- From: Scott Cantor <cantor.2@osu.edu>
- Date: Fri, 9 Apr 2010 14:04:49 -0400
- To: <public-xmlsec@w3.org>
Comments on the March 4th draft:
http://www.w3.org/TR/2010/WD-xml-c14n2-20100304/

Mostly grammar and wordsmithing. I would be willing to just do a pass for a lot of them.

Abstract: Suggest rephrasing "incorporates an update to Exclusive..." to explain that it's now a single algorithm for both. Perhaps reword first sentence to say that it's a major rewrite of both Canonical XML 1.1 and Exclusive Canonical XML 1.0.

Sec 1.3: Is the last sentence still operational? It suggests XML-INFOSET is "under development" but the reference seems to be to a W3C Rec.

Sec 1.4: Suggest s/most/many, just to avoid over-promising.

Sec 1.4.1:
s/nodeset based/nodeset-based
s/cannot be solved/cannot be addressed
s/edge use cases/edge cases
s/input of the c14n alg/input to the c14n alg
s/Nodeset/nodeset
s/spec/specification
s/it only visits/only visits

Sec 1.4.2: Reword: A streaming implementation is required to be able to process very large documents without holding them all in memory; it should be able to process documents one chunk at a time.

Sec 1.4.3:
s/breakages/breakage
s/Remove leading/Optionally remove leading
s/content especially/content, particularly
s/Rewrite/Optionally rewrite

Sec 1.4.4:
s/depend a/depend on a
s/makes it very hard/increases the work required for
s/also it/it also

Sec 2.1: The text says what the DOM Model XML subset is, but not the streaming case. I'm actually somewhat unclear on what they formally are in any case. Is En intended to connote an actual DOM node in a DOM tree? If so, I would clarify that. If this isn't DOM, what is the equivalent for SAX? Is it essentially an XPath that you have to dynamically evaluate as you go?

s/In a DOM model/In the DOM model
s/expressed as/expressed as:
s/If out of this list/(If out of this list
s/desclaration/declarations

Regarding the sentence on computing the XML subset, does this wording risk implying that implementations need to actually pre-compute the subset?
Isn't that at odds with the goal of performing a simple tree walk? Should it be reworded in more explanatory terms as "the XML subset consists of..."?

s/allow a high/allow for a high
s/allowing the essential/supporting the most essential
s/Specifically/Specifically:
s/purposely does not/does not
s/re-inclusion, i.e./re-inclusion; i.e.,
s/Think of it as a/It is effectively a
s/Reinclusion/Re-inclusion
s/attributes inheritance/attribute inheritance
s/ Exclusion is very limited, only complete subtrees and attribute nodes can be excluded, other kinds of nodes like text nodes, comment nodes, PI nodes cannot be excluded. / Exclusion is limited to complete subtrees and attribute nodes. Other kinds of nodes (text, comment, PI) cannot be excluded.
s/ Even attribute exclusion is limited, namespace declaration and attributes in XML namespace cannot be excluded. / Attribute exclusion is also limited, such that namespace declarations and attributes from the xml namespace cannot be excluded.
s/1.x mode but not in this new model/1.x, but not in this version

Section 2.2:
s/ Instead of separate algorithms for each variant of canonicalization, this specification goes with the approach of a single algorithm, which does slightly different things depending on the parameters. / Instead of separate algorithms for each variant of canonicalization, this specification takes the approach of a single algorithm subject to a variety of parameters that change its behavior to address specific use cases.

Insert a sentence before the table: The following is a list of the logical parameters supported by this algorithm. The actual serialization that expresses the parameters in use may be defined as appropriate to specific applications of this specification (e.g., the <ds:CanonicalizationMethod> element in [XMLDSIG-CORE2]).

In trimTextNodes description, s/nodes descendants/node descendants

In serialization description, should "signed" be "canonicalized"?

Would it be clean enough (and simpler) to collapse the xmlXAncestors parameters into a single parameter and just apply "combine" to only xml:base? Is there a need to use different rules for different attributes? Seems like the various "modes" sort of go together given how the earlier algorithms work.

Regarding xsiTypeAware, I would still like to see this expanded to something at least a little more generic and just allow a list of qualified node names to treat as QName-valued. Or perhaps leave xsiTypeAware and just add a separate parameter for this, if it's important for conformance to make this one MTI but not the other. Speaking for myself, I don't know that I would want to implement prefix rewriting, but I really could use the ability to handle QNames in other places.

s/The defaults are set to result in canonical 1.1 with no comments/The defaults are chosen for equivalence to Canonical XML 1.1 with comments ignored.

I assume the "named parameter sets" part is TBD, and we need to decide what the sets are and what the MTI options are. Do we have somebody willing to make a proposal on that? I guess I would be willing to define something I could see using in profiles I'm involved with.

Section 2.3:
s/conisting/consisting
s/exlusion/Exclusion

Forgive my ignorance (I haven't ever implemented c14n), maybe I'm overlooking the obvious...but is it necessary or even desirable to sort the inclusion list or detect children of other nodes up front? Can't that be derived on the fly to avoid more than one tree walk? e.g. do a traversal and switch "on" when an element is a hit in the hash list, pull out descendants in the list as you find them, etc. I know we want to be abstract about implementation, but at the same time we may be getting back into the problem of naïve implementations.
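
The single-pass idea above can be sketched roughly as follows, in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions, not anything taken from the draft: the function name walk_subset and the use of plain sets of element objects for the inclusion and exclusion lists are hypothetical, and attributes, namespaces and serialization are ignored entirely.

```python
import xml.etree.ElementTree as ET


def walk_subset(root, included, excluded):
    """Return the selected elements in document order using one tree walk.

    `included` and `excluded` are sets of element objects; they are
    hypothetical stand-ins for the draft's inclusion and exclusion lists.
    """
    selected = []
    # Each stack entry carries the "on" state inherited from its ancestors,
    # so no up-front sorting or ancestor detection is needed.
    stack = [(root, False)]
    while stack:
        elem, on = stack.pop()
        if elem in excluded:
            continue  # skip the whole excluded subtree (no re-inclusion)
        # A hit in the inclusion set switches output on; a listed element
        # that sits below another listed element is simply absorbed.
        on = on or elem in included
        if on:
            selected.append(elem)
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((child, on))
    return selected


doc = ET.fromstring("<a><b><c/><d/></b><e><f/></e></a>")
b, c, d, f = doc.find("b"), doc.find("b/c"), doc.find("b/d"), doc.find("e/f")
tags = [el.tag for el in walk_subset(doc, {b, c, f}, {d})]
print(tags)
```

Here c is listed for inclusion although it is already inside b, and the walk handles that without any pre-processing; a streaming (SAX-style) version would presumably need some other way to identify the listed elements, which this sketch does not address.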

s/ While traversing if the current node is an element, and that element is in the exclusion list / While traversing, if the current node is an element and that element is in the exclusion list
s/Element nodes/Element Nodes

Under Element Nodes, s/should have written/will be written

re: Namespace Nodes, is it really true that no additional processing is involved? I think we need more text here referencing the processing that determines whether anything gets output. I think you just mean that *if* it's output, it's done in the same way as attribute nodes.

Under Text Nodes, s/declaration is in context/declaration in context,
s/ In that case be careful when trimming the leading and trailing space - the net result should be same as if it the adjacent text nodes were concatenated into one / Be aware when trimming whitespace in such cases; the net result should be equivalent to doing so as if the adjacent text nodes were concatenated.

At the end of the section, s/xml models like DOM/XML models such as DOM

Sec 2.4: I would consider moving this section up into section 1. It seems like motivating material for the overall package of features, and could even be supplemented by additional sections that motivate some of the other options if we're so inclined.

Sec 2.5: Per my earlier comment, I think we need a reference to this section in the main processing rules to provide context.

s/special node/special node type or indication
s/Attribute/attribute

As a general comment, I'm not sure it's helpful to distinguish Explicit/implicit here, but if we did, I think the key point is not that some DOM serializers will add them for you but that the DOM itself will not include them when the node is created. I think you're trying to say that implementations need to account for this, but if that's the case, we probably would need to reference the distinction somewhere in the processing rules, and I don't see that now.
Maybe you just need to add language referring to "both explicit and implicit" in some of the later text.

Under Visibly utilized, clarify that the bullets are OR conditions, maybe just say "if any of the following hold:"

In step 2, s/any of the namespace declaration/any of the namespace declarations
s/E j/Ej
Also, I think Ei in that last bit should be Ej?
s/If the prefixRewrite is specified/If the prefixRewrite option is set to other than "none"

In the sequential text, is sorting the URIs well-defined? Do we need a formal reference on that? Cue rathole...3,2,1

Silly question...do we really need the complexity of digest-based rewriting? If we do, is there a simpler way? Maybe just hex-encode the digest octets? Yes, it's a bit longer, but it's also faster and easier...

Didn't exactly follow the second note about exclusive c14n and the rewriting. That doesn't seem likely given that exclusive doesn't change the fact that you only output it once for a given subtree...what's the case being worried about here?

Section 2.6:
s/consist of/consists of the following steps:
s/If E is an apex node examine/If E is an apex node, then examine
s/not already there/not already present
s/temporily/temporarily
s/parametes/parameters
s/inherit/"inherit"

Should we reword "all element nodes along E's ancestors" to something like "all ancestor element nodes of E"?

s/combining then two/then combine them two
Add forward reference to the join function in sec 2.7.
s/the join/then join
s/ Sort all the attribute lexicographically (increasing) / Sort all the attributes in increasing lexicographic order,

Replace informal "if prefixes are rewritten" with a reference to the option being other than "none".

Section 3: Reword:

Exclusive Canonicalization may be used as a canonicalization algorithm in XML Digital Signature [XMLDSIG-CORE2], via the <ds:CanonicalizationAlgorithm> element.

Identifier: ...

Canonical XML 2.0 supports a set of parameters, as enumerated in Section 2.2.
All parameters are optional and have default values. When used in conjunction with the <ds:CanonicalizationMethod> element, each parameter is expressed with a dedicated child element. They can be present in any order. A schema definition for each parameter follows:

In the schema, I believe NMTOKENS is the wrong type for the prefix list. That was the error made in the old spec and had to be fixed in errata, because #default isn't a legal NMTOKEN. The type should be a list of strings.

Section 4: I didn't review this heavily yet. In section 4.1, the sort process is somewhat unclear to me. It seems like it would take a full tree walk, and since I can't think how the inputs in the DOM case could be other than logical pointers to actual DOM nodes, I don't see why we would need to sort them ahead of time. SAX is different, but the sorting is clearly implicit there, right?

Section B.1: Is C14N 1.x a normative reference? Probably informative, no? Same for the XPath Filter?

Section B.2: Are URI and XMLBASE normative?
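
To illustrate the NMTOKENS point under Section 3 above, here is a rough check in Python. It is a sketch under stated assumptions: the pattern is an ASCII-only approximation of the XML 1.0 NameChar class (the full production also admits many non-ASCII characters), and the function name is hypothetical.

```python
import re

# ASCII-only approximation of XML 1.0 NameChar: letters, digits, '.', '-',
# '_' and ':'. The real production is wider, but '#' is not in it either.
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:\-]+")


def is_ascii_nmtoken(token: str) -> bool:
    """True if `token` is an NMTOKEN under the ASCII approximation above."""
    return NMTOKEN_ASCII.fullmatch(token) is not None


# A prefix list that mixes ordinary prefixes with the "#default" marker.
prefix_list = "ds xsi #default".split()
rejected = [p for p in prefix_list if not is_ascii_nmtoken(p)]
print(rejected)
```

Only the "#default" entry fails the check, which is why a list-of-strings type seems like the safer choice for the prefix list.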
Received on Friday, 9 April 2010 18:05:14 UTC