Re: ACTION-519: review c14n 2.0 draft from Frederick Hirsch on 2010-04-12 (public-xmlsec@w3.org from April 2010)

From: Frederick Hirsch <frederick.hirsch@nokia.com>
Date: Mon, 12 Apr 2010 18:07:45 -0400
To: ext Scott Cantor <cantor.2@osu.edu>
Cc: Frederick Hirsch <frederick.hirsch@nokia.com>, "public-xmlsec@w3.org" <public-xmlsec@w3.org>
Message-Id: <3AF74D00-D51A-4A7C-8098-98BA305D402C@nokia.com>
+1 to editorial comments

also

section 2.5: s/locaName/localName/g
fix the SotD so that it does not duplicate the standard boilerplate produced by ReSpec

Some of the discussion points:

2.2

> In serialization description, should "signed" be "canonicalized"?

this seems reasonable

2.5
> Also, I think Ei in that last bit should be Ej?
I think so also.

> In the sequential text, is sorting the URIs well-defined? Do we need  
> a formal reference on that? Cue rathole...3,2,1

Probably not, as long as we stay away from equivalent URIs. As I read  
it, the text says to sort the URI strings lexicographically, which is  
probably enough. We may need a clarifying sentence saying that the sort  
is done without regard to URI equivalence. (enjoyed the "cue rathole,  
3,2,1" comment)
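
To make the "sort the strings, ignore equivalence" reading concrete, here is a minimal sketch. It assumes the sort is a plain comparison of the URI strings by Unicode code point, with no normalization; the function name and the sample URIs are made up for illustration and are not taken from the draft.

```python
def sort_namespace_uris(uris):
    """Sort URI strings by Unicode code point, with no URI normalization."""
    # De-duplicate on exact string match only; equivalent-but-different
    # spellings of a URI are deliberately kept as separate entries.
    return sorted(set(uris))


uris = [
    "http://www.w3.org/2000/09/xmldsig#",
    "http://example.com/ns",
    "HTTP://EXAMPLE.COM/ns",   # equivalent under URI rules, but a different string
    "http://example.com/ns/",
]
ordered = sort_namespace_uris(uris)
for uri in ordered:
    print(uri)
```

Under this reading the upper-case spelling sorts first and stays distinct from the lower-case one, which is the behaviour a clarifying sentence would need to pin down.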


We should discuss the non-editorial points:

- define streaming model in 2.1, also pre-computing

- combining xmlXAncestors

[[
Regarding xsiTypeAware, I would still like to see this expanded to  
something at least a little more generic and just allow a list of  
qualified node names to treat as QName-valued. Or perhaps leave  
xsiTypeAware and just add a separate parameter for this, if it's  
important for conformance to make this one MTI but not the other.  
Speaking for myself, I don't know that I would want to implement  
prefix rewriting, but I really could use the ability to handle QNames  
in other places.
]]
- 2.3: why sort the inclusion list?

- 2.3: additional processing for namespaces?

- 2.5: why support digest rewrite? (I don't remember the rationale for  
digest-based rewriting; is the intent that it is easier, with no need  
to maintain a counter?)
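
For discussion, a rough sketch of the two rewriting styles side by side. Everything here is an assumption made for illustration: the "n" prefix letter, the use of SHA-1, the truncation length, and the function names are not taken from the draft, and the digest variant uses the plain hex encoding suggested in the quoted comments below rather than whatever encoding the draft specifies.

```python
import hashlib


def sequential_prefixes(uris):
    """Counter-based rewriting: number the sorted, de-duplicated URIs."""
    # Needs the whole set of URIs up front so the numbering is stable.
    return {uri: f"n{i}" for i, uri in enumerate(sorted(set(uris)))}


def digest_prefix(uri, length=8):
    """Digest-based rewriting: derive the prefix from the URI alone."""
    # Hex digits can start with 0-9, which is not a legal start for a
    # prefix, so a letter is put in front (hypothetical choice).
    digest = hashlib.sha1(uri.encode("utf-8")).hexdigest()
    return "n" + digest[:length]


uris = ["http://example.com/b", "http://example.com/a"]
print(sequential_prefixes(uris))
print({uri: digest_prefix(uri) for uri in uris})
```

The sequential form depends on knowing and sorting every URI before a prefix can be assigned, while the digest form can be computed for one URI in isolation, so no counter is needed. The cost is longer prefixes and, if the digest is truncated as in this sketch, a possibility of collisions.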

regards, Frederick

Frederick Hirsch
Nokia



On Apr 9, 2010, at 2:04 PM, ext Scott Cantor wrote:

> Comments on the March 4th draft:
> http://www.w3.org/TR/2010/WD-xml-c14n2-20100304/
>
> Mostly grammar and wordsmithing. I would be willing to just do a  
> pass for a
> lot of them.
>
> Abstract:
>
> Suggest rephrasing "incorporates an update to Exclusive..." to  
> explain that
> it's now a single algorithm for both. Perhaps reword first sentence  
> to say
> that it's a major rewrite of both Canonical XML 1.1 and Exclusive  
> Canonical
> XML 1.0.
>
> Sec 1.3:
>
> Is the last sentence still operational? It suggests XML-INFOSET is  
> "under
> development" but the reference seems to be to a W3C Rec.
>
> Sec 1.4:
>
> Suggest s/most/many, just to avoid over-promising.
>
> Sec 1.4.1:
>
> s/nodeset based/nodeset-based
>
> s/cannot be solved/cannot be addressed
> s/edge use cases/edge cases
> s/input of the c14n alg/input to the c14n alg
>
> s/Nodeset/nodeset
> s/spec/specification
> s/it only visits/only visits
>
> Sec 1.4.2:
>
> Reword:
> A streaming implementation is required to be able to process very  
> large
> documents without holding them all in memory; it should be able to  
> process
> documents one chunk at a time.
>
> Sec 1.4.3:
>
> s/breakages/breakage
>
> s/Remove leading/Optionally remove leading
> s/content especially/content, particularly
> s/Rewrite/Optionally rewrite
>
> Sec 1.4.4:
>
> s/depend a/depend on a
> s/makes it very hard/increases the work required for
> s/also it/it also
>
> Sec 2.1:
>
> The text says what the DOM Model XML subset is, but not the  
> streaming case.
> I'm actually somewhat unclear on what they formally are in any case.  
> Is En
> intended to connote an actual DOM node in a DOM tree? If so, I would  
> clarify
> that. If this isn't DOM, what is the equivalent for SAX? Is it  
> essentially
> an XPath that you have to dynamically evaluate as you go?
>
> s/In a DOM model/In the DOM model
> s/expressed as/expressed as:
>
> s/If out of this list/(If out of this list
> s/desclaration/declarations
>
> Regarding the sentence on computing the XML subset, does this  
> wording risk
> implying that implementations need to actually pre-compute the  
> subset?
> Isn't that at odds with the goal of performing a simple tree walk?  
> Should it
> be reworded in more explanatory terms as "the XML subset consists  
> of..."?
>
> s/allow a high/allow for a high
> s/allowing the essential/supporting the most essential
> s/Specifically/Specifically:
>
> s/purposely does not/does not
> s/re-inclusion, i.e./re-inclusion; i.e.,
> s/Think of it as a/It is effectively a
> s/Reinclusion/Re-inclusion
> s/attributes inheritance/attribute inheritance
>
> s/
> Exclusion is very limited, only complete subtrees and attribute  
> nodes can be
> excluded, other kinds of nodes like text nodes, comment nodes, PI  
> nodes
> cannot be excluded.
> /
> Exclusion is limited to complete subtrees and attribute nodes. Other  
> kinds
> of nodes (text, comment, PI) cannot be excluded.
>
> s/
> Even attribute exclusion is limited, namespace declaration and  
> attributes in
> XML namespace cannot be excluded.
> /
> Attribute exclusion is also limited, such that namespace  
> declarations and
> attributes from the xml namespace cannot be excluded.
>
> s/1.x mode but not in this new model/1.x, but not in this version
>
> Section 2.2:
>
> s/
> Instead of separate algorithms for each variant of canonicalization,  
> this
> specification goes with the approach of a single algorithm, which does
> slightly different things depending on the parameters.
> /
> Instead of separate algorithms for each variant of canonicalization,  
> this
> specification takes the approach of a single algorithm subject to a  
> variety
> of parameters that change its behavior to address specific use cases.
>
> Insert a sentence before the table:
>
> The following is a list of the logical parameters supported by this
> algorithm. The actual serialization that expresses the parameters in  
> use may
> be defined as appropriate to specific applications of this  
> specification
> (e.g., the <ds:CanonicalizationMethod> element in [XMLDSIG-CORE2]).
>
> In trimTextNodes description, s/nodes descendants/node descendants
>
> In serialization description, should "signed" be "canonicalized"?
>
> Would it be clean enough (and simpler) to collapse the  
> xmlXAncestors
> parameters into a single parameter and just apply "combine" to only
> xml:base? Is there a need to use different rules for different  
> attributes?
> Seems like the various "modes" sort of go together given how the  
> earlier
> algorithms work.
>
> Regarding xsiTypeAware, I would still like to see this expanded to  
> something
> at least a little more generic and just allow a list of qualified  
> node names
> to treat as QName-valued. Or perhaps leave xsiTypeAware and just add a
> separate parameter for this, if it's important for conformance to  
> make this
> one MTI but not the other. Speaking for myself, I don't know that I  
> would
> want to implement prefix rewriting, but I really could use the  
> ability to
> handle QNames in other places.
>
> s/The defaults are set to result in canonical 1.1 with no comments/The
> defaults are chosen for equivalence to Canonical XML 1.1 with comments
> ignored.
>
> I assume the "named parameter sets" part is TBD, and we need to  
> decide what
> the sets are and what the MTI options are. Do we have somebody  
> willing to
> make a proposal on that? I guess I would be willing to define  
> something I
> could see using in profiles I'm involved with.
>
> Section 2.3:
>
> s/conisting/consisting
> s/exlusion/Exclusion
>
> Forgive my ignorance (I haven't ever implemented c14n), maybe I'm
> overlooking the obvious...but is it necessary or even desirable to  
> sort the
> inclusion list or detect children of other nodes up front? Can't  
> that be
> derived on the fly to avoid more than one tree walk? e.g. do a  
> traversal and
> switch "on" when an element is a hit in the hash list, pull out  
> descendants
> in the list as you find them, etc.
>
> I know we want to be abstract about implementation, but at the same  
> time we
> may be getting back into the problem of naïve implementations.
>
> s/
> While traversing if the current node is an element, and that element  
> is in
> the exclusion list
> /
> While traversing, if the current node is an element and that element  
> is in
> the exclusion list
>
> s/Element nodes/Element Nodes
> Under Element Nodes, s/should have written/will be written
>
> re: Namespace Nodes, is it really true that no additional processing  
> is
> involved? I think we need more text here referencing the processing  
> that
> determines whether anything gets output. I think you just mean that  
> *if*
> it's output, it's done in the same way as attribute nodes.
>
> Under Text Nodes, s/declaration is in context/declaration in context,
> s/
> In that case be careful when trimming the leading and trailing space  
> - the
> net result should be same as if it the adjacent text nodes were  
> concatenated
> into one
> /
> Be aware when trimming whitespace in such cases; the net result  
> should be
> equivalent to doing so as if the adjacent text nodes were  
> concatenated.
>
> At the end of the section, s/xml models like DOM/XML models such as  
> DOM
>
> Sec 2.4:
>
> I would consider moving this section up into section 1. It seems like
> motivating material for the overall package of features, and could  
> even be
> supplemented by additional sections that motivate some of the other  
> options
> if we're so inclined.
>
> Sec 2.5:
>
> Per my earlier comment, I think we need a reference to this section  
> in the
> main processing rules to provide context.
>
> s/special node/special node type or indication
> s/Attribute/attribute
>
> As a general comment, I'm not sure it's helpful to distinguish
> Explicit/implicit here, but if we did, I think the key point is not  
> that
> some DOM serializers will add them for you but that the DOM itself  
> will not
> include them when the node is created. I think you're trying to say  
> that
> implementations need to account for this, but if that's the case, we
> probably would need to reference the distinction somewhere in the  
> processing
> rules, and I don't see that now. Maybe you just need to add language
> referring to "both explicit and implicit" in some of the later text.
>
> Under Visibly utilized, clarify that the bullets are OR conditions,  
> maybe
> just say "if any of the following hold:"
>
> In step 2, s/any of the namespace declaration/any of the namespace
> declarations
> s/E j/Ej
> Also, I think Ei in that last bit should be Ej?
>
> s/If the prefixRewrite is specified/If the prefixRewrite option is  
> set to
> other than "none"
>
> In the sequential text, is sorting the URIs well-defined? Do we need a
> formal reference on that? Cue rathole...3,2,1
>
> Silly question...do we really need the complexity of digest-based  
> rewriting?
> If we do, is there a simpler way? Maybe just hex-encode the digest  
> octets?
> Yes, it's a bit longer, but it's also faster and easier...
>
> Didn't exactly follow the second note about exclusive c14n and the
> rewriting. That doesn't seem likely given that exclusive doesn't  
> change the
> fact that you only output it once for a given subtree...what's the  
> case
> being worried about here?
>
> Section 2.6:
>
> s/consist of/consists of the following steps:
>
> s/If E is an apex node examine/If E is an apex node, then examine
> s/not already there/not already present
> s/temporily/temporarily
> s/parametes/parameters
> s/inherit/"inherit"
>
> Should we reword "all element nodes along E's ancestors" to  
> something like
> "all ancestor element nodes of E"?
>
> s/combining then two/then combine them two
>
> Add forward reference to the join function in sec 2.7.
>
> s/the join/then join
>
> s/
> Sort all the attribute lexicographically (increasing)
> /
> Sort all the attributes in increasing lexicographic order,
>
> Replace informal "if prefixes are rewritten" with a reference to the  
> option
> being other than "none".
>
> Section 3:
>
> Reword:
> Exclusive Canonicalization may be used as a canonicalization  
> algorithm in
> XML Digital Signature [XMLDSIG-CORE2], via the
> <ds:CanonicalizationMethod> element.
>
> Identifier:
> ...
>
> Canonical XML 2.0 supports a set of parameters, as enumerated in  
> Section
> 2.2. All parameters are optional and have default values. When used in
> conjunction with the <ds:CanonicalizationMethod> element, each  
> parameter is
> expressed with a dedicated child element. They can be present in any  
> order.
> A schema definition for each parameter follows:
>
> In the schema, I believe NMTOKENS is the wrong type for the prefix  
> list.
> That was the error made in the old spec and had to be fixed in errata,
> because #default isn't a legal NMTOKEN. The type should be a list of
> strings.
>
> Section 4:
>
> I didn't review this heavily yet.
>
> In section 4.1, the sort process is somewhat unclear to me. It seems  
> like it
> would take a full tree walk, and since I can't think how the inputs  
> in the
> DOM case could be other than logical pointers to actual DOM nodes, I  
> don't
> see why we would need to sort them ahead of time. SAX is different, but  
> the
> sorting is clearly implicit there, right?
>
> Section B.1
>
> Is C14N 1.x a normative reference? Probably informative, no? Same  
> for the
> XPath Filter?
>
> Section B.2?
>
> Are URI and XMLBASE normative?
>
>
Received on Monday, 12 April 2010 22:08:19 UTC