- From: Thomas Maslen <tmaslen@wedgetail.com>
- Date: Wed, 09 Jan 2002 19:28:53 +1000
- To: w3c-ietf-xmldsig@w3.org
Very likely I just need a remedial reading comprehension class, but I came across a number of places in the current editors' copy (r 1.21) of the Exclusive XML Canonicalization spec where I ended up relying on my intuition about the intent of the spec because (as far as I could tell) details were either missing or inaccurate.

Semantic issues:

(1) "output parent" vs "output ancestor"

Section 1.1 defines "output parent" and makes it clear that any non-apex node has exactly one output parent, which is its nearest element node ancestor in the node-set.

"output ancestor" is not defined anywhere, although it's a reasonable guess that it means "any element node ancestor in the node-set".

"output ancestor" is referred to in the second bullet item toward the bottom of Section 1.1. [And I believe that referring to it is correct, it just needs a definition too].

Exception 3 in Section 3 says

    "[...] a namespace declaration is output at every output element where that prefix is visibly utilized and an equivalent declaration is not made in an output parent."

I believe this is wrong (inconsistent with the aforementioned second bullet item) and should actually say "output ancestor".

Step 3.1 of the algorithm in Section 3 says

    "[...] it has not yet been rendered (ns_rendered) by an output parent".

I believe this is wrong (inconsistent with the loose description of ns_rendered) and should actually say "output ancestor".

A literal reading of the "output parent" wording in Section 3 would, I believe, exclusively canonicalize

    <a:e0 xmlns:a="silly"> <a:e1> <a:e2> <a:e3> <a:e4> <a:e5> <a:e6> <a:e7>
    </a:e7></a:e6></a:e5></a:e4></a:e3></a:e2></a:e1></a:e0>

to

    <a:e0 xmlns:a="silly"> <a:e1> <a:e2 xmlns:a="silly"> <a:e3> <a:e4 xmlns:a="silly"> <a:e5> <a:e6 xmlns:a="silly"> <a:e7>
    </a:e7></a:e6></a:e5></a:e4></a:e3></a:e2></a:e1></a:e0>

which I certainly hope isn't what was intended?

(2) Where is the exc-c14n behaviour of the default namespace specified?
The default namespace ("xmlns") has various funny properties that have to be dealt with in definitions, particularly

  - since the default namespace doesn't have a namespace prefix, phrases like "For namespace prefixes ..." don't apply to it,

  - since XPath very thoughtfully indicates xmlns="" by the absence of a namespace node, phrases like "each namespace node" don't do the job either.

The Canonical XML recommendation jumped through the appropriate hoops to correctly define the behaviour of the default namespace (despite XPath), but I don't think that the exc-c14n draft does.

Section 1.1 of exc-c14n is fine: the definition of "visibly utilizes" does have a sentence that accounts for the default namespace [well, assuming that it is *not* using XPath semantics, i.e. the incredible disappearing xmlns="" node].

Section 3 contains two definitions of exc-c14n, and I don't think that either of them really addresses the default namespace:

  - the first definition is "Canonical XML, with these three exceptions". The wording in the exceptions (2 and 3) talks about "namespace prefixes", so it doesn't include the default namespace -- so presumably the default namespace just inherits the Canonical XML behaviour, i.e. it uses inclusive c14n? Is that the intent? (I would have guessed that the default namespace was meant to be handled exclusively).

  - the second definition is the pseudocode algorithm. Step 3 of the pseudocode talks about "namespace nodes" in the XPath sense, so implicitly [accidentally? Or deliberately?] it applies to xmlns="mumble" and will treat it exclusively -- c.f. the first definition, above -- but it does not handle xmlns="" at all.

I think that there are two options for the spec that would give consistent results:

(I) state that the default namespace is always treated inclusively, i.e.
effectively the InclusiveNamespaces PrefixList invisibly contains the default namespace (which, of course, doesn't have a prefix)

(II) modify Section 3 (I haven't figured out how) so that both xmlns="mumble" and xmlns="" are canonicalized exclusively, i.e. they only show up when they are visibly utilized

Of these, I definitely prefer (II), because I think it produces the less surprising behaviour.

[Or is there something I haven't realized about exc-c14n that makes this all a silly question, e.g. element names are always prefixed?]

Consistency & Clarity:

(a) "apex node" is specifically defined for element nodes, whereas "orphan node" doesn't mention a node type. If this is deliberate, i.e. the definition applies to namespace and attribute nodes too, then it should be explicit rather than implicit.

(b) "output parent" is undefined for apex nodes. Fair enough, but can this be stated explicitly?

(c) "visibly utilizes" is defined. "utilizes" is not. Step 3.1 of the algorithm in Section 3 says "Render each namespace node iff it is [...] utilized by [...]". Did it mean "visibly utilized"?

(d) The definition of "visibly utilizes" includes both the prefix (P) and the bound value (V), and talks of a "namespace declaration". The first two uses of "visibly utilizes" include the prefix but not the bound value. The third use of "visibly utilizes" [well, the one that just says "utilizes" at present] talks about the namespace node and the InclusiveNamespacePrefix List. These all seem rather inconsistent. My guess is that the definition should refer only to the prefix (P) and not mention the bound value at all.

(e) A paragraph in section 1.1 says "The namespace axis of an element contains nodes for all namespace declarations [...]". If this is meant to be consistent with XPath semantics, it should mention the absence of a node for xmlns="".

(f) Step 3.1 of the algorithm in section 3 says "Render each namespace node iff [...] it has not yet been rendered [...] by an output parent".
This means the output parent of the namespace node, i.e. the element node. Is this really what was intended? Or did it really mean to say an output parent of the element node? (Even then, "an" doesn't make sense unless "output parent" is replaced by "output ancestor").

(g) The pseudocode is too pseudo. In particular, the offhand use of ns_rendered is much too vague -- having implemented this, I can guess what it really means, but the pseudocode doesn't define it for me.

(h) The DTD, the schema and the example now consistently refer to an "InclusiveNamespaces" element with a "PrefixList" attribute. Good. However, the introductory text three lines above the example still refers to an "InclusiveNamespacePrefix" element with a "List" attribute, and six other places in the document also refer to an "InclusiveNamespacePrefix List".

(i) Maybe just showing my ignorance... why does the DTD for the schema declare %p; and %s; and not use them? Likewise for &dec;

(j) It might be helpful if the text made it obvious that it is always using the XPath semantics for namespace nodes (well, except in the definition of "visibly utilizes"), i.e. the namespace axis of an element includes all the namespace nodes from its ancestors (except for overridden bindings, and except for the absence of xmlns=""). The paragraph in section 1.1 says (most of) this, but a little reinforcement wouldn't hurt, particularly in section 3.

Thomas Maslen
tmaslen@wedgetail.com
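
As an illustration of issue (1), the two readings of Step 3.1 can be compared on the eight-element example with a small Python sketch. This is a hypothetical model and not the spec's algorithm: the names NAMES, BINDING, rendered_by_parent_only and rendered_by_any_ancestor are invented here, the document is assumed to be a single chain of nested elements that are all in the node-set, and every element is assumed to visibly utilize the prefix "a".

```python
# Hypothetical model of the example in issue (1): a chain e0 > e1 > ... > e7,
# all in the node-set, each visibly utilizing prefix "a" bound to "silly".
# Because the chain has no siblings, a single running set is enough to model
# what has been rendered by ancestors.
NAMES = [f"e{i}" for i in range(8)]
BINDING = ("a", "silly")


def rendered_by_parent_only(names):
    """Literal "output parent" reading: suppress the declaration only if the
    immediate output parent itself rendered it."""
    emitting = []
    parent_rendered = set()  # what the immediate output parent rendered
    for name in names:
        own_rendered = set()
        if BINDING not in parent_rendered:
            own_rendered.add(BINDING)
            emitting.append(name)
        # The next element sees only this element's own declarations.
        parent_rendered = own_rendered
    return emitting


def rendered_by_any_ancestor(names):
    """Assumed "output ancestor" reading: suppress the declaration if any
    output ancestor has already rendered it."""
    emitting = []
    ns_rendered = set()  # accumulated down the chain of output ancestors
    for name in names:
        if BINDING not in ns_rendered:
            ns_rendered.add(BINDING)
            emitting.append(name)
    return emitting


print(rendered_by_parent_only(NAMES))   # ['e0', 'e2', 'e4', 'e6']
print(rendered_by_any_ancestor(NAMES))  # ['e0']
```

Under these assumptions the "output parent" reading repeats xmlns:a="silly" on every second element, as in the canonicalized form quoted under (1), while the "output ancestor" reading declares it once on a:e0. A real implementation would also need to handle siblings, overridden bindings and the InclusiveNamespaces PrefixList, none of which this sketch attempts.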
Received on Wednesday, 9 January 2002 04:28:57 UTC