RE: Alternative Canonicalization Draft from John Boyer on 2000-05-30 (w3c-ietf-xmldsig@w3.org from April to June 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Tue, 30 May 2000 09:33:50 -0700
To: <gregor.karlinger@iaik.at>
Cc: <w3c-ietf-xmldsig@w3.org>, "Joseph M. Reagle Jr." <reagle@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKAEAACDAA.jboyer@PureEdge.com>

Hi Gregor,

Answers within...

-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org
[mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Gregor Karlinger
Sent: Tuesday, May 30, 2000 12:24 AM
To: John Boyer
Cc: w3c-ietf-xmldsig@w3.org; Joseph M. Reagle Jr.
Subject: RE: Alternative Canonicalization Draft

Hello John,

> [mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Joseph M. Reagle
> Jr.

[...]

> John's alternative approach to C14N (in XPath instead of InfoSet context,
> though the results aren't that different aside from the changes we
> explicitly discussed) is at:
>
> http://www.w3.org/Signature/Drafts/WD-xml-c14n-20000601.html
>
> I think it would be best to get comments from folks in this WG before
> forewarding it to the attention of the larger community (though I'd like
> to do that ASAP).

following my comments and questions regarding your latest C14N draft.

* In section 4 you specify the text generation for the root node as
  follows:

  "Root Node- Nothing (no byte order mark, no XML declaration, no
   document type declaration)."

  XML 1.0 allows also comments and processing instructions to be child
  nodes of the root. Why are these nodes eliminated?

<john>
They aren't eliminated.  The text, PI and comment node children of the root
are processed in accordance with the rules for that node type.  Each has a
document order, so it will be processed as a function of processing the
node-set.
</john>

* In section 4, in the specification for namespace and attribute node
  text generation, I found the following sentence:

  "[...] and all whitespace characters (#x9, #xA, #xD, and #x20) with
   character references, except for #x20 characters with no preceding
   #x20."

  I am sorry I do not understand the last part of this sentence. Could
  you please give an example?

<john>
Though the sentence technically works, I now realize that it attempts to fix
a problem that doesn't exist.  The sentence fragment can therefore be
replaced by:

"[...] and the whitespace characters #x9, #xA, and #xD with
   character references."

The problem was simply that I wanted to retain any extra spaces that managed
to get included by character reference in the normalized value of non-CDATA
attributes.  I had thought that whitespace characters included by character
reference were not normalized out, but it is now clear to me that this is
true of most whitespace characters but not of actual space characters.  In
other words, sequences of #x9, #xA and #xD characters may appear but not
#x20 characters.  Therefore, in non-CDATA attributes, every #x20 character
has no preceding #x20 and so would be encoded as a #x20 character not as the
&#x20; character reference.  This is why the short description works.

John Boyer
Software Development Manager
PureEdge Solutions Inc. (formerly UWI.Com)
Creating Binding E-Commerce
jboyer@PureEdge.com

</john>

Regards, Gregor
---------------------------------------------------------------
Gregor Karlinger
mailto://gregor.karlinger@iaik.at
http://www.iaik.at
Phone +43 316 873 5541
Institute for Applied Information Processing and Communications
Austria
---------------------------------------------------------------

Received on Tuesday, 30 May 2000 12:33:50 UTC