Canonical Form as non-XML from Joseph M. Reagle Jr. (W3C) on 1999-04-13 (w3c-xml-sig-ws@w3.org from April 1999)

From: Joseph M. Reagle Jr. (W3C) <reagle@w3.org>
Date: Tue, 13 Apr 1999 15:04:31 -0400
To: "Signed-XML Workshop" <w3c-xml-sig-ws@w3.org>
Cc: Paul Grosso <pgrosso@arbortext.com>, "Joel A. Nava" <jnava@adobe.com>
Message-Id: <3.0.5.32.19990413150431.00b9ae30@localhost>

[Note: I'm uncertain of the cross-posting and cc:'ing ettiquette, since
w3c-xml-sig-ws@w3.org is a public list and syntax is not, so I just cc'd the
authors of the original thread.]

Interesting thread in the syntax-WG that I wanted to port over here. I've
been advocating that  one be able to hash multiple represenations of the
data: the bits, some form of canonical XML, and additional representations
(such as DOM, or RDF, or a graph notation language, etc.) But I always
assumed the canonical XML would be XML. Obviously, it doesn't have to be.

My first reaction as to why I would want XML canonical XML was fuzzily
similar to the argument why we should do signed-XML at all -- instead of
just using S/MIME. People want XML data to continue to available to all XML
applications. For instance, somewhere in a document processing chain I might
have an application that needs to access the data within an XML structure
but it doesn't care about the signatures. Maybe the guy before or beyond him
does, but there is no necessary reason to render the content opaque to a
non-MIME savy XML application. So my first reaction was why make an
application understand another format which a non-signature XML application
might have problems with? But this is a half-thought. If an application will
canocilize, it will have an input and output regardless. Additionally, I
expect that you can never expect the XML on the wire to be canonical: the
proof is equal to the work, so you might as well do the work. Consequently,
I don't see a necessary reason as to why the canonical XML output must be
XML.

Comments?

Forwarded Text ----
 Date: Tue, 13 Apr 1999 11:44:15 -0500
 To: w3c-xml-syntax
 From: Paul Grosso <pgrosso@arbortext.com>
 Subject: RE: c14n [was: RE: Colons in attribute values]
 Status:   

 At 08:51 1999 04 13 -0700, Joel A. Nava wrote:
 >If we are testing processor conformance, does that mean that
 >the only kind of processors we can test spit out legal XML?
 >
 >I thought processors took in legal XML, and generated different
 >types of output from parsing the XML. In my way of thinking
 >then, the C14N form is needed as input to the processor.
 >What am I missing here?

 I cannot understand how given the canonical form as *input*
 to an XML processor makes any sense in terms of testing.

 The only way I know of testing conformance is to require that
 all processors emit the canonical form.  That is what RAST
 requires of all SGML processors as far as I remember.  Then,
 you hand a (most likely non-canonical!) document with a known 
 canonical form to a processor, and it must emit something that 
 is byte-for-byte equivalent to the known canonical form.  If
 the processor produces the correct canonical form for the complete
 test suite, it is considered a conforming processor.

 But note that this does not require that the canonical form be XML.

 As you point out, there is no requirement that an XML processor ever 
 *emit* XML.  In fact, the only emission/reporting requirements are 
 that they give errors as required by the spec.  A c14n effort must
 add a requirement that all XML processors conforming to the
 canonical testing spec also emit the canonical form.

End Forwarded Text ----
___________________________________________________________
Joseph Reagle Jr.  W3C:     http://www.w3.org/People/Reagle/
Policy Analyst     Personal:  http://web.mit.edu/reagle/www/
                   mailto:reagle@w3.org

Received on Tuesday, 13 April 1999 15:04:48 UTC