RE: Caononicalization Re: Minutes from Today's Call Please Review/Correct from Phillip M Hallam-Baker on 1999-10-27 (w3c-ietf-xmldsig@w3.org from October to December 1999)

From: Phillip M Hallam-Baker <pbaker@verisign.com>
Date: Wed, 27 Oct 1999 14:06:13 -0400
To: "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Message-ID: <001c01bf20a5$f47898a0$6e07a8c0@pbaker-pc.verisign.com>

> There is nothing particularly wrong with the unstated
> assumption/definition you are using above for canonical, which makes
> your proof true.  But "canonical" has not always been used with that
> assumption in this working group.

My point being that if the specification is to be read by the
rest of the computer science community it had better use the
established terminology.

The working group can use whatever private language it cares
to invent. We can redefine as many terms as we like. If we want
the draft to be understood by the rest of the world we had better 
use their language.


Just because a function emits an output in canonical form does
not make it a canonicalization function. Such a function is
no more than a syntactical constraint on the output of the function.

If we were merely arguing about canonical output functions there
would be little argument. Such a function would only affect the
generator of the signature.

What people are objecting to is the unnecessary canonicalization
code you are requiring the verifier of the signature to write.
Not all verifiers will include a DOM parser. A large number
of verifiers will not even have access to the schema.

By specifying CANONICALIZATION you are creating a requirement that
a recipient of an XML message that is NOT in canonical form be
capable of converting it into canonical form. That requires access
to the schema since whitespace matters in some text encodings and
does not in others.


I don't think that this is merely a difference in terminology. I
think that you really want to impose this requirement on the
community. You find this difficult to justify however and so the
term 'canonicalization' is neatly redefined for purposes of
argument on the list but is clearly intended to retain its generally
accepted meaning in the specification.


I would expect that a typical Electronic Commerce system based on
signed XML messages would be built in layers. At the perimeter of
the installation there will probably be some type of proxy whose 
function it is to verify that messages are correctly signed, even 
if the structure of the messages themselves is not understood.
Such a proxy must be capable of verifying the signature on the
message even if the schema is not available.

The solution is simply to require that XML messages that specify
a canonical form be consistent with that canonical form. If a 
verifier receives a message that states it is in DOM-HASH canonical
form and attribute values are sent out of order so that it is
not, that the message should then be sent straight to the bit bucket.


Regardless of how great an idea you think it is, many applications
simply will not be able to implement your idea of 'canonicalization'.
The inevitable result being that they will ignore the parts of 
the spec that they have to to get their job done. Each time you
re-open this issue you further increase the risk of divergence.

Remember that XML improves on SGML by what it leaves out. Arbitrary
requirements such as the need to be able to re-sort all attributes
into alphabetical order before verifying a signature are exactly the
types of baroque obscurantisms that XML is meant to be leaving
behind.

	Phill

Received on Wednesday, 27 October 1999 14:05:01 UTC