Re: Caononicalization Re: Minutes from Today's Call Please Review/Correct

From:  "Phillip M Hallam-Baker" <pbaker@verisign.com>
To:  "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>,
            "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Date:  Wed, 25 Aug 1999 13:04:18 -0400
Message-ID:  <000801beef1b$de1ab400$6e07a8c0@pbaker-pc.verisign.com>

In-Reply-To:  <199908251556.LAA01878@torque.pothole.com>
>> >The definition of canonicalization is a function f(x) such that
>> >f(x) = f(f(x)).
>> 
>> No.  That's the fix point property, which is a desireable property of
>> canonicalization but by no means its definition.  
>
>Actually it is a required property for it to be a canonical form.

Your proof only works with the unstated assumption that the output of
the canonicalization function is of exactly the same syntactic and
semantic type as the input.  However, that is not how I used the word
and not how you have always used it since you said that DOM produces a
canonicalization of XML.  XML is a octet stream.  The output of DOM is
an abstract structure with a specified API.  The DOM canonicalization
function converts the octet stream to the structure.  DOM(DOM(x)) is
meaningless and therefor not equal to DOM(x) which is meaningful
(assuming x is XML).

>The point of canonicalization is to be able to test for equivalence
>between different syntactical representations of the same data.
>
>> My definition of
>> canonicalization is a function f(x) which is useful for application A
>> if f(x1) = f(x2) implies that application A considers x1 and x2
>> semantically identical. 
>
>And f(x1) <> f(x2) implies they are distinct. Therefore a c14n
>function must have the fixed point property.
>
>Lema
>Exists f(x) <> f(f(x))	for some x
>
>=> f(x) <> f(x2)	where x2 = f(f(x))
                                   ^^^^^^^ you just want f(x) here

>but x, x2 are semantically equivalent, therefore
>
>f(x) = f(x2), thus disproving the lema.

Above is your bug. f(x) is identically x2 but x isn't even necessarily
the same type.  No XML object is identical to the abstract structure
DOM converts XML objects to.  You are using "semantic equivalence" in
some sort of sense which is not the same as mathematical equality.
Perhaps this will be seen more clearing if you consider something I'll
make up called DOMstring.  DOMstring produces an octet stream but it
is the result of traversing the DOM tree in a specified way and
replacing all the name space labels with full URIs.  This is probably
a more reasonable example since the kinds of canonicalization (and
trasnformation) functions we are talking about produce octet streams
to sign.  DOMstring output is not syntactically correct XML.  Even
though the input and output of DOMstring are octet strings, you can't
apply DOMstring to the output of DOMstring, only to XML.  Thus f(f(x))
and f(x2) which appear in varous of your statements are all
meaningless and no matter how "semantically equivalent" x and
DOMstring(x) are, it does not follow that DOMstring(x) =
DOMstring(DOMstring(x)) since the left had side is an octet string and
the right hand side is a type error.  But if you look at the definition
of canonicalizaton I used above, you will find that, by that
definition, DOMstring is a fine canonicalization algorithm for DOM
oriented applications.

>Therefore all canonicalization functions are fixed point functions
>QED.

There is nothing particularly wrong with the unstated
assumption/definition you are using above for canonical, which makes
your proof true.  But "canonical" has not always been used with that
assumption in this working group.

Donald
===================================================================
 Donald E. Eastlake 3rd   +1 914-276-2668   dee3@torque.pothole.com
 65 Shindegan Hill Rd, RR#1  +1 914-784-7913(w)     dee3@us.ibm.com
 Carmel, NY 10512 USA

Received on Wednesday, 27 October 1999 11:47:58 UTC