Serialization and canonicalization

There has been mention of serialization and canonicalization transforms
in the discussion so far.  I don't understand the need for these.

Serialization would be needed if we were starting with a non-serial
representation of the data to be encrypted, such as a DOM tree or an
XPath node-set.  Isn't it adequate to presume that we are beginning with
an XML document in serialized form?  Serialization issues could then be
ruled out of scope for this effort.

Canonicalization is an issue for signature verification, where it is
desired that semantically-neutral changes to a signed XML document
should still allow the signature to verify.  This motivated the desire
to specify canonicalization algorithms for the XML signature effort.
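
To make the signature case concrete, here is a rough sketch in Python.
It assumes Python 3.8 or later, whose standard library provides
xml.etree.ElementTree.canonicalize (C14N 2.0).  The SHA-256 digest
stands in for the value a signature would cover; the sample documents
and the digest() helper are made up for illustration and are not taken
from any specification.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same document: attribute order, quoting
# style and empty-element syntax differ, the content does not.
raw_a = '<doc b="2" a="1"><item/></doc>'
raw_b = "<doc a='1'   b='2'><item></item></doc>"

def digest(text: str) -> str:
    # Hypothetical helper: stand-in for the hash a signature covers.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Canonicalize each serialization before hashing.
canon_a = ET.canonicalize(xml_data=raw_a)
canon_b = ET.canonicalize(xml_data=raw_b)

print("raw digests equal:      ", digest(raw_a) == digest(raw_b))
print("canonical digests equal:", digest(canon_a) == digest(canon_b))
```

Without canonicalization the two digests differ, so a signature made
over one serialization would not verify against the other.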

However, the issue does not arise in the same way for XML encryption.
Canonicalizing data before encryption would not aid in decryption,
as far as I can see.

The one transform which seems relevant to an encryption effort is
compression.  Compressing data before encryption is helpful for two
reasons.  First, encrypted data is not compressible, so compressing before
encryption is our only opportunity to do so.  Second, the compressed data
generally has less structure than plaintext data.  This can theoretically
make the encryption harder to break (but this is a weak effect, and
there are countervailing factors).
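
The first point can be illustrated with a small Python sketch.  The
"cipher" here is an assumption made purely for the demonstration: a
toy XOR keystream derived from SHA-256 in counter mode, not a vetted
algorithm and not anything proposed for XML encryption.  The function
name toy_keystream_xor, the key and the sample plaintext are all
hypothetical.

```python
import hashlib
import zlib

def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    # Illustration only: XOR the data with SHA-256(key || counter).
    # Applying the function twice with the same key restores the input.
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

# Repetitive XML-like plaintext, which compresses well.
plaintext = b"<item><name>widget</name><qty>1</qty></item>" * 200
key = b"example key"

# Order 1: compress first, then encrypt.
compress_then_encrypt = toy_keystream_xor(key, zlib.compress(plaintext))

# Order 2: encrypt first, then try to compress the ciphertext.
encrypt_then_compress = zlib.compress(toy_keystream_xor(key, plaintext))

print("plaintext bytes:      ", len(plaintext))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))

# The recipient reverses order 1: decrypt, then decompress.
recovered = zlib.decompress(toy_keystream_xor(key, compress_then_encrypt))
```

With a repetitive input like this, the first ordering should come out
far smaller than the plaintext, while the second gains nothing from
the compression step.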

Are there reasons for continuing to consider serialization and
canonicalization issues?

Hal Finney
PGP Security

Received on Sunday, 12 November 2000 15:40:25 UTC