Re: Three Issues for C14N Consideration

Hi Joseph,

>1. Presently, the C14N of XML includes expanded general entities which
>produces a standalone document. Obviously parameter entities aren't a
>problem, but PIs might be. Do you care about PIs?

We can't really know what a particular XML application language will do with
PIs, so there must be some way of preserving them in the signature.
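
To make the concern concrete, here is a minimal sketch using only Python's
standard library (nothing here comes from the C14N drafts; the sample
document and its PIs are made up).  ElementTree's default parser drops PIs
when it builds the tree, while minidom keeps them, so a signing pipeline
built on the former would lose them before anything gets hashed.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample: one PI in the prolog, one inside the root element.
SAMPLE = '<?xml-stylesheet type="text/css" href="style.css"?><doc><?app hint?>text</doc>'

def roundtrip_etree(text):
    """Parse and re-serialize with ElementTree's default parser, which discards PIs."""
    return ET.tostring(ET.fromstring(text), encoding="unicode")

def pi_targets_minidom(text):
    """Return the targets of all PIs that minidom keeps, in document order."""
    found = []

    def walk(node):
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            found.append(node.target)
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(text))
    return found
```

Whatever produces the bytes to be hashed has to behave like the second
function, not the first, if PIs are to survive into the signature.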

>2. Should the canonical form include an XML declaration with version
>number?

Based on my previous requirements feedback document, obviously the answer
should be yes.  As I said in the workshop, the XML header should be there
and it should include the version as well as the encoding information of the
document.  The elements actually required out of the document should not
have their encoding changed.  (This isn't actually the same as what a
canonicalizer does, but see below.)

For any elements you must sign in a document, you not only need the internal
content but *** you also absolutely positively need the entire ancestor path
of information, including namespace declarations and other attributes ***.
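
A rough sketch of what "the entire ancestor path" amounts to, in Python with
the standard library's minidom.  The `id` attribute used to pick the element,
the function name, and the sample order document are all assumptions made for
illustration; none of them comes from the signature drafts.

```python
from xml.dom import minidom

def ancestor_context(xml_text, target_id):
    """Return [(tag, attributes), ...] for every ancestor of the element
    whose (hypothetical) `id` attribute equals target_id, outermost first."""
    doc = minidom.parseString(xml_text)
    target = next(
        el for el in doc.getElementsByTagName("*")
        if el.getAttribute("id") == target_id
    )
    path = []
    node = target.parentNode
    # Stop when we reach the document node, which is not an element.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        path.append((node.tagName, dict(node.attributes.items())))
        node = node.parentNode
    path.reverse()
    return path

# Made-up document: the namespace prefix and the currency that give the
# amount its meaning are both declared on an ancestor.
ORDER = (
    '<order xmlns:p="urn:example:pay" currency="USD">'
    '<p:item id="a"><p:amount id="t">10</p:amount></p:item>'
    '</order>'
)
```

Calling `ancestor_context(ORDER, "t")` lists `order` (carrying `xmlns:p` and
`currency`) and then `p:item`.  Signing `p:amount` on its own would throw
away exactly that information.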

In general we need to be constructing the message to be hashed from the
document rather than just canonicalizing the particular elements listed in a
manifest while trying to come up with clever ways of inserting all of the
information we lose when we delete the ancestor elements.  Depth, order
relative to other included sibling elements, ancestor attributes and even
ancestor tags can alter the meaning of an element.  *Please see the prior
email I sent a few days ago.*

Ultimately, we need to be able to take the message actually signed and hand
it directly to an application that can process that type of document, and
ask it to show that document to the verifier since the only thing the signer
really signed was the rendered version of the document.  Few people on this
planet will be able to look at markup and say "yes I know exactly what all
of those pointy brackets mean and I commit to what this XML markup says".

>3. If C14N fails because of the unavailability of an external entity, do
>XML Signature applications want to know?
>

If I have one XML document in a file, and it references a second external
document, I should be able to create a version of the first document that
does not resolve the external references.  If our manifest includes the
first document and the second document, then the signature failure will be
of the first order.

Could we not then do whatever it is that we'd normally do during
verification if a resource listed in the manifest is not found?

For example, some applications will not care whether the resource is
available; they will only validate the signature on the manifest.  Other
apps will definitely care about signature closure.  The application can then
decide if there is an error independently of the signed XML spec-- as long
as we choose the right method of canonicalization!
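
One way to picture that split between the spec and the application, as a
hedged Python sketch.  The manifest layout (URI mapped to a SHA-1 hex
digest), the `fetch` callback, and the `require_closure` flag are all
hypothetical names chosen for the example, and treating a digest mismatch as
always fatal is my assumption rather than anything agreed in the WG.

```python
import hashlib

def check_manifest(manifest, fetch):
    """manifest maps a resource URI to its expected SHA-1 hex digest.
    fetch(uri) returns the resource bytes, or None if it is unavailable."""
    report = {}
    for uri, expected in manifest.items():
        data = fetch(uri)
        if data is None:
            report[uri] = "missing"
        elif hashlib.sha1(data).hexdigest() == expected:
            report[uri] = "ok"
        else:
            report[uri] = "mismatch"
    return report

def accept(report, require_closure):
    """Application policy, kept separate from the check itself.
    A mismatch always fails (an assumption of this sketch); a missing
    resource fails only when the application wants signature closure."""
    statuses = set(report.values())
    if "mismatch" in statuses:
        return False
    return not (require_closure and "missing" in statuses)
```

The first function only reports what it found.  Whether "missing" counts as
an error is decided in the second, which is where each application would
plug in its own rule.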

ACTUALLY, WHAT I'M REALLY SAYING IS THAT WE ARE COMPLICATING MATTERS
UNNECESSARILY BY TRYING TO CANONICALIZE OUR DOCUMENTS AS PART OF XML
SIGNATURES.

We don't need a canonicalizer.  The canonical form of a document is the
selected way of writing all members of an equivalence class.  You're trying
to say that we want to take any member of an equivalence class and write it
in one way.  This may be possible to do given the rules of XML, but XML is
devoid of lexicon, so you really can't tell whether two documents are
equivalent until you put them in the context of a particular XML extension
language.  Since the idea is ultimately hopeless, why are we trying to do
this?

Seems like we could get off the ground more quickly by simply requiring that
the message generated by our 'writing algorithm' should not change the
meaning of the document.  The canonicalizers you are discussing would be
able to do this if they A) generated XML and B) generated the same actual
document type as the input.  These are signature requirements that I've
asked for repeatedly, but don't seem to be requirements of canonicalization
itself.

Canonicalization overlaps our purposes, and with careful consideration
canonicalization could be made to suit our purposes, but it is not
necessary.  If I have a signed document and someone modifies it, but it is
still logically equivalent, they've still modified it and so the darned
signature should break!  Really we're just trying to get around the fact
that hashes don't do equivalence classes, and why should we be doing that?
Once they signed it, we only need to be able to take the unaltered document
and perform a verify.  Since when has 'digital signature' come to mean that
a document can be modified in ways that don't affect its meaning?  No, a
document should not be modified, period.
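
For anyone who wants to see the "hashes don't do equivalence classes" point
in the small, a sketch in Python (standard library only; the two `pay`
elements are invented, and the equivalence check is deliberately crude).
Attribute order is not significant in XML, so the two byte strings below
describe the same element, yet their digests differ.

```python
import hashlib
import xml.etree.ElementTree as ET

# Made-up example: same element, attributes written in a different order.
ORIGINAL = b'<pay to="alice" amount="10"/>'
REORDERED = b'<pay amount="10" to="alice"/>'

def digest(data):
    """SHA-1 over the exact bytes, which is all a hash ever sees."""
    return hashlib.sha1(data).hexdigest()

def same_element(a, b):
    """Crude equivalence check: same tag and same attribute set.
    Only meant for empty elements like the two above."""
    x, y = ET.fromstring(a), ET.fromstring(b)
    return x.tag == y.tag and x.attrib == y.attrib
```

`same_element(ORIGINAL, REORDERED)` is true while the two digests are not
equal.  A canonicalizer exists to paper over that gap; my position is that
the verifier should simply be handed the unaltered bytes.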

>*. Next week the Syntax WG will discuss Don and Hiroshi's n1 vs. hash
>prefix issue.

We would not need to do this in signed XML if we dropped the notion of
canonicalization from our requirements.  We just need a good writer of XML.
Perhaps someone would care to think about discussing this angle...

>Suppose we have PIs in the external DTD subset.  The document can
>still be standalone="yes", so in principle we can't put these in the
>canonical form unless *require* processors to read the external subset.
>So I see the following realistic options.  Let's hear opinions and
>nail this down next Wednesday.

W.R.T. generating a signature message, we don't need to generate a
standalone document.  We only need to generate a message that could be
processed later by an XML processor, which can make its own decisions about
the document, such as following the standalone flag.

My vote for PIs was only for those appearing in the document because I just
want to preserve the documents that someone is trying to sign.  If that
document relies on other documents, then put references to those documents
in the manifest.

After the cryptographic act of verifying the manifest signature, an
application that actually tries to regenerate the message and pass it off to
the appropriate rendering module can at that time test whether external
references in the message were included in the manifest.  If not, errors.
If so, then the unavailability of those external references percolates up to
the rendering device that would no doubt try to resolve the references as
part of reading the document.

Again, this is different from the canonicalization aspect because I'm not
really interested in canonicalization, only document reproducibility.

Received on Thursday, 17 June 1999 15:51:23 UTC