- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 1 Oct 2002 12:10:43 -0400
- To: Marc Hadley <marc.hadley@sun.com>
- Cc: mgudgin@microsoft.com, Rich Salz <rsalz@datapower.com>, xml-dist-app@w3.org
Marc Hadley writes:

>> Should signatures that include such
>> header blocks break when an intermediary
>> removes env:mustUnderstand="false" ?

I think either answer is coherent and potentially useful, but the one I had in mind was just the envelope infoset. If you let me toggle the physical presence of mU attributes, then I have an (admittedly very clumsy) covert channel available. You and I can make an agreement to signal information by its coming and going. Especially since it's also easier to reason about and explain the rule "you either have the same infoset or you don't...if you do, the signature matches", I think that's a good one to consider. Note too that we are not inventing the signature here, just talking about giving up a bit of variability in the processing model, with the justification that it might allow others to do the signature.

>> Digests/checksums work on bits and bytes, not
>> abstract infosets.

Not necessarily. I have built and deployed systems that checksum abstract interfaces. I believe my earlier note gave the appropriate definition of such a signature over the infoset: "we merely need a checksum that is the same whenever the infoset is the same, and with very high probability is different when the infoset is different."

Now, at some level most implementations of such signatures will indeed involve inventing little bits of infoset encoding, not necessarily serialized into one giant stream, that represent every piece of information in the infoset so it can be hashed together to form the signature. I think that's what you mean by a canonicalization, but that's not the term I would use: a canonicalization is a many-to-one mapping. In this case, we more likely have a 1-to-1 mapping of the information in an infoset into a code that can be checksummed. There are no two infosets in this model that "canonicalize" to the same representation, and with very high probability no two that get the same signature.
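To make the idea concrete, here is a minimal Python sketch of a checksum computed over parsed information items rather than over serialized bytes. It is an illustration under stated assumptions, not the Pascal-declaration system mentioned in this note and not a proposed signature format. The names `encode` and `infoset_digest`, the `urn:example:*` namespaces, and the sample envelopes are all hypothetical. It also only approximates the infoset: `xml.etree.ElementTree` discards namespace prefixes, comments, and processing instructions, so with respect to those properties this sketch is many-to-one.

```python
import hashlib
import xml.etree.ElementTree as ET


def _field(data: bytes) -> bytes:
    # Length-prefix every field so that distinct sequences of fields
    # can never concatenate to the same byte string.
    return len(data).to_bytes(8, "big") + data


def encode(elem) -> bytes:
    """Encode one element and its subtree as an unambiguous byte code."""
    attrs = sorted(elem.attrib.items())  # attribute order is not part of the infoset
    children = list(elem)
    out = [b"E", _field(elem.tag.encode("utf-8")), len(attrs).to_bytes(4, "big")]
    for name, value in attrs:
        out.append(_field(name.encode("utf-8")))
        out.append(_field(value.encode("utf-8")))
    out.append(_field((elem.text or "").encode("utf-8")))
    out.append(len(children).to_bytes(4, "big"))
    for child in children:
        out.append(_field(encode(child)))
        out.append(_field((child.tail or "").encode("utf-8")))
    return b"".join(out)


def infoset_digest(xml_text: str) -> str:
    """Hash the encoded information items, not the serialized bytes."""
    return hashlib.sha256(encode(ET.fromstring(xml_text))).hexdigest()


# Hypothetical envelopes. WITH_MU and REORDERED differ only in attribute
# order and quote style; WITHOUT_MU has the mustUnderstand attribute removed.
WITH_MU = (
    '<env:Envelope xmlns:env="urn:example:env"><env:Header>'
    '<h:b xmlns:h="urn:example:h" env:mustUnderstand="false" id="1"/>'
    '</env:Header><env:Body/></env:Envelope>'
)
REORDERED = (
    "<env:Envelope xmlns:env='urn:example:env'><env:Header>"
    "<h:b xmlns:h='urn:example:h' id='1' env:mustUnderstand='false'/>"
    "</env:Header><env:Body/></env:Envelope>"
)
WITHOUT_MU = (
    '<env:Envelope xmlns:env="urn:example:env"><env:Header>'
    '<h:b xmlns:h="urn:example:h" id="1"/>'
    '</env:Header><env:Body/></env:Envelope>'
)
```

In this sketch `infoset_digest(WITH_MU)` equals `infoset_digest(REORDERED)`, because the two serializations carry the same items, while `infoset_digest(WITHOUT_MU)` differs, because removing the attribute changes the set of attribute information items. The length prefixes and item counts are what keep the encoding unambiguous; the choice of SHA-256 is incidental.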
For this reason, it's just an implementation detail how you actually build up the signature. I claim it is manifestly possible and practical to invent codes with the characteristic described (i.e., same infoset ==> same code, different infoset ==(high probability)==> different code), and how you do it is not what's important here. I furthermore suggest that such checksums implement a semantic that will be very comprehensible and useful to users: "the same envelope is OK, any change is an error". End of story. As I say, I have built and deployed systems that use essentially this approach to checksumming a set of similarly abstract information (it turns out it was a set of declarations in the Pascal programming language). We did it using exactly the technique described above, and it worked well for users.

So, I think there's a really simple rule that facilitates doing this sort of thing: intermediaries should not make gratuitous changes to the Envelope infoset. For this reason I'm against the rewriting of mU attributes, and against the removal and insertion of empty <Header> elements. As I said in my note to Gudge, my fallback position would be to go completely the other way: to enable both removal and insertion of such equivalent forms. Allowing only removal seems to me to have some of the disadvantages of both approaches. I can concur with a WG decision that goes either of these two ways, but I think the "don't mess with it" approach is stronger architecturally.

Thanks.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Tuesday, 1 October 2002 12:14:00 UTC