- From: Daniel LaLiberte <liberte@w3.org>
- Date: Tue, 21 Sep 1999 15:17:32 -0400 (EDT)
- To: "Tim Berners-Lee" <timbl@w3.org>
- Cc: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Tim Berners-Lee writes:
> I am worried that this "meaning of document closure" thread is
> suggesting that signing parts should be construed as parts having
> meaning in context.

Here is my short version:

1. I agree that the meaning of parts of documents should not be assumed out of context.

2. I believe that a document part *can* be paired with a reference to its context. I think that is what a "document closure" is, if I understand how the term is being used here.

3. But, signing documents or parts of documents should have nothing to do with the meaning of the documents. We only sign the bits and bytes of the document, not the meaning. The signature itself may have meaning, but that is separate from the meaning of the document being signed. We can sign a package containing a document and a reference to its intended meaning, but that is different from the meaning of the signature itself.

And the long version....

> Basically, logically, a document is a sentence in
> a language, and we have applications which process documents
> according to specs, and we say that documents have meaning.
>
> Parts of documents do not have meaning per se.

I would say that parts of documents have meaning only in context. But even whole documents can have a different meaning depending on how they are referenced, and how they relate to other documents. The difference is only whether the data is immediately contained or externally referenced. This one difference seems to imply that there is also something more authoritative about immediate data, and hence the relationship to the signing. But just because you found some bytes next to other bytes doesn't by itself make them more authoritative. Rather, it is the *signing* of bytes, whether they are found together or apart, that gives us the authority to say that the two forms, together or apart, are equivalent.
But in addition to having meaning, documents also are composed of bits and bytes, and this is the level at which signatures operate. Why does it matter to a signature what a string of bytes means?

The difference between immediate vs referenced data is relevant for signing, because a referenced document must be dereferenced before its bytes can be signed, and the dereferencing introduces another risk. It may also be important to sign the reference itself, in the context of the document containing the reference.

> It is for a trust system to determine the algorithm for defining what
> can be inferred from a document signed with a given key.

Right, but what can be inferred from a document (regardless of a signature) is different from what can be inferred from the signature of a document.

> When we talk about signing parts of a document, then the only way
> I can see of giving meaning to this is to say that we are signing
> some document which is not actually given, but is formed by making
> a particular transformation on the document given.

Transformations may be at various levels of syntax and semantics, depending on where you draw the line. But mere "extraction" of the bytes of a part of a document seems like a fairly straightforward syntactic operation. The only way it can get more complex is if the bytes of the part are somehow different depending on the context.

> One can try to talk about the "semantics of a part of the document
> in its context in the document" as much as one likes but one can only
> define what it means by showing or defining that it is equivalent to some
> other notional document.

Showing semantic equivalence would be hard, but I don't think it is relevant to signing bytes.

> Life is then simplified. A signature is over a document.

But life often refuses to be so simplified.
If a document is merely a representation of a resource, and the resource contains other resources, each with its own separate representations, then we are back to considering whether a part of a document that corresponds to a separate resource might be worthy of a signature.

On a slightly different tangent, now that I am thinking about all this, how useful or necessary is it to sign parts of documents? I'd always thought of XML signatures as being contained in XML documents where the thing being signed was part of that very same document. Obviously, in this case, you don't want to sign the whole document including the signature because the signature would be partly a function of itself. So this assumes we can sign parts of documents from the outset. That doesn't mean it is necessary, however, since we could always sign something external to the XML signature document, referenced by a URI.

But sometimes we want to be able to sign not only a document referenced by a URI, but the combination of the URI and the document, so that we know that neither has changed. In that case, we have to package up the URI and either the whole document, or a reliable hash of the document, and sign the package. That package could be a document with its own URI, but we would be back at step one if we relied on resolving the URI for the package. So it seems we MUST have signatures of anonymous packages. Is this true?

> We don't have (here) to discuss what modifications may or may not be
> made to a document later. A particular sentence has been
> signed. According to the language, one may be able to deduce other
> valid things and create other believable documents by further
> manipulations but this spec doesn't have to worry about that.

Belief in what a document says (its meaning) must be distinguished from the belief only that it has been signed by some authority.
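
To make the "anonymous package" idea from earlier in this message concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the function names, the JSON package layout, and the choice of SHA-256 are not from any proposal on this list, and the keyed HMAC merely stands in for whatever real signature algorithm would be used. The point is only that the signature is computed over bytes that bind the URI to a hash of the document, so that a change to either one is detected.

```python
import hashlib
import hmac
import json


def make_package(uri: str, document: bytes) -> bytes:
    """Bind a URI to a digest of the bytes it resolved to.

    The layout (a two-field JSON object) is hypothetical. Sorted keys
    and fixed separators make the serialization deterministic, so the
    same inputs always give the same bytes to sign.
    """
    digest = hashlib.sha256(document).hexdigest()
    package = {"sha256": digest, "uri": uri}
    return json.dumps(package, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_package(package: bytes, key: bytes) -> str:
    """Stand-in 'signature': an HMAC over the package bytes.

    This is an assumption for the sketch, not a public-key signature.
    Note that it looks only at bytes, never at what they mean.
    """
    return hmac.new(key, package, hashlib.sha256).hexdigest()


def verify_package(uri: str, document: bytes, key: bytes, signature: str) -> bool:
    """Rebuild the package from the URI and document, then compare."""
    expected = sign_package(make_package(uri, document), key)
    return hmac.compare_digest(expected, signature)


# Usage: the package itself has no URI; it exists only as signed bytes.
key = b"example-key"
uri = "http://example.org/doc"
document = b"<doc>hello</doc>"
signature = sign_package(make_package(uri, document), key)

print(verify_package(uri, document, key, signature))                   # True
print(verify_package(uri + "2", document, key, signature))             # False
print(verify_package(uri, document + b"<!-- x -->", key, signature))   # False
```

Because the package is never itself dereferenced, verifying it does not depend on resolving yet another URI, which is the "back at step one" problem described above.
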
If the meaning of a signature provided by some authority is that everything it signs is true, and if you trust that authority, then you would probably believe in any document it signs. But not all signatures have the meaning that the thing signed is true. A notary public only asserts that some particular individual signed a particular document on some particular date, not that what the document says is even meaningful.

> (By the way, I think of closure in the sense of the set of all objects
> obtained by repeated application of an operation.

That sounds like a transitive closure operation. Is it meaningful to distinguish a closure which includes references from a transitive closure with no (meaningful) references?

> I expected the term to represent the repeated operation of finding
> all dependent references within the document and signing them.

Certainly in some cases you want to sign both a document (whether a part or a whole) together with things it is dependent on, whether or not those things are explicitly referenced by the document or implicit in some application context. You also sometimes want to sign a reference itself together with the document it resolves to (as discussed above).

> Dependent references meaning something which affects the meaning and
> you won't already know and trust.

This is where it gets tricky. A dumb, generic signature function should not be required to figure out any of that, of course. Some appropriate higher-level application, together with your preferences and your web of trust, would decide what package of things should be signed. Given that, the meaning of documents is irrelevant to signing them at the level of the dumb generic signature function.

Is your real concern that we will have problems writing those appropriate higher-level applications that attempt to understand some of the semantics of documents to determine what package of things needs to be signed? I agree it will be a challenge.

--
Daniel LaLiberte
liberte@w3.org
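
A footnote on the closure discussion above: the "repeated operation of finding all dependent references" can be sketched as an ordinary transitive closure over a reference graph. The sketch below is only an illustration under stated assumptions. The function name, the dictionary that maps each document to the documents it references, and the `trusted` set (things you already know and trust, so they are not followed) are all hypothetical. Deciding which references are actually "dependent" is left to the higher-level application, as argued above.

```python
from collections import deque


def reference_closure(start, references, trusted=frozenset()):
    """Collect every document reachable from `start` by repeatedly
    following references, skipping anything in `trusted`.

    `references` maps a document name to the names it refers to
    (a hypothetical, already-extracted reference table).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            # Already-trusted targets are not part of what must be signed.
            if target in trusted or target in seen:
                continue
            seen.add(target)
            queue.append(target)
    return seen


# Usage: a small graph that contains a cycle (d refers back to a).
references = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "d": ["a"],
}

print(sorted(reference_closure("a", references)))                  # ['a', 'b', 'c', 'd']
print(sorted(reference_closure("a", references, trusted={"d"})))   # ['a', 'b', 'c']
```

The resulting set is the "package of things" that a generic signature function would then be handed as plain bytes.
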
Received on Tuesday, 21 September 1999 15:17:34 UTC