Re: Decryption Transform processing question from merlin on 2002-05-29 (xml-encryption@w3.org from May 2002)

From: merlin <merlin@baltimore.ie>
Date: Wed, 29 May 2002 21:06:08 +0100
To: reagle@w3.org
Cc: "Takeshi Imamura" <IMAMU@jp.ibm.com>, xml-encryption@w3.org
Message-Id: <20020529200608.439894432D@yog-sothoth.ie.baltimore.com>
r/reagle@w3.org/2002.05.29/15:27:34
>On Wednesday 29 May 2002 02:59 pm, merlin wrote:
>> >1. If more than one "layer" of decryption is required (to peel back
>> >super-encrypted elements) then multiple decryption transforms should be
>> >used, with the input of a later transform being the output of the
>> > preceding decryption transform.
>>
>> This is what I advocate. That is, a single decryption transform
>> operates on all the unexceptional EncryptedData elements in
>> its input node set in a single phase, without iteration.
>
>Ok, perhaps we all agree on this then? This is what I believe the spec 
>currently specifies: decryptIncludedNodes() selects the e's it operates on 
>at the start, so it wouldn't find new e that were super-encrypted and 
>decryptXML doesn't recurse.

The spec currently states that you iteratively:
  "Within X, select e, an element node with the type xenc:EncryptedData
  such that it is not referenced by any dcrpt:Except elements".
Then, it decrypts e, wraps, parses, and replaces X with Y
(the new document) and loops. X is now a new document which
may include super-encrypted elements.

So this is stochastic ("an element") and iterative (X is
replaced with Y, loop until no more unexcepted elements).

>> If super-encryption has occurred then multiple decryption
>> transforms are required. The second decryption transform
>> will explicitly and clearly be operating in the context of
>> a dummy-rooted document and so can be correctly constructed
>> in this accord.
>
>I fear this is one of the parts I don't understand yet. Isn't that dummy 
>root removed? "Y is the node-set obtained by removing the root node, the 
>wrapping element node, and its associated set of attribute and namespace 
>nodes from the node-set obtained in Step 4. Return Y."

Right, but if you are constructing an XPointer then the
dummy node still exists from an XPath-processing perspective
so if you wish to write full XPointer exceptions, the form
#xpointer(/dummy/Blah) will be needed. I think this is a
rare and documentable case.
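
To illustrate, here is a small Python sketch of what the parsed
tree looks like once plaintext has been wrapped and parsed. The
wrapper name "dummy" and the element "Blah" are taken from the
form above; the helper absolute_paths is hypothetical and only
lists location paths, it is not an XPointer processor.

```python
import xml.etree.ElementTree as ET

# Plaintext wrapped in the dummy element, as it would be parsed.
wrapped = "<dummy><Blah/></dummy>"
root = ET.fromstring(wrapped)


def absolute_paths(elem, prefix=""):
    """Yield the absolute location path of elem and of its descendants."""
    path = prefix + "/" + elem.tag
    yield path
    for child in elem:
        yield from absolute_paths(child, path)


paths = list(absolute_paths(root))
print(paths)
```

Every path to Blah passes through the wrapper, which is why a full
XPointer exception has to name it.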

>> The first
>> element that is decrypted will have references and
>> exceptions resolved with respect to the initial document.
>> All subsequent elements will then be processed within a dummy
>> document. The random nature of this gives rise to the
>> non-determinism for a single decryption transform.
>>
>> <Document>
>>   <SignedPart>
>>     <EncryptedData CipherReference=#data />
>>     <EncryptedData <Data Id='data' /> />
>>   </SignedPart>
>>   <Data Id='data' />
>> </Document>
>
>Ah, this has the potential to be a useful example. So I understand well, 
>"<Data Id='data' />" was the clear text encrypted? I'd also ask in what 
>order were they encrypted? It should be peeled back in the order in which 
>it was wrapped. If you're saying they were both available in the clear text 
>form then this was an invalid document.

It is, as I say, contrived. The inner Data element containing
arbitrary data is first encrypted (EncryptedData#2). Then
EncryptedData#1 is generated and its ciphertext is put in a
new Data element that happens to have the same Id as the one
that is now encrypted. So there never was an invalid document.
Now, run this through the decryption transform as currently
specified and the behaviour is indeterminate.

>> If I decrypt EncryptedData #1 first, then it will get its
>> data from the outer Data element. If I decrypt EncryptedData
>> #2 first, then EncryptedData #1 will be processed in a document
>> where only the new Data exists, so it will get its data from
>> there. This is contrived but stochastic. The rules I suggest
>> are deterministic: Both will be decrypted within the context
>> of this document during a single phase.
>
>achh, I'm losing it again, but I'm getting a sense of where I'm losing 
>it. I'm saying there should be as many decryption transform "unwrappings" 
>as there were encryption "wrappings." You seem to agree, but you're also 
>arguing that more than one element can be encrypted in a layer/wrapping? I 
>think I need your example played out over the steps to understand.

The problem is the iterative formulation of the current
decryption transform. Take that sample Document; one
possibility is, referencing the spec:

X is #SignedPart
R is empty

1. select EncryptedData#1
2. Applies
  1. Let C be parsing context of X
  2. Let Y be decryptXML (X, e, C)
       1. serialize X
       2. Decrypt EncryptedData#1 (will decrypt the outer Data element)
       3. Wrap this plaintext in C
       4. Parse this stream
       5. Let Y be the node-set obtained by removing the wrapping
     Let X be Y
3. Inapplicable
4. Go to 1, exits

This is just plain wrong; the output is just the result of
decrypting the first EncryptedData that we happened to select.

Now, I presume that the spec means to say in decryptXML(X,e,C):
  Convert X to an octet stream, but replace the octets of the
  element e and its children with the result of decrypting e
  and then wrap the resulting plaintext in C.
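
A minimal Python sketch of that reading, working on strings rather
than node sets. The function name decrypt_xml, the bare "dummy"
wrapper and the sample elements are assumptions for illustration;
a real parsing context C would also carry namespace declarations
and the like, and decryption itself is not modelled (the plaintext
is passed in).

```python
import xml.etree.ElementTree as ET


def decrypt_xml(x_octets, e_octets, plaintext):
    """Replace the octets of e in serialized X, wrap, parse, strip."""
    # Substitute the decrypted plaintext for the serialized element e.
    replaced = x_octets.replace(e_octets, plaintext, 1)
    # Wrap in the (here trivial) parsing context and parse.
    root = ET.fromstring("<dummy>" + replaced + "</dummy>")
    # Return the content without the wrapping element.
    return list(root)


x = "<SignedPart><EncryptedData Id='e1'/><Other/></SignedPart>"
y = decrypt_xml(x, "<EncryptedData Id='e1'/>", "<blah/>")
tags = [child.tag for child in y[0]]
print(y[0].tag, tags)
```

The rest of X survives the step, which is the difference from the
first walkthrough above.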

If this is the case, then we go:

X is #SignedPart as before
R is empty

1. select EncryptedData#1
2. Applies
  1. Let C be parsing context of X
  2. Let Y be decryptXML (X, e, C)
       . Decrypt EncryptedData#1 (will decrypt the outer Data element)
       . Serialize X, replacing e with this plaintext and wrap in C
       . Parse this stream
       . Let Y be the node-set obtained by removing the wrapping
     Let X be Y
3. Inapplicable
4. Go to 1

at this point, we have X=
  (dummy)<SignedPart>
    <blah />
    <EncryptedData <Data Id='data' /> />
  </SignedPart>(/dummy)

1. select EncryptedData#2
2. Applies
  1. Let C be parsing context of X
  2. Let Y be decryptXML (X, e, C)
       . Decrypt EncryptedData#2
       . Serialize X, replacing e with this plaintext and wrap in C
       . Parse this stream
       . Let Y be the node-set obtained by removing the wrapping
     Let X be Y
3. Inapplicable
4. Go to 1, exits

the output X is:
  (dummy)<SignedPart>
    <blah />
    <Data Id='data' />
  </SignedPart>(/dummy)

If we run this process and happen to select EncryptedData#2 first,
then the result will be different; EncryptedData#1 will try and
decrypt the data from EncryptedData#2.
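
The order dependence can be reproduced with a toy model in Python.
Everything here is a stand-in: decrypt does no cryptography, the
Id lookup is a plain dict, and the "garbage" string just marks the
case where EncryptedData#1 picked up the wrong Data element.

```python
def decrypt(e, ids):
    """Toy stand-in for decrypting one EncryptedData element."""
    if e == "EncryptedData#1":
        # Uses a CipherReference to #data, resolved against the Ids
        # visible in the document being processed at that moment.
        source = ids["data"]
        if source == "outer":
            return ["<blah/>"]
        return ["<garbage from %s Data/>" % source]
    # EncryptedData#2 carries its ciphertext inline; it yields the inner Data.
    return [("Data", "data", "inner")]


def iterative(order):
    x = ["EncryptedData#1", "EncryptedData#2"]  # children of SignedPart
    ids = {"data": "outer"}                     # Ids in the initial document
    for e in order:
        plain = decrypt(e, ids)
        x = [n for node in x for n in (plain if node == e else [node])]
        # X is now a re-parsed document holding only SignedPart, so the
        # outer Data is gone and only Ids inside X remain visible.
        ids = {n[1]: n[2] for n in x if isinstance(n, tuple)}
    return x


first = iterative(["EncryptedData#1", "EncryptedData#2"])
second = iterative(["EncryptedData#2", "EncryptedData#1"])
print(first)
print(second)
```

The two selection orders give different outputs for the same input.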

The formulation I propose first decrypts all the EncryptedData
elements in the context of the input document; then it
serializes X, replacing each EncryptedData element with its
plaintext, then it wraps this, parses it, strips it and that is
the result.
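
A sketch of this single-phase formulation on the same sample, again
as a toy Python model under the same assumptions (no real
cryptography, Ids as a dict, wrap/parse/strip left out because the
node list stands in for the parsed result).

```python
def decrypt(e, ids):
    """Toy stand-in for decrypting one EncryptedData element."""
    if e == "EncryptedData#1":
        # CipherReference to #data, resolved against the given Ids.
        source = ids["data"]
        if source == "outer":
            return ["<blah/>"]
        return ["<garbage from %s Data/>" % source]
    # EncryptedData#2 yields the inner Data element.
    return [("Data", "data", "inner")]


def single_phase(order):
    x = ["EncryptedData#1", "EncryptedData#2"]  # children of SignedPart
    ids = {"data": "outer"}                     # Ids in the input document
    # Phase 1: decrypt every selected element against the input document.
    plain = {e: decrypt(e, ids) for e in order}
    # Phase 2: serialize X once, substituting each element's plaintext.
    return [n for node in x for n in plain.get(node, [node])]


a = single_phase(["EncryptedData#1", "EncryptedData#2"])
b = single_phase(["EncryptedData#2", "EncryptedData#1"])
print(a == b, a)
```

Because every reference is resolved before anything is replaced,
the order in which the elements are visited no longer matters.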

Merlin
Received on Wednesday, 29 May 2002 16:07:21 UTC