Re: Decryption Transform processing question from Takeshi Imamura on 2002-07-05 (xml-encryption@w3.org from July 2002)

From: Takeshi Imamura <IMAMU@jp.ibm.com>
Date: Fri, 5 Jul 2002 17:21:47 +0900
To: merlin <merlin@baltimore.ie>
Cc: reagle@w3.org, xml-encryption@w3.org
Message-ID: <OF745ADA91.B612580E-ON49256BED.0018FCD4@LocalDomain>

Hi Merlin,

Thank you for the comments.

>>   O = foo(N, E)
>
>Maybe decryptTransform(N, E).

Joseph proposed "marshal".  Which do you prefer?

>>          where N is a node-set and E is a set of exception URIs held by
>>          URI attributes of dcrpt:Except elements. O is a node-set,
>>          computed as follows:
>
>I mildly prefer the separation of syntax (i.e., Except
>elements) and semantics (i.e., a set of exception URIs)
>adopted by exc c14n and xpath filter 2.0; however, this is
>really a minor point.

That is a good point, but if we do so, we should define each term (e.g.,
exception URI) explicitly.

>>         2. Let Y be [69]bar(N, X).
>
>Maybe decryptNodeSet(N, X)?

Joseph proposed just "decrypt".  Again, which do you prefer?

>We don't state the 'type' of Y.

We should do so.  As stated in bar(), Y is a set of node-sets and/or octet
streams.

>>         3. Convert N to an octet stream as described in [70]The
>>            Reference Processing Model (section 4.3.3.2) of the XML
>>            Signature specification [[71]XML-Signature]; but, in place of
>>            any decrypted xenc:EncryptedData element d and its
>>            descendants, if O[d] from Y is a node-set, convert it to an
>>            octet stream, and if it is already an octet stream, just emit
>>            it as is. Let C be the resulting octet stream.
>
>I wonder should this clarify that xenc:EncryptedData elements
>may be replaced from N or from some replacement node set O[d].

This sentence may be awkward, but I wanted to state that when a decrypted
EncryptedData element is replaced with a node-set, any decrypted
EncryptedData elements in the node-set are also replaced recursively.
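
As a rough illustration of that recursion (a sketch only, not text from the
draft): the toy model below is hypothetical. A node is either a string or
an EncryptedData placeholder that already carries its decrypted
replacement, and `serialize` and the `exceptions` argument are names
invented for this example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class EncryptedData:
    """Hypothetical stand-in for an xenc:EncryptedData element."""
    id: str
    plaintext: List["Node"]  # stands in for the result of decrypting it


Node = Union[str, EncryptedData]


def serialize(nodes: List[Node], exceptions: frozenset = frozenset()) -> str:
    """Emit nodes in order. Each EncryptedData that is not an exception is
    replaced by the serialization of its replacement, recursively."""
    out = []
    for node in nodes:
        if isinstance(node, EncryptedData):
            if node.id in exceptions:
                # Excepted elements are left undecrypted.
                out.append(f"<EncryptedData Id='{node.id}'/>")
            else:
                # The replacement may itself contain EncryptedData.
                out.append(serialize(node.plaintext, exceptions))
        else:
            out.append(node)
    return "".join(out)
```

With one EncryptedData nested inside another, both are replaced unless the
inner one is named as an exception, in which case only the outer one is.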

>>         5. Let O, the output of this function, be a node-set converted
>>            from C as described in [75]The Reference Processing Model
>>            (section 4.3.3.2) of the XML Signature specification
>>            [[76]XML-Signature].
>>               o If parsing of C fails, then the implementation MAY
>>                 signal a failure of the transform. Alternatively, it MAY
>>                 also restart processing from the previous step without
>>                 replacing any decrypted xenc:EncryptedData element
>>                 causing a parsing error.
>
>I don't understand why this MAY text is here. Surely the
>implementation MUST signal a failure of the transform. It
>cannot reasonably determine which EncryptedData caused the
>parsing error (it will be parsing an octet stream resulting
>from canonicalizing multiple node sets); and, even if it could,
>what's the reason for this?

It may not be easy to determine which EncryptedData elements caused the
error.  However, I think that there are cases where not all decrypted
EncryptedData elements have to be replaced for signature verification.  For
example, consider a case where an XPath filtering transform follows the
decryption transform and removes unnecessary nodes, which may include
EncryptedData elements.  I didn't want to preclude such a case.

>>         1. Let D be a node-set containing all element nodes in N with
>>            the type xenc:EncryptedData that are not identified by any
>>            location-set in X.
>>               o If N is a node-set yielded by decrypting an
>>                 xenc:EncryptedData element, only location-sets in X
>>                 resulting from exception URIs with a full XPointer
>>                 "xpointer(id('ID'))" or bare name [[77]XPointer] are
>>                 considered.
>
>I think the second paragraph actually needs to go down below:
>The recursive call to bar() must be made with a new exception
>node set X because the original X will not identify any nodes
>in a decrypted node set.

I agree.  My text is incorrect.

>>               o If the Type attribute is absent or its value is neither
>>                 [82]&xenc;Element nor [83]&xenc;Content, the result is
>>                 an octet stream by default. However, the implementation
>>                 MAY process it further in accordance with its type, if
>>                 any, resulting in a node-set.
>
>I think that this is wrong, for the following reasons:
>  . If the encrypted data are serialized XML, the encryptor
>    should have used the correct type (Element or Content).
>    If they have not used this type; then, on the balance
>    of probability, the plaintext data are not serialized XML;
>    something else is intended.
>  . We have established that serialized XML matching the
>    rules for Element or Content cannot be used as a
>    plaintext replacement in the main canonicalization-
>    with-replacement operation; the node set must be augmented
>    with inherited XML attributes. This type of augmentation
>    cannot be performed on an octet stream.
>  . Essentially this breaks the Type system. While MimeType
>    is advisory, Type is not; it is encrypted XML's strong
>    typing system. For example, one of Joseph's standard Type
>    examples is CompressedXML. When decrypted without regard
>    for the Type, the result is serialized XML. However, this
>    is not what the result of decryption should be. The result
>    of decryption is the result of parsing this Compressed
>    XML, uncompressing the data it holds and returning the
>    resulting parsed node set, as specified in the definition
>    of the Compressed Type URI.
>
>Overall, I think that the decryptor MUST process the data in
>accordance with the value of its Type attribute and the result
>MUST be a node set. This is specifically a signature transform,
>and signature processing is uniformly defined in terms of node
>sets, so I think this is a reasonable restriction. In fact,
>by placing this restriction we make the transform much more
>powerful because it transparently handles new encryption Types
>such as CompressedXML, SerializedJavaNodeSet, etc. We cleanly
>punt the definition of this processing to the respective
>Type specifications.

I don't think that the decryptor can process every type of data, nor that
it should be required to.  So I think that it makes sense to specify the
processing rules for the case where the decryptor cannot process a certain
type of data.
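
A minimal sketch of such a fallback rule, under stated assumptions: the
function name `replacement_for` is hypothetical, ElementTree elements stand
in for node-sets, the dummy wrapper `<w>` is only a parsing aid for Content,
and inherited namespace and xml: attribute context is ignored.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Union

XENC = "http://www.w3.org/2001/04/xmlenc#"
ELEMENT = XENC + "Element"
CONTENT = XENC + "Content"


def replacement_for(plaintext: bytes,
                    type_uri: Optional[str]) -> Union[ET.Element, bytes]:
    """Turn decrypted octets into a replacement according to Type."""
    if type_uri == ELEMENT:
        # Plaintext is one serialized element.
        return ET.fromstring(plaintext)
    if type_uri == CONTENT:
        # Plaintext is element content; parse it under a dummy wrapper
        # (assumption: a simplification for this sketch).
        return ET.fromstring(b"<w>" + plaintext + b"</w>")
    # Type absent or not understood: fall back to the raw octet stream.
    return plaintext
```

An implementation that understands further Type values could add branches
before the final fallback; one that does not still has a defined result.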

>>               o If decryption of any xenc:EncryptedData element fails,
>>                 then the implementation MAY signal a failure of the
>>                 transform. Alternatively, it MAY also skip such
>>                 xenc:EncryptedData element and continue processing.
>
>As before, I don't understand why it MAY proceed. This will
>mislead the recipient into thinking the data were modified
>when in fact a decryption operation simply failed. We have a
>defined exception mechanism that identifies what data should
>not be decrypted. Why have this second, different exception
>mechanism?

Encrypting parts of a document itself modifies the document, so I don't
think that this misleads the recipient.  In any case, as I wrote above, I
didn't want to preclude a case where not all decrypted EncryptedData
elements have to be replaced.

>>           2. Replace Y with Y ∪ {O[d]}.
>
>I like the union operator (I see it in mozilla/X11, Joseph) but
>I'm not sure it's right; I think Y is a map from EncryptedData
>elements to their replacement node sets. Using union might
>suggest you merge the node sets, which is not the case.

The union operation here does not merge two node-sets but two sets of
node-sets.  O[d] is a node-set, so {O[d]} is a set containing a node-set.
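
The distinction can be shown with a small sketch. Modelling a node-set as a
frozenset of node names is an assumption made purely for illustration.

```python
# A node-set is modelled as a frozenset of node names (hypothetical).
y = {frozenset({"a", "b"})}            # Y: a set holding one node-set
o_d = frozenset({"c"})                 # O[d]: a single node-set

y_new = y | {o_d}                      # Y ∪ {O[d]}: Y gains one more member
merged = frozenset({"a", "b"}) | o_d   # merging node-sets: NOT what the step does

print(len(y_new))   # 2 (two separate node-sets)
print(len(merged))  # 3 (one node-set of three nodes)
```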

>>           3. If O[d] is a node-set:
>>
>>             1. Let Y' be [84]bar(O[d], X).
>>             2. Replace Y with Y ∪ Y'.
>
>I think that X should not be passed here, but X' which is
>computed by dereferencing the barename XPointer exceptions into
>the new node set. Further, because the new node set will not
>be validated, I think that we need to explicitly state that X'
>is constructed from only those EncryptedData elements with
>a matching Id attribute.
>
>This also means that E needs to get passed into this
>function. Also, we might do better to construct the map Y in
>the outer function and pass it into the inner function which
>will populate it.

So how about moving Step 1 of foo() into bar() and also changing the
arguments of bar() from N and X to N and E?  As to Y, I don't have any
objection.

>In the end, I'm not sure that the separation of functions
>necessarily makes things clearer.

Your original text treats EncryptedData element nodes in different
node-sets uniformly and that seems to me a little odd and confusing.  So I
divided the processing rules in two.

>Also, I think that we should /only/ dereference barename
>XPointers (#foo), and not full XPointers (#xpointer('foo')).
>  . There is no benefit to using the latter form in an
>    exception, so supporting them doesn't help anyone.
>  . In my implementation (and I presume others), I simply
>    pass the URI into my XPointer dereferencer. While I can
>    easily identify barename XPointers at this outer level,
>    I really don't want to have to parse the XPointer to
>    determine whether it matches # S 'xpointer' S '(' S
>    LITERAL S ')'.

There may not be any benefit, but to my understanding, the former form is
just shorthand for the latter one, and so it seems only natural that we
should support the latter one, too.
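
For what it is worth, recognizing both forms at the outer level need not
involve a full XPointer parser. The sketch below is only an illustration:
`exception_id` is a hypothetical helper, and the name pattern is a
simplification of the real NCName rules.

```python
import re
from typing import Optional

# Simplified name pattern; real NCName rules are broader (assumption).
_NAME = r"[A-Za-z_][\w.\-]*"
_BARE = re.compile(rf"#({_NAME})")
_FULL = re.compile(
    rf"#\s*xpointer\s*\(\s*id\s*\(\s*(['\"])({_NAME})\1\s*\)\s*\)"
)


def exception_id(uri: str) -> Optional[str]:
    """Return the ID named by a bare-name or xpointer(id('ID')) exception
    URI, or None for any other kind of URI."""
    m = _BARE.fullmatch(uri)
    if m:
        return m.group(1)
    m = _FULL.fullmatch(uri)
    if m:
        return m.group(2)
    return None
```

Both `#foo` and `#xpointer(id('foo'))` then reduce to the same ID, and any
other XPointer is rejected without being dereferenced.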

>>     * Super-encryption may cause problems if a super-encrypted
>>       xenc:EncryptedData element uses same-document references, or if an
>>       exceptional super-encrypted xenc:EncryptedData element is
>>       referenced by a URI with an XPointer other than a full XPointer
>>       "xpointer(id('ID'))" or a bare name.
>
>Comment as above.
>
>As before, I would like a separate binary mode for this
>transform. I think that signing encrypted binary data is a
>fundamentally different operation; the input can never have
>been a plaintext node set, so I think this deserves its own
>sub-URI. If it throws an error on multiple EncryptedData,
>so be it; however, I think it should exist.

As I wrote in my previous mail, I have already agreed to this.

Thanks,
Takeshi IMAMURA
Tokyo Research Laboratory
IBM Research
imamu@jp.ibm.com
Received on Friday, 5 July 2002 04:22:15 UTC