XML Decryption Transform #13

Or Part Thereof; a Suggestion

Authors/Editors: Merlin Hughes, Baltimore Technologies Ltd., <merlin@baltimore.ie>

Status of this document

This document has no status; it is a suggestion proposed for possible consideration.

1. Specification

The decryption transform has two modes of operation: Binary mode and XML mode. Characteristics held in common by these two modes are described in the following sections, followed by an explicit specification of each mode.

1.1 Input and Output

The input required by this transform is an XPath node-set over the input document. If the input is an octet stream, then the application MUST convert the octet stream to an XPath node-set, as specified in XMLDSIG.

The output of the transform depends on its mode of operation: the binary-mode transform produces an octet stream; the XML-mode transform produces a node set that belongs to a different document from the input document.

1.2 Parameters

The parameter to this transform is a set of same-document exception URIs which, when dereferenced in the context of the input document to the transform (i.e., not necessarily the same document as the transform itself), identify EncryptedData elements that should not be decrypted. These are data that were already encrypted when the signature was generated. Blah blah.

1.2½ Syntax

Blah blah blah encoding the set of exception URIs in Except elements.

1.3 Binary Mode

The binary mode of operation is intended for use when generating a signature over binary data that are to be encrypted for transmission to the recipient. Use of this mode of the transform allows a signature to be computed over the plaintext form of the data, rather than the opaque ciphertext. This further allows the ciphertext to be stored elsewhere, identified by a cipher reference, without the need for the signature to take this into account.

As described in section 1.1, the input to this mode of the transform is a node set, and the output is an octet stream.

1.3.1 Algorithm Identifier

The XML Signature Recommendation [XML-DSig] uses a URI [URI] to identify each algorithm to be performed when creating or validating a signature. The binary-mode decryption transform is identified as follows:

Algorithm Identifier: http://merlin.org/xml-decrypt#Binary

1.3.2 Processing Model of Binary-Mode Decryption Transform

The input to this transform is a node set N. The parameter to this transform is a set of exception URIs E. The output is an octet stream O, computed as follows:

1. Dereference each exception URI e, from E, in the context of the owner document of the input node set N, resulting in the location sets x_e.

   If an exception URI fails to dereference any nodes then the resulting error MUST be ignored; it may be the result of part of the input document being encrypted.

2. Let the node set D be all element nodes in N, with the namespace URI &xenc; and local name EncryptedData, that are not identified by any location set x_e.
3. Decrypt each EncryptedData element d, from D, without regard for which, if any, of its descendants are in N, and without consideration of its Type attribute, resulting in the octet streams o_d.

   If decryption of any EncryptedData element fails, then this transform MUST signal failure.

4. Let O, the output of this transform, be the concatenation of the octet streams o_d, ordered in the document order of d.
   - If there are no EncryptedData elements in D, then the result is a zero-length octet stream.
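
The binary-mode steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: exception URIs are reduced to a set of barename Id values, N is taken to be a whole subtree, the &xenc; namespace is assumed to be http://www.w3.org/2001/04/xmlenc#, and decrypt is a hypothetical caller-supplied callback, since key handling is not covered here.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI that &xenc; stands for in this document.
XENC_NS = "http://www.w3.org/2001/04/xmlenc#"
ENCRYPTED_DATA = f"{{{XENC_NS}}}EncryptedData"


def binary_mode_transform(root, exception_ids, decrypt):
    """Concatenate the plaintexts of all unexceptional EncryptedData under root.

    exception_ids: set of Id values that must not be decrypted (simplification:
                   only barename exceptions are modelled).
    decrypt:       hypothetical callback, EncryptedData element -> bytes; any
                   exception it raises propagates, so the transform fails.
    """
    output = bytearray()
    # Element.iter() walks the subtree in document order.
    for element in root.iter(ENCRYPTED_DATA):
        if element.get("Id") in exception_ids:
            continue
        output += decrypt(element)
    # With nothing to decrypt this is a zero-length octet stream.
    return bytes(output)


doc = ET.fromstring(
    f'<Document xmlns:xenc="{XENC_NS}">'
    '<xenc:EncryptedData Id="a">first</xenc:EncryptedData>'
    '<xenc:EncryptedData Id="skip">ignored</xenc:EncryptedData>'
    '<xenc:EncryptedData Id="b">second</xenc:EncryptedData>'
    "</Document>"
)


def fake_decrypt(element):
    # Stand-in "decryption" that simply returns the element text as bytes.
    return element.text.encode("ascii")


result = binary_mode_transform(doc, {"skip"}, fake_decrypt)
```

The Type attribute is never consulted, matching the requirement that binary mode decrypts without consideration of it.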

1.3.3 Example of Binary-Mode Decryption Transform

Consider the following example signed document:

<Document>
  <xenc:EncryptedData Id="image" MimeType="image/png" ...>
    ...
    <!-- image data -->
    ...
  </xenc:EncryptedData>
  <dsig:Signature ...>
    ...
    <dsig:Reference URI="#image">
      <dsig:Transforms>
        <dsig:Transform Algorithm="http://merlin.org/xml-decrypt#Binary" />
      </dsig:Transforms>
      ...
    </dsig:Reference>
    ...
  </dsig:Signature>
</Document>

Much of the encrypted data and signature are elided; the implication of the comment in the encrypted data is that the encrypted content is a binary image.

Execution of the decryption transform will proceed as follows:

The input to the transform, N, is a node set containing the EncryptedData element and its children, less comments. The parameter to the transform, E, is empty.
As a result, D is a node set consisting of the one EncryptedData element, d_image. This is decrypted, resulting in an octet stream o_image containing the plaintext of the binary image.
There are no other data to decrypt, so the result of this transform is the plaintext obtained in the previous step. This will be used directly as input to the digest algorithm.

1.4 XML Mode

The XML mode of operation is intended for use when generating a signature over XML data, parts of which may be encrypted by subsequent processing. Use of this mode of the transform allows the recipient to verify a signature over the XML data, automatically decrypting parts of the data that were encrypted after generation of the signature. Support is provided for identifying parts of XML that were encrypted when the signature was generated; and, with some limitations, for undoing super-encryption of parts of the signed data.

As described in section 1.1, the input to this mode of the transform is a node set, and the output is a node set.

1.4.1 Algorithm Identifier

The XML Signature Recommendation [XML-DSig] uses a URI [URI] to identify each algorithm to be performed when creating or validating a signature. The XML-mode decryption transform is identified as follows:

Algorithm Identifier: http://merlin.org/xml-decrypt#XML

1.4.2 Processing Model of XML-Mode Decryption Transform

The input to this transform is a node set N. The parameter to this transform is a set of exception URIs E. The output is a node set O, computed as follows:

1. Dereference each exception URI e, from E, in the context of the owner document of the input node set N, resulting in the location sets x_e.

   If an exception URI fails to dereference any nodes then the resulting error MUST be ignored; it may be the result of part of the input document being encrypted.

2. Let the node set D₀ be all element nodes in N with the namespace URI &xenc; and local name EncryptedData, that are not identified by any location set x_e.
3. Decrypt each EncryptedData element d, from D_i (initially i is 0), without regard for which, if any, of its descendants are in N, and process it in accordance with the value of its Type attribute, resulting in the node sets o_d.

   - For example, processing of an EncryptedData with Type &xenc;Content or &xenc;Element is specified in XMLENC, §4.3.1, and the result is a node set.
   - If the value of any Type attribute is unknown, or if the result of its processing is not a node set, then this transform MUST signal failure.
   - If decryption of any EncryptedData element fails, then this transform MUST signal failure.

4. Let the set D_(i+1) be all element nodes in any replacement node set o_d with the namespace URI &xenc; and local name EncryptedData, that are not identified by a barename ID reference from any exception URI in E.
5. Repeat steps 3 and 4 for i ← i+1 while D_(i+1) is non-empty; i.e., as long as new, unexceptional super-encrypted EncryptedData are identified.
6. Canonicalize the input node set N according to C14N (with comments); but, in place of any EncryptedData element d, from any D_i, and its descendants, canonicalize the decrypted node set o_d. Let the resulting octet stream be C.

   Canonicalization of replacement node sets must be augmented as follows:
   - An xmlns="" declaration must be emitted with every apex element that has a null prefix and namespace URI.
   - If a node set is replacing an element from N whose parent element is not in N, then its apex elements must inherit attributes from the xml namespace.

   Note that the resulting octet stream may not be in canonical form.

7. Let O, the output of this transform, be the result of parsing C in accordance with XMLDSIG.

If there are no EncryptedData elements in D₀, then N is still canonicalized and reparsed.
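
The iterative part of this processing model (compute D₀, decrypt, and repeat while new unexceptional EncryptedData appear) can be sketched in Python as follows. This is a simplified illustration, not the transform itself: it replaces elements in place in an ElementTree rather than performing canonicalization-with-replace and reparsing, it models only barename exceptions as a set of Id values, it does not handle a root element that is itself an EncryptedData, and decrypt is a hypothetical callback that returns the replacement elements. The &xenc; namespace is assumed to be http://www.w3.org/2001/04/xmlenc#.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI that &xenc; stands for in this document.
XENC_NS = "http://www.w3.org/2001/04/xmlenc#"
ENCRYPTED_DATA = f"{{{XENC_NS}}}EncryptedData"


def decrypt_rounds(root, exception_ids, decrypt):
    """Replace unexceptional EncryptedData under root until none remain.

    decrypt: hypothetical callback, EncryptedData element -> list of Elements.
    Returns the number of decryption rounds performed.
    """
    rounds = 0
    while True:
        # D_i: unexceptional EncryptedData currently present. After round
        # one, these can only be elements revealed by earlier replacements.
        targets = [
            (parent, child)
            for parent in root.iter()
            for child in parent
            if child.tag == ENCRYPTED_DATA
            and child.get("Id") not in exception_ids
        ]
        if not targets:
            return rounds
        for parent, child in targets:
            position = list(parent).index(child)
            replacement = decrypt(child)  # an exception here fails the transform
            if replacement:
                # Keep the text that followed the encrypted element.
                replacement[-1].tail = child.tail
            parent.remove(child)
            for offset, node in enumerate(replacement):
                parent.insert(position + offset, node)
        rounds += 1


# Toy plaintexts keyed by Id, standing in for real decryption.
plaintexts = {
    "part-2": f'<Part number="2"><e:EncryptedData xmlns:e="{XENC_NS}" Id="data-2"/></Part>',
    "data-2": "<Data>d</Data>",
}
doc = ET.fromstring(
    f'<ToBeSigned xmlns:e="{XENC_NS}">'
    '<e:EncryptedData Id="secret-1"/><e:EncryptedData Id="part-2"/>'
    "</ToBeSigned>"
)
rounds = decrypt_rounds(
    doc, {"secret-1"}, lambda el: [ET.fromstring(plaintexts[el.get("Id")])]
)
```

In this toy run the first round reveals a super-encrypted element, the second round decrypts it, and the excepted element is left untouched throughout.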

1.4.3 Example of XML-Mode Decryption Transform

Consider the following example signed document:

<Document>
  <ToBeSigned Id="tbs">
    <Part number="1">
      <Data>...</Data>
      <xenc:EncryptedData Id="secret-1" ... />
    </Part>
    <Part number="2">
      <Data>...</Data>
    </Part>
    <Secrets>
      <xenc:EncryptedData ... />
      <xenc:EncryptedData ... />
    </Secrets>
  </ToBeSigned>
  <dsig:Signature ...>
    ...
    <dsig:Reference URI="#tbs">
      <dsig:Transforms>
        <dsig:Transform Algorithm="http://merlin.org/xml-decrypt#XML">
          <dcrpt:Except URI="#secret-1" />
          <dcrpt:Except URI="#xpointer(id('tbs')/Secrets/*)" />
        </dsig:Transform>
      </dsig:Transforms>
      ...
    </dsig:Reference>
    ...
  </dsig:Signature>
</Document>

Much of the encrypted data and signature are elided. The Except elements identify parts of the document that were encrypted when the signature was generated.

Consider, then, that this document is subsequently encrypted by various processes, resulting in the following:

<Document>
  <ToBeSigned Id="tbs">
    <xenc:EncryptedData Id="part-1" Type="&xenc;Element" ... />
    <xenc:EncryptedData Id="part-2" Type="&xenc;Element" ... />
    <Secrets>
      <xenc:EncryptedData ... />
      <xenc:EncryptedData ... />
    </Secrets>
  </ToBeSigned>
  <dsig:Signature ...>
    ...
    <dsig:Reference URI="#tbs">
      <dsig:Transforms>
        <dsig:Transform Algorithm="http://merlin.org/xml-decrypt#XML">
          <dcrpt:Except URI="#secret-1" />
          <dcrpt:Except URI="#xpointer(id('tbs')/Secrets/*)" />
        </dsig:Transform>
      </dsig:Transforms>
      ...
    </dsig:Reference>
    ...
  </dsig:Signature>
</Document>

Execution of the decryption transform will proceed as follows:

The input to the transform, N, is a node set containing the ToBeSigned element and its children, less comments. The parameter to the transform, E, is a set containing the two exception URIs.
The first exception URI does not resolve in this document; the second resolves to the two children of the Secrets element; this is the exception node set X.
As a result, D₀ is a node set consisting of the two EncryptedData elements, d_part-1 and d_part-2. Each of these is decrypted, resulting in the following node sets for o_part-1 and o_part-2:

<Part number="1">
  <Data>...</Data>
  <xenc:EncryptedData Id="secret-1" ... />
</Part>

<Part number="2">
  <xenc:EncryptedData Id="data-2" Type="&xenc;Element" ... />
</Part>

Note that part of the second node set has been super-encrypted.

After this decryption stage, two new EncryptedData have been revealed. However, the first matches a barename XPointer exception URI and so is not considered further; hence, D₁ contains just the EncryptedData element d_data-2. This is decrypted again, resulting in the following node set o_data-2:

<Data>...</Data>

No new EncryptedData are revealed, so D₂ is empty and processing falls through to canonicalization.
The canonicalization-with-replace operation canonicalizes the node set N; but, in place of any EncryptedData elements that were decrypted, it canonicalizes the replacement node set. Similarly, it also replaces any decrypted EncryptedData elements in the replacement node sets. Further, canonicalization of any replacement node sets is augmented such that xmlns="" is emitted on any apex elements that have a null prefix and namespace URI, as described in XMLENC §xx. The resulting canonicalized data are the following:

<ToBeSigned Id="tbs">
  <Part xmlns="" number="1">
    <Data>...</Data>
    <xenc:EncryptedData Id="secret-1" ... />
  </Part>
  <Part xmlns="" number="2">
    <Data xmlns="">...</Data>
  </Part>
  <Secrets>
    <xenc:EncryptedData ... />
    <xenc:EncryptedData ... />
  </Secrets>
</ToBeSigned>

This octet stream is then parsed and returned as the output of the transform.
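
A short Python sketch may help show why emitting xmlns="" on apex elements matters once the octet stream is reparsed. The default namespace urn:example:outer is hypothetical and does not appear in the example document; it stands for any default namespace that happens to be in scope at the point of replacement.

```python
import xml.etree.ElementTree as ET

# Hypothetical enclosing element that declares a default namespace.
OUTER = '<ToBeSigned xmlns="urn:example:outer">{}</ToBeSigned>'

# Replacement serialized without the augmentation: on reparse, Part is
# captured by the enclosing default namespace.
naive = ET.fromstring(OUTER.format('<Part number="1"/>'))

# With xmlns="" on the apex element, Part stays in no namespace.
augmented = ET.fromstring(OUTER.format('<Part xmlns="" number="1"/>'))

naive_tag = naive[0].tag          # ElementTree writes this as {uri}local
augmented_tag = augmented[0].tag  # plain local name, no namespace
```

Without the declaration, the reparsed element would differ from the one that was signed, which is presumably the situation the augmentation is meant to prevent.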

1.5 Limitations

In XML mode, the octet stream resulting from canonicalization-with-replacement MUST be well-formed. Typically this will be characterized by a single-rooted input node set. Additionally, if this node set has, at its top level, an EncryptedData element, then this should correspond to an encrypted single-rooted node set. However, this need not be the case: After decryption, multiple top-level nodes may be well-formed if they consist of whitespace, comments, processing instructions and a single element. No special processing is required to test for this condition because non-well-formed data will result in a parsing error.
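
The point that parsing itself detects the failure case can be illustrated with a small Python check. The helper name is_well_formed is hypothetical, and Python's ElementTree parser is used only as a convenient stand-in for whatever parser an implementation employs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(octets):
    """Return True if the octets parse as a single-rooted XML document."""
    try:
        ET.fromstring(octets)
        return True
    except ET.ParseError:
        return False


# Comments, processing instructions, whitespace and one element: accepted.
single_root = b"<!-- note --><?app hint?><Part/>\n"

# Two top-level elements: rejected by the parser, no extra test needed.
multi_root = b"<Part/><Part/>"

single_ok = is_well_formed(single_root)
multi_ok = is_well_formed(multi_root)
```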

Super-encryption MAY cause problems if a super-encrypted EncryptedData uses same-document references, or if an exceptional super-encrypted EncryptedData is referenced by a non-barename XPointer. Super-encryption of signed data when these conditions are met is NOT RECOMMENDED. Applications MAY nevertheless solve some of these problems through the use of encryption properties that identify exceptional super-encrypted elements, how same-document references from the encrypted data should be resolved, and to which signature such encryption properties apply; details of such a solution are beyond the scope of this specification.

Full XPointer URIs (whether in exceptions or encrypted data) may fail to resolve if encryption results in a structural change to part of the document relied upon by the reference. For example, the URI #xpointer(/ToBeSigned/*[3]) will no longer function if the first two children of the ToBeSigned element are encrypted together. Care should be taken when employing such references in association with the decryption transform.
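
The fragility of positional references can be shown with a small Python sketch. ElementTree does not implement XPointer, so the hypothetical helper nth_child stands in for a positional step such as *[3]; the element names follow the example document, and the &xenc; namespace is assumed to be http://www.w3.org/2001/04/xmlenc#.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI that &xenc; stands for in this document.
XENC_NS = "http://www.w3.org/2001/04/xmlenc#"


def nth_child(parent, n):
    """Return the n-th child element (1-based), or None if there is none."""
    children = list(parent)
    return children[n - 1] if 1 <= n <= len(children) else None


# Structure at signing time: the third child is Secrets.
before = ET.fromstring("<ToBeSigned><Part/><Part/><Secrets/></ToBeSigned>")

# The first two children encrypted together: only two children remain.
after = ET.fromstring(
    f'<ToBeSigned xmlns:xenc="{XENC_NS}">'
    "<xenc:EncryptedData/><Secrets/>"
    "</ToBeSigned>"
)

third_before = nth_child(before, 3)
third_after = nth_child(after, 3)
```

After encryption the same positional step selects nothing, while a reference by Id would be unaffected by the change in sibling positions.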

EncryptedKey text, etc.
