Last modified: Wed Apr 19 08:54:50 2000
When transmitting data over the Internet, standard encryption protocols such as IPSec and SSL are in most cases sufficient to achieve confidentiality during transmission. Secure mail formats such as Pretty Good Privacy (PGP) and S/MIME can be used to keep data encrypted even after the message is received and stored in a file system. These methods encrypt an XML document as a whole. However, there are situations in which certain parts of an XML document need to be encrypted while the rest remains in clear text. A few motivating examples are shown below.
NAME, PHONE, OFFICE, and so on, but only his/her manager can access the SALARY field.
We anticipate that the newly formed mailing list on XML encryption (xml-encryption@w3.org) will stimulate enough interest to produce a standard for XML encryption. This document outlines our requirements for such a standard and the design principles of our current implementation.
Here is a (partial) list of requirements that we have identified from the motivating scenarios above.
We are still unsure whether the following DTD-related properties are real requirements.
Before encrypting an element, we must convert it (and its subelements) into a byte sequence because most encryption algorithms treat their input as a byte sequence. There may be several methods to do this conversion, but we should be careful not to lose any essential information during the encryption and decryption processes. It is also desirable that recovery of the original information does not depend on the surrounding context, such as the character encoding of the document, the DOCTYPE declaration, and the use of namespace prefixes. In realistic B2B scenarios, a document holding encrypted elements may be passed through multiple parties before reaching its final destination, and the outer context may be changed or modified during the process.
One way to preserve the XML information set during serialization in a context-independent way is to use Canonical XML [C14N]. Another possibility is to use Java Object Serialization to map a DOM tree directly into a byte array. The use of C14N may be less efficient because canonicalized XML documents tend to be long, especially when namespaces are used, but it is more desirable because it is independent of the implementation language.
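As a rough illustration of context-independent serialization, the sketch below uses the canonicalize function from Python's standard xml.etree.ElementTree module. That function implements Canonical XML 2.0 rather than the [C14N] draft cited here, so the exact output rules are an assumption of this sketch. The helper name serialize_element and the sample attributes are hypothetical. The point is only that two lexically different spellings of the same element yield the same byte sequence.

```python
from xml.etree.ElementTree import canonicalize


def serialize_element(xml_fragment: str) -> bytes:
    """Return a canonical UTF-8 byte sequence for an XML fragment.

    Hypothetical helper: relies on Canonical XML 2.0 as implemented by the
    Python standard library, which is an assumption of this sketch.
    """
    return canonicalize(xml_fragment).encode("utf-8")


# Same element, different attribute order, quoting, and spacing.
first = serialize_element(
    '<cardinfo  b="2" a="1"><name>Hiroshi Maruyama</name></cardinfo>'
)
second = serialize_element(
    "<cardinfo a='1' b='2'><name>Hiroshi Maruyama</name></cardinfo>"
)

# Both spellings collapse to one byte sequence, ready for encryption.
print(first == second)
```

Because the canonical form carries its own attribute ordering and quoting, the bytes fed to the cipher do not depend on how an intermediate party happened to re-serialize the document.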
Once an element is converted into a byte array, the encryption and decryption processes can be done in the ordinary way. Various cryptographic algorithms (e.g., DES and RC4) and schemes (e.g., CBC mode and PKCS5Padding) exist for such processing, and those primitives should be used for XML encryption, too.
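To make the padding step concrete, here is a minimal sketch of PKCS#5 padding for an 8-byte block cipher such as DES. The function names pkcs5_pad and pkcs5_unpad are hypothetical, and the cipher itself is deliberately omitted because the Python standard library does not provide DES; a real implementation would hand the padded bytes to a vetted cryptographic library.

```python
BLOCK_SIZE = 8  # DES block size in bytes


def pkcs5_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs5_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS#5 padding, rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding value")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


# A serialized element of arbitrary length becomes block-aligned.
serialized = b"<cardinfo><name>Hiroshi Maruyama</name></cardinfo>"
padded = pkcs5_pad(serialized)
print(len(padded) % BLOCK_SIZE == 0)
```

Note that input whose length is already a multiple of the block size still receives a full block of padding, which is what makes the padding unambiguous to remove after decryption.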
An encrypted element is binary data, so a text encoding method such as Base64 must be used when inserting it into an XML document. In addition, it is necessary to include additional information for decryption, such as the encryption algorithm, the encrypted session key (when an asymmetric key is used), and the initialization vector. One desirable design property is to reuse existing cryptographic message standards as much as possible. Designing a new cryptographic message format requires careful study to avoid security flaws; using existing ones alleviates that burden.
One of the most popular cryptographic message formats is the one used in S/MIME (PKCS7/CMS). In S/MIME, the encryption is done by packaging the cipher text and other related information in a single MIME entity whose media type is application/pkcs7-mime. Since the result is a MIME entity, it can be included in an XML document in one of the standard ways to embed MIME objects, for example, the Object tag defined in XML Digital Signature [DSIG].
For example, suppose the "cardinfo" element in the following XML document is to be encrypted using an asymmetric key.
<invoice>
  <bookorder>...</bookorder>
  <payment>...</payment>
  <cardinfo>
    <name>Hiroshi Maruyama</name>
    <expiration>04/2001</expiration>
    <number>0123 4567 8901 2345</number>
  </cardinfo>
</invoice>
The "cardinfo" element is serialized using XML Canonicalization and packaged into the following MIME entity (application/pkcs7-mime requires that the plaintext always be a MIME entity):
MIME-Version: 1.0
Content-Type: text/xml; charset=utf-8
Content-Transfer-Encoding: base64

<cardinfo>
  <name>Hiroshi Maruyama</name>
  <expiration>04/2001</expiration>
  <number>0123 4567 8901 2345</number>
</cardinfo>
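The MIME packaging step can be sketched with Python's standard email package. The helper name wrap_as_mime_entity is hypothetical, and the sample fragment is simply the "cardinfo" element written without insignificant whitespace. With a utf-8 charset, MIMEText selects the base64 transfer encoding on its own, so on the wire the body appears as Base64 text rather than as readable XML.

```python
from email.mime.text import MIMEText


def wrap_as_mime_entity(canonical_xml: str) -> MIMEText:
    """Wrap a serialized XML fragment as a text/xml MIME entity.

    Hypothetical helper: MIMEText adds MIME-Version, Content-Type, and
    Content-Transfer-Encoding headers for the given subtype and charset.
    """
    return MIMEText(canonical_xml, "xml", "utf-8")


fragment = (
    "<cardinfo>"
    "<name>Hiroshi Maruyama</name>"
    "<expiration>04/2001</expiration>"
    "<number>0123 4567 8901 2345</number>"
    "</cardinfo>"
)
entity = wrap_as_mime_entity(fragment)

# These bytes are what would be handed to the PKCS#7 packaging step.
entity_bytes = entity.as_bytes()
print(entity_bytes.decode("ascii"))
```

Decoding the payload with entity.get_payload(decode=True) gives back the original UTF-8 bytes of the fragment, which is what the recipient would parse after decryption.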
This MIME entity is combined with other information such as encryption algorithm and is packaged into a PKCS#7 object. Finally, the PKCS#7 object is embedded into the corresponding part of the original XML document using a standard method of embedding a MIME entity into XML:
<invoice>
  <bookorder>...</bookorder>
  <payment>...</payment>
  <dsig:Object MimeType="application/pkcs7-mime" Encoding="base64">
    DDAKBgNVBAsTA1RSTDEZMBcGA1UEAxMQSGlyb3NoaSBNYXJ1eWFtYTAeFw05OT
    EyMTcwMDM3MzRa
    Fw0wMDAzMTYwMDM3MzRaMEQxCzAJBgNVBAYTAkpQMQwwCgY
    DVQQKEwNJQk0xDDAKBgNV==
  </dsig:Object>
</invoice>
where the Object element is the one defined in the XML Signature namespace.
As you can see, the serialized element is first packaged into a MIME entity, but this packaging is redundant because in our application the plaintext is always known to be a canonicalized XML fragment, which always has the content type text/xml. The only reason we need this extra layer of packaging is that application/pkcs7-mime requires the plaintext to be a MIME entity. Introducing a new media type "application/pkcs7" that allows any octet sequence as the plaintext would simplify the situation.
Another limitation of the above solution is that it works for asymmetric key encryption only. As far as we understand, the "application/pkcs7-mime" media type permits only "enveloped-data" or "signed-data" as the content type of a PKCS#7 object, and therefore this scheme does not work for symmetric key encryption, which requires the "encrypted-data" content type. Unless application/pkcs7-mime allows "encrypted-data" as a PKCS#7 content type, we need a separate syntax for symmetric key encryption. The new media type application/pkcs7 discussed above would solve this problem as well if it allowed all the PKCS#7 content types, including "enveloped-data" and "encrypted-data."
Alternatively, we can define a simple XML syntax for symmetric key encryption. For example, the first XML document is encrypted as follows:
<invoice>
  <bookorder>...</bookorder>
  <payment>...</payment>
  <xenc:Object Algorithm="DES" IV="k0xDDAKBgNV==" Encoding="base64">
    DDAKBgNVBAsTA1RSTDEZMBcGA1UEAxMQSGlyb3NoaSBNYXJ1eWFtYTAeFw05OT
    EyMTcwMDM3MzRa
    Fw0wMDAzMTYwMDM3MzRaMEQxCzAJBgNVBAYTAkpQMQwwCgY
    DVQQKEwNJQk0xDDAKBgNV==
  </xenc:Object>
</invoice>
where the "Object" element should be defined in the namespace of XML encryption. This is the approach currently implemented in IBM's XML Security Suite [XSS4J].
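The embedding step for this simple syntax might look like the following sketch, which swaps a target element for an Object element carrying Base64 text plus the Algorithm, IV, and Encoding attributes. Several things here are assumptions: the namespace URI urn:example:xml-encryption is a placeholder (no XML encryption namespace is defined in this document), the function name replace_with_encrypted is hypothetical, and the ciphertext and IV are stand-in bytes rather than real DES output. This is not the XSS4J API.

```python
import base64
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only.
XENC_NS = "urn:example:xml-encryption"
ET.register_namespace("xenc", XENC_NS)


def replace_with_encrypted(parent, target_tag, ciphertext, iv, algorithm="DES"):
    """Replace parent's child target_tag with an xenc:Object element."""
    target = parent.find(target_tag)
    if target is None:
        raise ValueError("no such child element: " + target_tag)
    position = list(parent).index(target)
    obj = ET.Element(
        "{%s}Object" % XENC_NS,
        {
            "Algorithm": algorithm,
            "IV": base64.b64encode(iv).decode("ascii"),
            "Encoding": "base64",
        },
    )
    obj.text = base64.b64encode(ciphertext).decode("ascii")
    obj.tail = target.tail  # keep surrounding whitespace intact
    parent.remove(target)
    parent.insert(position, obj)
    return obj


invoice = ET.fromstring(
    "<invoice><bookorder>...</bookorder><payment>...</payment>"
    "<cardinfo><name>Hiroshi Maruyama</name></cardinfo></invoice>"
)
stand_in_ciphertext = bytes(range(16))  # not real cipher output
stand_in_iv = bytes(8)                  # DES uses an 8-byte IV
replace_with_encrypted(invoice, "cardinfo", stand_in_ciphertext, stand_in_iv)
serialized_invoice = ET.tostring(invoice, encoding="unicode")
print(serialized_invoice)
```

Keeping the element at its original position means the rest of the document, such as "bookorder" and "payment", stays readable and in order for parties that cannot decrypt the card information.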