- From: Joseph Ashwood <jashwood@arcot.com>
- Date: Thu, 21 Dec 2000 15:04:00 -0800
- To: <xml-encryption@w3.org>
[comments are post quote]

----- Original Message -----
From: "Martin J. Duerst" <duerst@w3.org>
To: <xml-encryption@w3.org>
Cc: <w3c-i18n-ig@w3.org>
Sent: Wednesday, December 20, 2000 1:39 AM
Subject: Fwd: Re: schema validity in encryption

> >>This would change
> >>   <element>Clear text here.</element>
> >>to
> >>   <element>ScRaMbLeD TeXt HeRe</element>
> >>yes? While this may work technically (it will validate), I have
> >>serious problems with such an approach. The markup is now actually
> >>completely wrong. What was an <element> is still called an <element>,
> >>but it's not an <element> anymore, it's an <encodedElement>. The
> >>original Markup has been misused. This can be seen as a problem
> >>of markup philosophy (or whatever you call it) but can also lead
> >>to very serious practical problems. If the document is received
> >>as is, and by accident or whatever the separate information in
> >>an external document is lost (very easy to happen), the encoded
> >>information will be taken as the real information, with very
> >>bad consequences.

I'd like to continue this a bit. I personally see no reason to rename an element simply because it is encrypted. That holds in every context I can think of, specifically all those that are well formed for security (signed before encryption, or authenticated before encryption w/ authenticated keys). The encryption is obvious from context: if it is dictated that the content of <whatever> must be decrypted before it is accessed, there is no ambiguity.

Additionally, I am particularly fond of using tags that add as little space as possible to the document, so I would prefer <CCNum> to <EncryptedCCNum>. Since I generally work with files that have an absurd number of same-tagged entries (similar to a credit card billing list of several million people), the savings in size can be significant.
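To put a rough number on the tag-size point, here is a small Python sketch. The entry count of 5 million is an assumed stand-in for "several million", the function name is made up for illustration, and a single-byte encoding of the tag names is assumed.

```python
def tag_overhead_bytes(short_tag: str, long_tag: str, entries: int) -> int:
    """Extra bytes spent when every element uses long_tag instead of short_tag.

    The tag name appears twice per element (opening and closing tag),
    and each character is assumed to occupy one byte.
    """
    per_element = 2 * (len(long_tag) - len(short_tag))
    return per_element * entries


# Hypothetical entry count; the text only says "several million".
ASSUMED_ENTRIES = 5_000_000

saved = tag_overhead_bytes("CCNum", "EncryptedCCNum", ASSUMED_ENTRIES)
print(f"{saved / 1e6:.0f} MB saved across {ASSUMED_ENTRIES:,} entries")
```

Under these assumptions the shorter name saves 18 bytes per element, and the total scales linearly with the number of entries.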
This is also my reason for wanting a semi-enforced tag->encryption mapping that separates the two. I've noticed that the proposals seem to keep coming back to putting the decryption information as close to the encrypted data as possible. For small files, or files with diverse tags that are each used a small number of times, this is the most reasonable mode of operation. However, given the amount of processing power that must go into the public key operations needed to exchange those keys securely, it gets extremely expensive. For timing examples, please take a look at http://www.eskimo.com/~weidai/benchmarks.html

Just as an example, let's assume a credit card company with 500 million customers who need to be billed each month, with spending tracked etc. (assume Visa for the sake of argument). It seems reasonable for them to want to separate the cardholder name, card number, and billing address, with each of the 3 encrypted to a separate key. It is also reasonable that they will have a large number of employees, say 6000, who will have access to varying subsets of those fields. Assume the decryption data is housed directly with the encrypted data, that 3000 employees have access to each of the fields (there is significant overlap), and that 2048-bit RSA keys are used.

Adding 1 credit card entry will then require 9000 public key operations (3 fields * 3000 employees), which will completely dominate any other factors involved. This means .89*9000 milliseconds, or about 8 seconds. For billing it gets much worse: billing each month requires 500 million decryptions * 3 fields, or 64.13*1,500,000,000 milliseconds, which is about 1100 days. Even if only the credit card number is encrypted, that is still 300+ days. The storage factor is just as dominant: assuming there is no other information, each record will occupy 2048*3*3000 bits of disk, or 500,000,000*2048*3*3000 bits total, which is ~1 million gigabytes.
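The estimate for keys stored alongside the data can be reproduced with a short Python sketch. The timings are the figures quoted above, read here as .89 ms per 2048-bit RSA public key operation and 64.13 ms per private key operation; the constant and function names are illustrative only.

```python
# Figures quoted in the text (milliseconds per 2048-bit RSA operation).
RSA_PUBLIC_OP_MS = 0.89    # wrapping a key for one reader
RSA_PRIVATE_OP_MS = 64.13  # unwrapping a key
RSA_BITS = 2048

CARDS = 500_000_000
FIELDS = 3
READERS_PER_FIELD = 3000

MS_PER_DAY = 24 * 60 * 60 * 1000


def inline_key_costs() -> dict:
    """Costs when every record carries one wrapped key per field per reader."""
    wraps_per_card = FIELDS * READERS_PER_FIELD
    storage_bits = CARDS * wraps_per_card * RSA_BITS
    return {
        "wraps_per_card": wraps_per_card,
        "add_one_card_s": wraps_per_card * RSA_PUBLIC_OP_MS / 1000,
        "billing_days_all_fields": CARDS * FIELDS * RSA_PRIVATE_OP_MS / MS_PER_DAY,
        "billing_days_one_field": CARDS * RSA_PRIVATE_OP_MS / MS_PER_DAY,
        "storage_gb": storage_bits / 8 / 1e9,  # decimal gigabytes
    }


for name, value in inline_key_costs().items():
    print(f"{name}: {value:,.2f}")
```

The sketch counts only the public key work and the wrapped-key storage, which is the part the argument says dominates; disk access and symmetric encryption are ignored.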
I personally consider these demands to be far beyond what should be required of any company, and I don't want to force them to use a custom design which they will have to create, test, and support in house or hire out.

Suppose instead the mapping is done such that each of those 6000 employees is given selective knowledge of the Rijndael encryption keys, with otherwise the same assumptions. Adding 1 credit card takes 3 public key operations (to recover the encryption keys), 3 keyings of Rijndael, and 3-9 block encryptions with Rijndael. That's 192 milliseconds + very small amounts, or about a fifth of a second. For billing there are 3 public key operations, 3 Rijndael keyings, and 1500-4500 million bits of Rijndael decryptions; that's a few seconds. The space is much more reasonable: assuming there is no other information, it is approximately 3000*3*2048 bits for the encrypted decryption keys (at most, and there are likely to be ways to optimize that), plus 3 blocks of Rijndael per field, or 384 bits per field, times 3 fields per card, times 500 million cards, which is a mere 68 gigabytes. This way the computation is likely to be dominated by the disk/database access that is involved in such operations.

It is for this reason that I am personally strongly for separating the keying data from the encrypted data. I am sure I am not the only person who will be dealing with this kind of situation, and these are not small differences in compute or storage area.

Joe
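For reference, the separated-key estimate can be reproduced with a similar Python sketch. It assumes 64.13 ms per 2048-bit RSA private key operation (the figure used above), 128-bit Rijndael blocks (so 3 blocks make 384 bits), and 3 blocks per field; the names are illustrative only.

```python
RSA_PRIVATE_OP_MS = 64.13  # figure quoted in the text, per key recovery
RSA_BITS = 2048
BLOCK_BITS = 128           # assumed Rijndael block size (384 bits / 3 blocks)

CARDS = 500_000_000
FIELDS = 3
READERS_PER_FIELD = 3000
BLOCKS_PER_FIELD = 3


def separated_key_costs() -> dict:
    """Costs when wrapped keys are stored once, apart from the encrypted data."""
    key_table_bits = READERS_PER_FIELD * FIELDS * RSA_BITS
    data_bits = CARDS * FIELDS * BLOCKS_PER_FIELD * BLOCK_BITS
    total_bytes = (key_table_bits + data_bits) // 8
    return {
        "key_recovery_ms": FIELDS * RSA_PRIVATE_OP_MS,  # once per session
        "key_table_bytes": key_table_bits // 8,
        "total_gb": total_bytes / 1e9,     # decimal gigabytes
        "total_gib": total_bytes / 2**30,  # binary gigabytes
    }


for name, value in separated_key_costs().items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the key table is only a few megabytes and the total comes to roughly 67 GiB (72 GB in decimal units), in line with the estimate above. The public key cost is a fixed 192 ms or so, regardless of how many records are processed.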
Received on Thursday, 21 December 2000 18:30:13 UTC