- From: by way of Joseph Reagle <duerst@w3.org>
- Date: Mon, 3 Dec 2001 10:03:20 -0500
- To: xml-encryption@w3.org
Dear XML Encryption Editors and Working Group,

I'm sending you some (late, sorry) last call comments on your documents. These are internationalization comments, but I'm currently sending them in as an individual. I expect the I18N WG will look at them at their upcoming teleconf on Tuesday, which might result in some tweaks, but probably not big changes. I'm adding some non-i18n points at the end; these are personal only and I don't expect the I18N WG to discuss them.

The syntax/processing is basically right (in the sense that XML is serialized using UTF-8). However, there is no corresponding requirement for it, and there are none of the details, 'health warnings' and security warnings that we worked out for the XML Signature spec and that I would have expected to be reused. In detail:

For the Requirements doc at:
http://www.w3.org/TR/2001/WD-xml-encryption-req-20011018

- There should be a requirement that says that encryption should work (in the sense that you get the original stuff back after decryption) under Infoset-preserving transformations of the XML that contains the encrypted pieces. [This makes sure that when encrypting XML, it has to be in a defined encoding (as it currently is).]

For the Syntax/Processing doc at:
http://www.w3.org/TR/2001/WD-xml-encryption-req-20011018

- There needs to be a requirement to use NFC when converting from a legacy encoding to UTF-8 when encrypting. This should be very much the same as in XML Signature, Section 6.5 (http://www.w3.org/TR/xmldsig-core/#sec-c14nAlg), last two paragraphs. There should also be something like the last paragraph before section 7.1 (http://www.w3.org/TR/xmldsig-core/#sec-XML-Canonicalization).

- In section 4.2 Decryption, in step 4.3, the wording 'replace' ... 'by the UTF-8 encoded characters' may easily be misunderstood.
  After decryption, there will be a byte stream with characters encoded in UTF-8, but the replacement operation has to make sure that the appropriate character encoding conversion (transcoding) is applied. As an example, if the decrypted element or element content is inserted into a DOM, there has to be a conversion from UTF-8 to UTF-16. This should be made clear.

- There needs to be some text about security risks associated with UTF-8. Assume that somebody knows that the encrypted text is Old Italic (http://www.unicode.org/charts/PDF/U10300.pdf, no spaces or punctuation). In this case, UTF-8 uses four bytes per character, and three of them are always the same, and the top two (or three if there are no numbers) bits of the last byte are also always the same. There is probably some chance that this will make it easier to break the encryption. This should clearly be mentioned in the Security section, perhaps with advice that compression can help (but then it has to be possible to apply compression before encryption; it might be good to have an example of this). I'm not expert enough to assess the exact extent of this risk, so please use your expertise in this field. There are other, somewhat less extreme examples than Old Italic, but the point is the same.

- URI -> anyURI/IRI: According to the Character Model, http://www.w3.org/TR/charmod/#sec-URIs, you have to make sure that wherever you use URIs, non-ASCII characters are allowed, and that conversion to ASCII only is done as late as possible. You already have this right in the Schema, by using anyURI, but you should make it clear in the text.

- In 2.2.1, 'media type URI' is mentioned, but there is neither an explanation nor a reference. In addition, it would be good to check/explain that this can include parameters (such as charset).

========== non-i18n points from here down ===================

Major point:

- 2.1.5 forbids the encryption of only part of EncryptedData or EncryptedKey.
  I don't see any particular reason for forbidding this, except to make some XML Schema issues easier. But I think it would be extremely valuable for the WG and the spec to do this exercise and to show how the Schema has to be changed to allow this. This is important because allowing encryption in places where it's not provided for in some existing schema is something that applications using the spec will have to do a lot, and it's a good thing to work out (some of) the details. Even if this is not changed for EncryptedData or EncryptedKey, there should be an extended discussion of how to change a schema to work with encryption.

Small points:

- 'Bank of the Internet' should be changed to 'Example Bank'.

- In 2.1.4, change 'octet set' to 'octet sequence'.

- I think using 'www.isi.edu' is rather outdated for IANA URIs.

- Citing the obsolete RFC 1738 will confuse many people.

- Reference XML-MT is obsoleted by RFC 3023.

- In the 'Schema definition' at the start of 3., there are entities p and s defined, but they never get used. There is also a spurious &xenc; in 2.2.2.

- In the first paragraph of 4.3, "that octets' semantics" isn't very clear. There seems to be a reference, but it's not clear to what. Octets as such don't really have any semantics anyway.

- 5.2.1 and others: please change the space after "<EncryptionMethod" to a line break to increase the chance that the identifier is complete in printouts.

Regards, Martin.
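
As a rough illustration of the UTF-8 redundancy point above, here is a minimal Python sketch. It assumes the Old Italic letters occupy U+10300 to U+1031E, as in the chart referenced above. The helper name redundancy_report and the choice of zlib as the compressor are assumptions made for illustration only; neither comes from the specs under review, and the sketch says nothing about how much the redundancy actually weakens any particular cipher.

```python
import zlib

# Old Italic letters only (assumed range U+10300..U+1031E; no digits,
# spaces, or punctuation), matching the scenario described above.
letters = [chr(cp) for cp in range(0x10300, 0x1031F)]
encoded = [ch.encode("utf-8") for ch in letters]

# Distinct three-byte prefixes across all encoded letters.
prefixes = {e[:3] for e in encoded}

# Distinct values of the top three bits of the fourth (last) byte.
top3 = {e[3] >> 5 for e in encoded}


def redundancy_report(text: str) -> dict:
    """Hypothetical helper: UTF-8 size before and after zlib compression."""
    raw = text.encode("utf-8")
    return {
        "utf8_bytes": len(raw),
        "compressed_bytes": len(zlib.compress(raw, 9)),
    }


# A synthetic plaintext made only of Old Italic letters.
sample = "".join(letters) * 20
report = redundancy_report(sample)

print("bytes per letter:", {len(e) for e in encoded})
print("shared prefixes:", [p.hex() for p in prefixes])
print("top three bits of last byte:", [bin(b) for b in top3])
print(report)
```

Every letter encodes to four bytes with the same three-byte prefix, so an attacker who knows the script knows most of the plaintext bits in advance. Compressing before encrypting removes much of that fixed structure, which is why the order of the two steps matters.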
Received on Monday, 3 December 2001 10:03:22 UTC