- From: merlin <merlin@baltimore.ie>
- Date: Wed, 08 May 2002 16:17:32 +0100
- To: "Takeshi Imamura" <IMAMU@jp.ibm.com>
- Cc: xml-encryption@w3.org
Hi Takeshi,

It seems to me that there are three places where serialization and
parsing could occur, and two are bad. The okay one is the final step,
which is necessary if we assume node sets are immutable, and probably
a good idea regardless. The others are:

1) Serializing and Parsing each EncryptedData

As I tried to demonstrate in the quoted mail, it is possible to use
an XPath transform to take one EncryptedData and to select a subset
such that the node set represents an entirely different EncryptedData.
For a less contrived example, consider selecting an EncryptedData
element and the guts of an EncryptedKey within it.

While it would be technically feasible to parse this disjointed
EncryptedData purely from the node set, I think in practice few
implementations, if any, can. Instead, they will probably require
that the node set representation of the EncryptedData be serialized
and then parsed.

If we do this then the EncryptedData cannot use XPointer to refer to
elements outside itself, because they will be from different
documents. The URI #xpointer(/keys/Recipient) cannot be evaluated.

I therefore think we should *not* require that the EncryptedData
node set subset be serialized and parsed. As a side effect of this
choice, I think we should not consider the EncryptedData 'contained
in' the node set, but the EncryptedData 'identified by' the node set.

If any application needs to manufacture EncryptedData subsets, it can
explicitly insert a canonicalization transform and get the effect it
wants, with the explicit knowledge that XPointers will evaluate
relative to the canonicalized/parsed data.

2) Serializing, Wrapping and Parsing the entire Node Set X

Another issue is whether we serialize, wrap and parse (s/w/p) the
entire node set X. The subissues are whether we do this before we
start processing, and whether we support multiple phases of
decryption with the s/w/p step in between. I've already agreed that
we should do this at the end.
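To make the problem in 1) concrete, here is a minimal sketch. It
assumes Python's xml.etree.ElementTree and a cut-down, namespace-free
copy of the RetrievalMethod example from the quoted mail. The helper
resolve_same_document_ref is hypothetical and only mimics Id-based
dereferencing; it is not how any particular implementation works.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the RetrievalMethod example (no namespaces).
DOC = """<Document>
  <SignedData>
    <EncryptedData>
      <EncryptedKey>
        <RetrievalMethod URI="#recipient"/>
      </EncryptedKey>
    </EncryptedData>
  </SignedData>
  <KeyInfo>
    <X509Data Id="recipient"/>
  </KeyInfo>
</Document>"""


def resolve_same_document_ref(root, uri):
    """Hypothetical helper: find the element whose Id matches '#id'
    within the tree rooted at `root`, or return None."""
    target = uri[1:]  # drop the leading '#'
    for element in root.iter():
        if element.get("Id") == target:
            return element
    return None


original = ET.fromstring(DOC)
enc = original.find("./SignedData/EncryptedData")
uri = enc.find("./EncryptedKey/RetrievalMethod").get("URI")

# Dereferenced over the original tree, the key material is reachable.
found_in_original = resolve_same_document_ref(original, uri)

# Serialize and re-parse only the EncryptedData: it is now the root of
# a different document, and the reference has nothing to point at.
reparsed = ET.fromstring(ET.tostring(enc))
found_after_reparse = resolve_same_document_ref(reparsed, uri)

print(found_in_original is not None)  # reference resolves
print(found_after_reparse is None)    # reference is lost
```

The same reasoning applies to #xpointer(/keys/Recipient): once the
EncryptedData has been re-parsed on its own, there is no /keys in its
document to evaluate against.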
2a) S/W/P initially

If we s/w/p initially, then we eliminate problem number 1. However,
we then lose the ability for any EncryptedData to reference data that
were outside the node set, and we have the problem that XPointer
references from within and without the node set of the form /Foo/Bar
will not work; instead, they must be /dummy/Foo/Bar.

I also think that this makes our processing model really weird: If
the input is an octet stream, then it is parsed and Except URIs are
interpreted relative to the document root. That's pretty normal. But
if the input is a node set then it is swp'ed, and Except URIs are not
interpreted relative to the node set document, but instead relative
to a new document that's rooted by a dummy element. That, to me, is
just bad. It is ugly and unexpected.

I think we should process the input node set directly, with the
problems of 1) and the solution of 1). URIs into the node set and out
of the node set are interpreted in a completely useful and consistent
manner. The input to our transform is a node set; the output is an
octet stream or a node set from a different document that is rooted
by a dummy element.

Specifically, this also requires that we do all the decryptions in a
single phase, with none of the iteration that we do currently. The
iteration that we currently specify produces unpredictable effects in
the presence of same-document URIs.

2b) Multiple Encryption and S/W/P in between

This, to me, is just bad, and of no practical use. Phase one,
everything works as expected. Phase two, URIs are all now interpreted
relative to a completely different document with a completely
different structure.

Consider Except URI="#xpointer(/blah/EncryptedData)". During phase
one, this EncryptedData will be ignored. During phase two, the URI
would have to change to "#xpointer(/dummy/blah/EncryptedData)" to
keep matching; it does not, so the data will be decrypted, regardless
of the Except.
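The Except example in 2b) can be sketched as follows. This is only
an illustration under stated assumptions: it uses Python's
xml.etree.ElementTree, a made-up document with a /blah root, and a
hypothetical select() helper that stands in for evaluating an
absolute path such as the one inside #xpointer(...). It is not a real
XPointer processor.

```python
import xml.etree.ElementTree as ET

# Made-up input document for illustration only.
DOC = "<blah><EncryptedData>secret</EncryptedData><Other/></blah>"


def select(root, absolute_path):
    """Hypothetical stand-in for evaluating an absolute path like
    /blah/EncryptedData against the document rooted at `root`."""
    steps = absolute_path.strip("/").split("/")
    if steps[0] != root.tag:
        return []  # the path does not start at this document's root
    if len(steps) == 1:
        return [root]
    return root.findall("./" + "/".join(steps[1:]))


EXCEPT_PATH = "/blah/EncryptedData"

# Phase one: the Except path is evaluated against the original document.
phase_one_root = ET.fromstring(DOC)
excepted_phase_one = select(phase_one_root, EXCEPT_PATH)

# Between phases: serialize, wrap in a dummy element, and parse again.
serialized = ET.tostring(phase_one_root, encoding="unicode")
phase_two_root = ET.fromstring("<dummy>" + serialized + "</dummy>")

# Phase two: the same Except path no longer matches anything, so the
# EncryptedData would no longer be protected from decryption.
excepted_phase_two = select(phase_two_root, EXCEPT_PATH)

# Only a rewritten path still finds the element.
rewritten = select(phase_two_root, "/dummy" + EXCEPT_PATH)

print(len(excepted_phase_one), len(excepted_phase_two), len(rewritten))
```

The point of the sketch is that nothing in the signed Except URI
changes between phases, while the document it is evaluated against
does.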
I think we should support a single phase of decryption during which
all non-excepted EncryptedData elements identified by the input node
set are decrypted. URIs will be interpreted consistently, etc.

If someone multiple-encrypts an excepted EncryptedData (e.g., if they
just encrypt the entire input), then the outer EncryptedData will be
processed and the inner EncryptedData will be left, so things will
work as expected. If someone multiple-encrypts non-excepted data,
then they are just asking for trouble. Solving that problem breaks
too many other things.

I think I've said my piece on this matter. If no one else has an
opinion then let us leave the transform as-is and note, for the
record, my opinion that it appears to be inconsistent and
non-deterministic and should be reformulated as [1], superseding
what I may have said in the past.

Merlin

[1] http://lists.w3.org/Archives/Public/xml-encryption/2002May/0016.html

r/IMAMU@jp.ibm.com/2002.05.08/14:56:53
>
>Hi,
>
>>I'm not sure that it is an XPath WG question; it is an issue of
>>whether we interpret the input node set as identifying
>>EncryptedData elements or selecting EncryptedData structures,
>>and there are arguments in either direction.
>>
>>Though I'm loath to give it, my specific example is:
>><Document>
>>  <SignedData>
>>    <enc:EncryptedData>
>>      ..#1..
>>      <enc:EncryptionProperties>
>>        <enc:EncryptionProperty>
>>          <Foo>
>>            <enc:EncryptedData>
>>              ..#2..
>>            </enc:EncryptedData>
>>          </Foo>
>>        </enc:EncryptionProperty>
>>      </enc:EncryptionProperties>
>>    </enc:EncryptedData>
>>  </SignedData>
>></Document>
>>
>>Run through the XPath transform:
>>  self::enc:EncryptedData[not(ancestor::Foo)] or
>>  ancestor::Foo[not(self::enc:EncryptedData)]
>>(This may need tweaking for namespace nodes, etc.)
>>
>>The result is:
>>  <enc:EncryptedData>
>>    ..#2..
>>  </enc:EncryptedData>
>>
>>If we are to cater for cases like this, most implementations
>>will probably need to serialize and parse the node set.
>
>Maybe. Actually, our current implementation does serialization and
>parsing. But I think that it is possible to support this case without
>doing them.
>
>>Personally, I do not think that this is a use case worth
>>supporting. Requiring serialization rules out the following:
>><Document>
>>  <SignedData>
>>    <EncryptedData>
>>      <EncryptedKey>
>>        <RetrievalMethod URI="#recipient" />
>>      </EncryptedKey>
>>    </EncryptedData>
>>    <EncryptedData>
>>      <EncryptedKey>
>>        <RetrievalMethod URI="#recipient" />
>>      </EncryptedKey>
>>    </EncryptedData>
>>  </SignedData>
>>  ...
>>  <KeyInfo>
>>    <X509Data Id="recipient" />
>>  </KeyInfo>
>></Document>
>
>Whether serialization is required or not, considering the meaning of
>the same-document reference, the reference should be dereferenced over
>the original tree, I think.
>
>>I would argue that supporting the second type of document is
>>more important than the first, and we should therefore decrypt
>>the element e, without regard to what children are present in
>>the node set.
>
>I agree that supporting the second case is important, but sorry, I
>still don't understand why that is the rationale for decrypting e
>regardless of what children are present in the node-set. Could you
>explain this more?
>
>>More importantly, if an application actually wants to apply
>>strange XPath filters, then they can perform an explicit c14n
>>step and get exactly the behaviour that they desire.
>
>Thanks,
>Takeshi IMAMURA
>Tokyo Research Laboratory
>IBM Research
>imamu@jp.ibm.com
Received on Wednesday, 8 May 2002 11:17:39 UTC