Re: Decryption Transform processing question

Hi Takeshi,

It seems to me that there are three places where serialization
and parsing could occur, and two are bad. The okay one is
the final step, which is necessary if we assume node sets are
immutable, and probably a good idea regardless. The others are:

1) Serializing and Parsing each EncryptedData

As I tried to demonstrate in the quoted mail, it is possible
to use an XPath transform to take one EncryptedData and
to select a subset such that the node set represents an
entirely different EncryptedData. For a less contrived
example, consider selecting an EncryptedData element and
the guts of an EncryptedKey within it. While it would be
technically feasible to parse this disjointed EncryptedData
purely from the node set, I think in practice no (or few)
implementations can. Instead, they will probably require that
the node set representation of the EncryptedData be serialized
and then parsed.

If we do this then the EncryptedData cannot use XPointer to
refer to elements outside itself because they will be from
different documents. The URI #xpointer(/keys/Recipient)
cannot be evaluated.
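
To make the failure concrete, here is a minimal Python sketch
using only the standard library. The document shape and the helper
find_recipient are assumptions for illustration; the helper is a
simplified stand-in for real XPointer evaluation, not something
any implementation is known to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the key material lives outside EncryptedData.
DOC = """<keys>
  <Recipient>alice</Recipient>
  <EncryptedData>
    <RetrievalMethod URI="#xpointer(/keys/Recipient)"/>
  </EncryptedData>
</keys>"""


def find_recipient(root):
    """Simplified stand-in for evaluating #xpointer(/keys/Recipient)."""
    if root.tag != "keys":
        return None
    return root.find("Recipient")


original = ET.fromstring(DOC)
encrypted = original.find("EncryptedData")

# Serialize the EncryptedData on its own and parse it back:
# the result is the root of a brand-new document.
reparsed = ET.fromstring(ET.tostring(encrypted))

in_original = find_recipient(original)  # resolved against the original tree
in_reparsed = find_recipient(reparsed)  # resolved against the reparsed tree
```

Against the original tree the lookup finds Recipient; against the
reparsed EncryptedData it finds nothing, because Recipient belongs
to a different document.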

I therefore think we should *not* require that the
EncryptedData node set subset be serialized and parsed. As a
side effect of this choice, I think we should not consider
the EncryptedData 'contained in' the node set, but the
EncryptedData 'identified by' the node set.  If any application
has the need to manufacture EncryptedData subsets then they
can explicitly insert a canonicalization transform and they
will get the effect they want, with the explicit knowledge that
XPointers will evaluate relative to the canonicalized/parsed
data.

2) Serializing, Wrapping and Parsing the entire Node Set X

Another issue is whether we serialize, wrap and parse (s/w/p)
the entire node set X. The sub-issues are whether we do this
before we start processing, and whether we support multiple
phases of decryption with an s/w/p step in between. I've
already agreed that we should do this at the end.

2a) S/W/P initially

If we s/w/p initially, then we eliminate problem number 1.
However, we then lose the ability for any EncryptedData to
reference data that were outside the node set, and we have
the problem that XPointer references from within and without
the node set of the form /Foo/Bar will not work; instead,
they must be /dummy/Foo/Bar.

I also think that this makes our processing model really weird:
If the input is an octet stream, then it is parsed and Except
URIs are interpreted relative to the document root. That's
pretty normal. But if the input is a node set, then it is
s/w/p'ed, and Except URIs are interpreted relative not to
the node set's document but to a new document that is
rooted by a dummy element. That, to me, is just bad. It is
ugly and unexpected.
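
The path shift can be sketched in a few lines of Python. The
helpers resolve and serialize_wrap_parse are hypothetical names
introduced here; resolve handles only plain child steps and is not
a real XPointer evaluator.

```python
import xml.etree.ElementTree as ET


def resolve(root, path):
    """Resolve a simple absolute path such as /Foo/Bar against a root.

    Only plain child steps are handled; this is an illustration only.
    """
    steps = path.strip("/").split("/")
    if not steps or steps[0] != root.tag:
        return None
    node = root
    for step in steps[1:]:
        node = node.find(step)
        if node is None:
            return None
    return node


def serialize_wrap_parse(root):
    """Serialize the tree, wrap it in a dummy element, and parse it back."""
    wrapped = b"<dummy>" + ET.tostring(root) + b"</dummy>"
    return ET.fromstring(wrapped)


original = ET.fromstring("<Foo><Bar>payload</Bar></Foo>")
swp = serialize_wrap_parse(original)

before = resolve(original, "/Foo/Bar")     # works on the original document
after = resolve(swp, "/Foo/Bar")           # no longer matches after s/w/p
shifted = resolve(swp, "/dummy/Foo/Bar")   # the path the author must now write
```

The same reference works before s/w/p and fails after it, unless
the author already knows about the dummy root.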

I think we should process the input node set directly, with
the problems of 1) and the solution of 1). URIs into the node
set and out of the node set are interpreted in a completely
useful and consistent manner. The input to our transform is
a node set; the output is an octet stream or a node set from
a different document that is rooted by a dummy element.

Specifically, this also requires that we do all the
decryptions in a single phase; none of the iteration that
we do currently. The random iteration that we currently
specify produces entirely random effects in the presence of
same-document URIs.

2b) Multiple Encryption and S/W/P in between

This, to me, is just bad, and of no practical use. Phase one,
everything works as expected. Phase two, URIs are all now
interpreted relative to a completely different document with
a completely different structure.

Consider Except URI="#xpointer(/blah/EncryptedData)".
During phase one, this EncryptedData will be ignored.
During phase two, the URI will have to change to
"#xpointer(/dummy/blah/EncryptedData)"; so, the data will be
decrypted, regardless of the Except.
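
A rough Python sketch of that effect follows. The helpers resolve
and to_decrypt are hypothetical, the Except URI is reduced to a
plain child path, and namespaces are omitted; the sketch shows only
which elements would be selected, not actual decryption.

```python
import xml.etree.ElementTree as ET


def resolve(root, path):
    """Resolve a simple absolute child path against a document root."""
    steps = path.strip("/").split("/")
    if not steps or steps[0] != root.tag:
        return None
    node = root
    for step in steps[1:]:
        node = node.find(step)
        if node is None:
            return None
    return node


def to_decrypt(root, except_path):
    """List the EncryptedData elements not protected by the Except path."""
    excepted = resolve(root, except_path)
    return [e for e in root.iter("EncryptedData") if e is not excepted]


EXCEPT = "/blah/EncryptedData"

# Phase one sees the original document.
phase_one = ET.fromstring("<blah><EncryptedData>x</EncryptedData></blah>")
# Phase two sees the serialized, wrapped and reparsed document.
phase_two = ET.fromstring(b"<dummy>" + ET.tostring(phase_one) + b"</dummy>")

first = to_decrypt(phase_one, EXCEPT)    # excepted element is skipped
second = to_decrypt(phase_two, EXCEPT)   # Except no longer resolves
```

In phase one nothing is selected for decryption; in phase two the
same Except path resolves to nothing, so the supposedly excepted
element is selected.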

I think we should support a single phase of decryption during
which all non-excepted EncryptedData elements identified by
the input node set are decrypted. URIs will be interpreted
consistently, etc.

If someone multiple-encrypts an excepted EncryptedData (e.g.,
if they just encrypt the entire input), then the outer
EncryptedData will be processed and the inner EncryptedData
will be left alone, so things will work as expected.

If someone multiple-encrypts non-excepted data, then they are
just asking for trouble. Solving that problem breaks too many
other things.


I think I've said my piece on this matter. If no one else has
an opinion then let us leave the transform as-is and note,
for the record, my opinion that it appears to be inconsistent
and non-deterministic and should be reformulated as [1] in
supersedence of what I may have said in the past.

Merlin

[1] http://lists.w3.org/Archives/Public/xml-encryption/2002May/0016.html

r/IMAMU@jp.ibm.com/2002.05.08/14:56:53
>
>Hi,
>
>>I'm not sure that it is an XPath WG question; it is an issue of
>>whether we interpret the input node set as identifying
>>EncryptedData elements or selecting EncryptedData structures,
>>and there are arguments in either direction.
>>
>>Though I'm loath to give it, my specific example is:
>><Document>
>>  <SignedData>
>>    <enc:EncryptedData>
>>      ..#1..
>>      <enc:EncryptionProperties>
>>        <enc:EncryptionProperty>
>>          <Foo>
>>            <enc:EncryptedData>
>>              ..#2..
>>            </enc:EncryptedData>
>>          </Foo>
>>        </enc:EncryptionProperty>
>>      </enc:EncryptionProperties>
>>    </enc:EncryptedData>
>>  </SignedData>
>></Document>
>>
>>Run through the XPath transform:
>>  self::enc:EncryptedData[not(ancestor::Foo)] or
>>  ancestor::Foo[not(self::enc:EncryptedData)]
>>(This may need tweaking for namespace nodes, etc.)
>>
>>The result is:
>>  <enc:EncryptedData>
>>    ..#2..
>>  </enc:EncryptedData>
>>
>>If we are to cater for cases like this, most implementations
>>will probably need to serialize and parse the node set.
>
>Maybe.  Actually, our current implementation does serialization and
>parsing.  But I think that it is possible to support this case without
>doing them.
>
>>Personally, I do not think that this is a use case worth
>>supporting. Requiring serialization rules out the following:
>><Document>
>>  <SignedData>
>>    <EncryptedData>
>>      <EncryptedKey>
>>        <RetrievalMethod URI="#recipient" />
>>      </EncryptedKey>
>>    </EncryptedData>
>>    <EncryptedData>
>>      <EncryptedKey>
>>        <RetrievalMethod URI="#recipient" />
>>      </EncryptedKey>
>>    </EncryptedData>
>>  </SignedData>
>>  ...
>>  <KeyInfo>
>>    <X509Data Id="recipient" />
>>  </KeyInfo>
>></Document>
>
>Whether serialization is required or not, considering the meaning of the
>same-document reference, the reference should be dereferenced over the
>original tree, I think.
>
>>I would argue that supporting the second type of document is
>>more important than the first, and we should therefore decrypt
>>the element e, without regard to what children are present in
>>the node set.
>
>I agree that supporting the second case is important, but sorry, I don't
>still understand why it is the rationale of decrypting e regardless of what
>children are present in the node-set.  Could you explain this more?
>
>>More importantly, if an application actually wants to apply
>>strange XPath filters, then they can perform an explicit c14n
>>step and get exactly the behaviour that they desire.
>
>Thanks,
>Takeshi IMAMURA
>Tokyo Research Laboratory
>IBM Research
>imamu@jp.ibm.com
>
>



Received on Wednesday, 8 May 2002 11:17:39 UTC