RE: Encryption Subset Scenario from Dournaee, Blake on 2002-05-16 (xml-encryption@w3.org from May 2002)

From: Dournaee, Blake <bdournaee@rsasecurity.com>
Date: Thu, 16 May 2002 10:47:49 -0700
To: xml-encryption@w3.org
Cc: "Hammond, Ben" <bhammond@rsasecurity.com>
Message-ID: <E7B6CB80230AD31185AD0008C7EBC4D202A1BF98@exrsa01.rsa.com>
Ed, Jiandong.

Thank you for your comments and insight.

Ed, I agree with you on the transform issue. This is definitely one way
around the problem, but it is not part of the standard and hence, not
interoperable.

Jiandong - I don't think I understand your objection. I think the original
problem contained an (unstated) assumption that the XML document subset is
contiguous. I think I see what you meant - you were concerned about empty
text nodes in between, perhaps? If we encrypt a contiguous document subset,
is there a plaintext replacement problem upon encryption or decryption that
I'm not seeing?

I think I might be looking for a Type value that has "document subset"
semantics, or Type='http://www.w3.org/2001/04/xmlenc#DocumentSubset'. I am
of course just throwing this out there for the sake of argument at this
point. There may be good reasons *not* to have this.

Consider Document D below:

<doc>
	Mary
	<elem1> Had </elem1>
	<elem2> A little </elem2>
	<![CDATA[ lamb for dinner ]]>
	<!-- This is a SECRET comment -->	 
	<cleartext>
		This information should not be encrypted
	</cleartext>	
</doc>

Now, suppose I have a single encryption key E_k, that I want to use to
encrypt a document subset. Further, I want the same algorithm and parameters
for everything.

The document subset that I want to encrypt is everything but the element
<cleartext> and its contents. The normative (soon to be!) solution, using a
single encryption key, might look like this:

<doc>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Content'...>...</EncryptedData>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Element'...>...</EncryptedData>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Element'...>...</EncryptedData>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Content'...>...</EncryptedData>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Content'...>...</EncryptedData>
	<cleartext>
		This information should not be encrypted
	</cleartext>	
</doc>


I'm fairly sure the comment and CDATA sections get treated as type
"Content". In any case, this seems to be a less than optimal solution in
terms of wasted space because a good deal of redundant information is
included. I also believe that from an implementation point of view it will
cause unneeded overhead (yes, this is out of scope!), but I want to mention
it anyway :)

Consider what happens when the document subset to be encrypted grows to
hundreds or thousands of lines: the redundancy surely adds up as the scale
increases.
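
A minimal sketch, assuming Python's standard xml.dom.minidom parser, of why
Document D breaks into five separately encrypted pieces under the existing
Element and Content types. The function name pieces_before and the labels it
returns are illustrative only; nothing here comes from the XML Encryption
specification.

```python
from xml.dom import minidom, Node

# Document D from the message, reproduced as a string.
DOCUMENT_D = """<doc>
  Mary
  <elem1> Had </elem1>
  <elem2> A little </elem2>
  <![CDATA[ lamb for dinner ]]>
  <!-- This is a SECRET comment -->
  <cleartext>
    This information should not be encrypted
  </cleartext>
</doc>"""

# Illustrative labels for the DOM node types that occur in Document D.
KIND = {
    Node.TEXT_NODE: "text",
    Node.ELEMENT_NODE: "element",
    Node.CDATA_SECTION_NODE: "cdata",
    Node.COMMENT_NODE: "comment",
}


def pieces_before(xml_text, stop_tag):
    """List the kinds of the root's children that precede <stop_tag>.

    Whitespace-only text nodes between siblings are skipped, since they
    carry no secret content.
    """
    root = minidom.parseString(xml_text).documentElement
    kinds = []
    for node in root.childNodes:
        if node.nodeType == Node.ELEMENT_NODE and node.tagName == stop_tag:
            break
        if node.nodeType == Node.TEXT_NODE and not node.data.strip():
            continue
        kinds.append(KIND.get(node.nodeType, "other"))
    return kinds


if __name__ == "__main__":
    # One entry per node that would need its own EncryptedData element.
    print(pieces_before(DOCUMENT_D, "cleartext"))
```

Each entry in the returned list corresponds to one sibling that would need
its own EncryptedData wrapper when no subset type is available.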

Why not have something like this instead?

<doc>
	<EncryptedData Type='http://www.w3.org/2001/04/xmlenc#DocumentSubset'...>...</EncryptedData>
	<cleartext>
		This information should not be encrypted
	</cleartext>	
</doc>


Not only is this more space efficient, but it preserves the semantics of
what is encrypted as an XML document subset instead of just plain text. This
argument really only holds water if it is *useful* to preserve the document
subset as an "XML document subset" and not just octets.
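
To make the replacement question concrete, here is a rough sketch of how a
contiguous run of siblings could be collapsed into one element and later
restored. Everything in it is an assumption made for illustration: the
DocumentSubset Type URI is only the proposal from this message, base64 stands
in for a real cipher (it provides no confidentiality), namespaces and the
other EncryptedData children are omitted, and the function names are made up.

```python
import base64
from xml.dom import minidom

# Hypothetical Type URI from the proposal; XML Encryption does not define it.
SUBSET_TYPE = "http://www.w3.org/2001/04/xmlenc#DocumentSubset"

SAMPLE = """<doc>
  Mary
  <elem1> Had </elem1>
  <!-- This is a SECRET comment -->
  <cleartext>This information should not be encrypted</cleartext>
</doc>"""


def sibling_run(first, last):
    """Collect the contiguous siblings from first to last, inclusive."""
    run, node = [], first
    while node is not None:
        run.append(node)
        if node is last:
            return run
        node = node.nextSibling
    raise ValueError("last is not a following sibling of first")


def encrypt_subset(first, last):
    """Replace the run first..last with a single EncryptedData element."""
    parent, doc = first.parentNode, first.ownerDocument
    run = sibling_run(first, last)
    octets = "".join(n.toxml() for n in run).encode("utf-8")
    # Placeholder "cipher": base64 only. A real implementation encrypts here.
    cipher_text = base64.b64encode(octets).decode("ascii")
    enc = doc.createElement("EncryptedData")
    enc.setAttribute("Type", SUBSET_TYPE)
    data = enc.appendChild(doc.createElement("CipherData"))
    value = data.appendChild(doc.createElement("CipherValue"))
    value.appendChild(doc.createTextNode(cipher_text))
    parent.insertBefore(enc, first)
    for node in run:
        parent.removeChild(node)
    return enc


def decrypt_subset(enc):
    """Replace an EncryptedData element with the nodes it stood for."""
    parent, doc = enc.parentNode, enc.ownerDocument
    cipher_text = enc.getElementsByTagName("CipherValue")[0].firstChild.data
    fragment = base64.b64decode(cipher_text).decode("utf-8")
    # The subset is not well-formed on its own, so parse it inside a
    # throwaway wrapper element and import its children one by one.
    wrapper = minidom.parseString("<w>" + fragment + "</w>").documentElement
    for child in list(wrapper.childNodes):
        parent.insertBefore(doc.importNode(child, True), enc)
    parent.removeChild(enc)


if __name__ == "__main__":
    root = minidom.parseString(SAMPLE).documentElement
    cleartext = root.getElementsByTagName("cleartext")[0]
    enc = encrypt_subset(root.firstChild, cleartext.previousSibling)
    print(root.toxml())
    decrypt_subset(enc)
    print(root.toxml())
```

Because the run is contiguous, the single placeholder element marks exactly
where the decrypted nodes go back, which is the case the unstated assumption
above was meant to cover. A non-contiguous subset would need extra position
information that this sketch does not attempt.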

Ed's solution (an out-of-scope XPath transform or equivalent) works, but it
is not written into the standard. Also, as we're learning in XML Dsig, XPath
can be messy and slow... :)

The other solution is to force the plaintext creator to do something like
this:

<doc>
	<secureSection>
   	 Mary 
	 <elem1> Had </elem1>
	 <elem2> A little </elem2>
	 <![CDATA[ lamb for dinner ]]>
	 <!-- This is a SECRET comment -->	 
	</secureSection>
	<cleartext>
		This information should not be encrypted
	</cleartext>	
</doc>

This makes the whole problem go away, as we can now just use
Type="...#Content". This, however, may not be practical in the real world
if someone has a poorly designed XML markup language (XML "Application" or
whatever you want to call it).
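
When the markup language cannot be changed, the same wrapping could be done
mechanically just before encryption and undone just after decryption. The
sketch below assumes Python's xml.dom.minidom; the helper names wrap_siblings
and unwrap are hypothetical, and the wrapper tag is simply the one used in
the example above.

```python
from xml.dom import minidom

SAMPLE = """<doc>
  Mary
  <elem1> Had </elem1>
  <elem2> A little </elem2>
  <cleartext>This information should not be encrypted</cleartext>
</doc>"""


def wrap_siblings(first, last, tag="secureSection"):
    """Move the contiguous siblings first..last under a new <tag> element."""
    parent, doc = first.parentNode, first.ownerDocument
    # Collect the run first so a bad argument cannot leave the tree half-moved.
    run, node = [], first
    while node is not None:
        run.append(node)
        if node is last:
            break
        node = node.nextSibling
    else:
        raise ValueError("last is not a following sibling of first")
    wrapper = doc.createElement(tag)
    parent.insertBefore(wrapper, first)
    for node in run:
        wrapper.appendChild(node)  # minidom detaches the node from parent
    return wrapper


def unwrap(wrapper):
    """Move the wrapper's children back in place and drop the wrapper."""
    parent = wrapper.parentNode
    while wrapper.firstChild is not None:
        parent.insertBefore(wrapper.firstChild, wrapper)
    parent.removeChild(wrapper)


if __name__ == "__main__":
    root = minidom.parseString(SAMPLE).documentElement
    cleartext = root.getElementsByTagName("cleartext")[0]
    section = wrap_siblings(root.firstChild, cleartext.previousSibling)
    print(root.toxml())  # the content of <secureSection> can now be encrypted
    unwrap(section)
    print(root.toxml())
```

The wrapper element itself stays in the clear under Type="...#Content", so
a schema-validating consumer would still have to tolerate it.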

Can anyone articulate some good arguments against a DocumentSubset Type? Are
there processing rules or plaintext replacement issues that I am not seeing
here? Does this bias the XML Encryption spec towards a DOM tree based view?
(I don't think so, but I could be wrong; an XML document subset is just a
subset, regardless of how the document is modeled.)



Regards,


Blake Dournaee
Toolkit Applications Engineer
RSA Security
 
"The only thing I know is that I know nothing" - Socrates
 
 


-----Original Message-----
From: Ed Simon [mailto:edsimon@xmlsec.com]
Sent: Thursday, May 16, 2002 8:21 AM
To: Jiandong Guo; Dournaee, Blake
Cc: xml-encryption@w3.org; Hammond, Ben; edsimom@xmlsec.com
Subject: Re: Encryption Subset Scenario


If one needs to combine adjacent elements before encrypting so that one ends
up with a single <EncryptedData> element, then one can always use a
transform (eg. XSLT) just before encryption and just after decryption.  Note
that these transforms fall outside the core encryption and decryption
processes and are NOT related to the <Transforms> element in XML Encryption.

Note that the above technique doesn't require elements to be adjacent.  For
example, if one wanted to encrypt all <Age> and <Address> elements in a
document, regardless of their location, into one <EncryptedData> element,
one could do so.

Of course, the critical question is not "can I do this?" but "should I do
this?".   In other words, always ask if using a workaround to get a desired
result is really justified.  I would recommend initially assuming the answer
is NO until proven otherwise.

Ed

----- Original Message -----
From: "Jiandong Guo" <jguo@phaos.com>
To: "Dournaee, Blake" <bdournaee@rsasecurity.com>
Cc: <xml-encryption@w3.org>; "Hammond, Ben" <bhammond@rsasecurity.com>;
<edsimom@xmlsec.com>
Sent: Thursday, May 16, 2002 10:17 AM
Subject: Re: Encryption Subset Scenario


>
>
> "Dournaee, Blake" wrote:
>
> > All -
> >
> > Given an input Document D:
> >
> > <doc>
> >   <elem1> foo1 </elem1>
> >   <elem2> foo2 </elem2>
> >   <elem3> foo3 </elem3>
> > </doc>
> >
> > I want to encrypt just the first two child elements (<elem1> and <elem2>).
> > This doesn't appear to fit the definition of
> > Type='http://www.w3.org/2001/04/xmlenc#Element', which suggests a single
> > element, or Type='http://www.w3.org/2001/04/xmlenc#Content'
> > which suggests that all three elements must be encrypted (elem1, elem2 and
> > elem3).
> >
> > Choosing to treat the first two elements as arbitrary plaintext also seems
> > overkill, and if so, this ruins the XML semantics. I cannot
> > treat it as text/xml, because this document subset is not well-formed.
> > Treating it as text/plain loses all of the XML semantics.
> >
> > The obvious solution is to create two <EncryptedData> elements, but this is
> > redundant. Another solution is an XPath transform, but this
> > doesn't exist for XML Encryption.
> >
> > Am I missing something here? Is there an obvious solution to this? It seems
> > like a simple case that might have been overlooked.
>
> If you want to encrypt two elements in one EncryptedData, the question is:
> how do you handle the "replace" process in encryption and later in
> decryption, considering there could be other nodes (text nodes or other
> elements) between these two elements?
>
> -----------------------------------------
> Jiandong Guo
> Phaos Technology
> www.phaos.com
>
>
>
>
>
Received on Thursday, 16 May 2002 13:47:55 UTC