RE: octet-based processing model from Blair Dillaway on 2001-08-23 (xml-encryption@w3.org from August 2001)

From: Blair Dillaway <blaird@microsoft.com>
Date: Thu, 23 Aug 2001 15:32:21 -0700
To: "merlin" <merlin@baltimore.ie>, <xml-encryption@w3.org>
Message-ID: <AA19CFCE90F52E4B942B27D42349637902CDCD77@red-msg-01.redmond.corp.microsoft.com>
Merlin,

Here's my take on the issues you raise.

The complexity you object to is a result of the decision to support
automated decrypt-and-replace functionality for Element and Content.
Without enforcing a standard encoding and typing mechanism I don't see
how one can make this work.  It was suggested a long time ago that
encryption/decryption processing would be much simpler if this operation
were pushed to the application.  But the WG consensus was that this
operation should be supported, so I think we need to have a processing
discussion along the lines in the latest draft. You stated that you
didn't want decrypt-and-replace to be required in an earlier posting, so
perhaps you can provide additional rationale on why removal of this
feature is in order?

WRT your implementation, and my own prototype code for that matter:
In processing decrypted Elements and Content types, I assert that we
have simply made an implicit assumption about the encrypted data that is
now explicit in Section 4.  Namely, that the output of the decryption
transform is a representation of serialized XML, in a known character
encoding, suitable for parsing by existing XML tools (DOM, SAX, a stream
parser, ...).  If we remove any encoding assumption for Element and
Content, then I don't see how one can build a robust implementation for
dealing with it.  For example, what if the result of the decryption
operation were:
	- Shift-JIS encoded serialized XML carried inside a UTF-8 encoded
	  document?  You won't know the data is Shift-JIS, so any parse
	  will result in garbage.
	- an Infoset encoding using some attribute-value pair convention?
	- etc.
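
To make the first case concrete, here is a minimal Python sketch. It
assumes the decrypted octets are a Shift-JIS serialization with no XML
declaration and that the receiver guesses UTF-8 from the parent
document. The sample element and variable names are made up for
illustration; nothing here comes from the draft.

```python
# Hypothetical plaintext: a serialized element with no XML declaration.
original = "<name>日本語</name>"

# Assumption: the encryptor serialized it as Shift-JIS before enciphering,
# so these are the octets that come out of the decryption transform.
decrypted_octets = original.encode("shift_jis")

# The receiver only knows the parent document is UTF-8 and guesses the same.
try:
    decrypted_octets.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False  # the octets are not even valid UTF-8

# A lenient decoder does not fail, it silently produces garbage instead.
garbled = decrypted_octets.decode("utf-8", errors="replace")

# Only a receiver that knows the real encoding recovers the element.
recovered = decrypted_octets.decode("shift_jis")
```

With a strict decoder the receiver at least gets an error. With a
lenient one it gets replacement characters and no hint as to why.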

I also see value in defining at least one standard way to encode XML
(serialize in UTF-8) for encryption/decryption.  This will benefit a lot
of apps that will simply use it rather than try to spec their own method.
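
As a rough sketch of what that standard path could look like from a
node-based toolkit, the Python below serializes an element to UTF-8
octets, round-trips them through a stand-in cipher, and does the
decrypt-and-replace. The toy_cipher function is a placeholder of my own
and is NOT real encryption, and the sample document is invented.

```python
import xml.etree.ElementTree as ET

def toy_cipher(octets: bytes) -> bytes:
    # Placeholder only: XOR with a fixed byte, its own inverse. Not secure.
    return bytes(b ^ 0x5A for b in octets)

doc = ET.fromstring("<order><card>4111 é</card></order>")
card = doc.find("card")

# Encrypt side: serialize the element, then fix the encoding as UTF-8.
plain_octets = ET.tostring(card, encoding="unicode").encode("utf-8")
cipher_octets = toy_cipher(plain_octets)

# Decrypt side: the octets are known to be UTF-8 serialized XML,
# so they can be handed straight to an ordinary parser.
restored = ET.fromstring(toy_cipher(cipher_octets).decode("utf-8"))

# Decrypt-and-replace: put the parsed element back where the original sat.
position = list(doc).index(card)
doc.remove(card)
doc.insert(position, restored)

result = ET.tostring(doc, encoding="unicode")
```

The application still works with nodes. Only the boundary with the
cipher is expressed in octets with an agreed encoding.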

-----Original Message-----
From: merlin [mailto:merlin@baltimore.ie] 
Sent: Thursday, August 23, 2001 11:23 AM
To: xml-encryption@w3.org
Subject: octet-based processing model



Hi all,

I clearly failed to understand the octet-based processing model concept
when reading the encryption spec and this list, and I now see that there
were a few requests for comments. My (verbose) comment is that I
disagree with it.

Far from simplifying things, it adds (for me, obviously) unnecessary
confusion and complexity to what is otherwise a functional and
straightforward spec.

If we simply remove all reference to character encoding, streams of
data, the encoding of parent documents, etc. (other than the necessary
translation of XML content prior to data encipherment), the spec becomes
much clearer, and toolkits lose the bonds that prevent them from
implementing encryption how they want, using DOM, SAX, characters, etc.

I implemented the spec (obviously without reading it properly) and it
works elegantly and efficiently with a node-based processing model, and
it integrates seamlessly with signatures. I get pains even thinking
about tinkering with streams of bytes. That does not seem (to me) to be
a realistic *REQUIRED* model. If we leave the spec free of such
assumptions and requirements, many more implementations are possible:

1. octet-based system that messes around with characters
2. DOM-based system that works with nodes
3. SAX-based system that streams events
4. system that uses native class mappings of the schema
5. integral part of an XML parser chain
6...

I do not believe that 1 is the dominant case (I would suggest that 2, 4
and 5 are), and I do not believe that we should restrict ourselves to
it.

It seems to me that removing *any* assumption of model from the spec
reduces its size and complexity, but increases its power and has (to my
inflamed eyes) no shortcomings.

I'm probably missing the obvious; could someone please clarify?

Thanks, merlin


Received on Thursday, 23 August 2001 18:32:54 UTC