Re: [xml-dist-app] <none> from noah_mendelsohn@us.ibm.com on 2003-01-06 (www-ws-arch@w3.org from January 2003)

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 6 Jan 2003 18:09:15 -0500
To: Rich Salz <rsalz@datapower.com>
Cc: David Orchard <dorchard@bea.com>, "www-ws-arch@w3.org" <www-ws-arch@w3.org>, "xml-dist-app@w3.org" <xml-dist-app@w3.org>
Message-ID: <OF1DFDB815.3151A687-ON85256CA6.007CE437@lotus.com>
All comments in this response represent my personal opinion, not 
necessarily that of the XMLP WG:

Rich Salz writes:

> I would strongly encourage you to get at least one
> cryptographer actively involved in the discussion
> before this goes much further.  As a "short list" of
> contributors, I would recommend one of the authors (or
> original submitters) from the XMLDSIG or XENC
> documents.

Makes sense to me.

> On a more personal note, I am concerned about the "how
> can we make DSIG and XENC work with the infoset" tone.
> It's understandable, given the authors, but I want to
> emphasize that cryptography (at least as used in the
> DSIG and XENC specs) depends on an octet stream --
> i.e., a serialization -- and anything other than that
> is a complete non-starter.

Rich and I have gone back and forth a bit on this, so I suspect that I am 
the one responsible for the "given the authors" reference.  I'd still like 
to believe that this is a simple misunderstanding, not a big problem, but 
of course I could be wrong.   I am concerned only with the input to the 
signature mechanism being an Infoset...if some c14n algorithm converts the 
parts to be signed to a stream as a first step before signing, that's not 
a problem.  I >think< that the existing Canonical XML Version 1.0 
Recommendation [3] is a proof point for what I have in mind, though it 
happens to use the XPath Node Set rather than the Infoset as a starting 
point.  As described at [4]:

"The XPath node-set is converted into an octet stream, the canonical form, 
by generating the representative UCS characters for each node in the 
node-set in ascending document order, then encoding the result in UTF-8 
(without a leading byte order mark)."
 
So, the output of the c14n is indeed a stream, but the input is an 
abstract tree of nodes.  Perhaps ironically given our seeming 
disagreement, Rich's second email [2] seems completely consistent with 
this approach, as long as we agree that the input to the c14n is indeed a 
SOAP Infoset.
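
To make the "abstract tree in, octet stream out" point concrete, here is a 
minimal sketch in Python.  It assumes the standard library's 
xml.etree.ElementTree.canonicalize, which implements Canonical XML 2.0 
rather than the 1.0 Recommendation cited at [3], so treat it as an 
illustration of the idea and not of that exact algorithm.  The namespace 
URI and element names are made up for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two different serializations of the same abstract tree: attribute order,
# quote style, whitespace inside tags and empty-element syntax all differ.
wire_a = '<e:Envelope xmlns:e="urn:example:env"><e:Body b="2" a="1"/></e:Envelope>'
wire_b = ("<e:Envelope xmlns:e='urn:example:env' >"
          "<e:Body a = '1' b = \"2\"></e:Body></e:Envelope>")

def canonical_octets(xml_text: str) -> bytes:
    """Parse to a tree, then emit the canonical form as UTF-8 octets."""
    return ET.canonicalize(xml_text).encode("utf-8")

octets_a = canonical_octets(wire_a)
octets_b = canonical_octets(wire_b)

# The digest is computed over the canonical octet stream, so it does not
# depend on which serialization happened to arrive.
digest_a = hashlib.sha256(octets_a).hexdigest()
digest_b = hashlib.sha256(octets_b).hexdigest()
print(digest_a == digest_b)
```

The cryptography still operates on octets, as Rich requires; the only 
claim is that those octets are derived from the tree and not from 
whatever bytes were on the wire.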

My only concern expressed in the draft note on behalf of XMLP is that 
Canonical XML 1.0 is starting with XPath data model, and SOAP is starting 
with Infoset.  The two are very close, and indeed the XPath document 
provides some information on the mapping (non-normatively, unfortunately) 
[5].   Still, I hope we'd agree that the levels of abstraction are 
sufficiently similar that if one can sign an XPath Data model node set, 
then very similar techniques could be used to sign an infoset.  I'm merely 
pointing out that we are causing unnecessary confusion by using the two 
slightly different models.

I confess to having less knowledge of XML Encryption [6], but I have heard 
anecdotally that it encrypts serialized XML, rather than XPath data models. 
If so, there might be bigger challenges in applying it to SOAP.  I think 
the natural way to build a SOAP processor feels more along the lines of: 
build up a SOAP envelope in a DOM (for example);  apply encryption to some 
of the element nodes in the DOM that represent certain SOAP headers or the 
body, replacing them with the constructs introduced in [6];  pass the 
resulting data structure to the implementation of some particular binding 
for transmission.  That binding then does the serialization (if any).  I 
don't see why, at least in principle, one could not have XML Encryption 
start with an Infoset, and use a (reversible) c14n to produce a stream 
that gets encrypted.  At the decrypting side, reverse the process: decrypt 
to produce the c14n stream and, if desired, parse to produce a DOM or 
other representation of the Infoset.
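
The flow described above can be sketched roughly as follows.  This is 
only an illustration under loud assumptions: the envelope is simplified 
and has no namespaces, "EncryptedPlaceholder" is a hypothetical element 
name and not the syntax defined in [6], toy_cipher is an insecure 
stand-in for a real cipher, and the canonicalization is Python's C14N 2.0 
implementation.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with a hash-derived keystream. NOT secure; placeholder only.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

def encrypt_child(parent: ET.Element, child: ET.Element, key: bytes) -> None:
    """Canonicalize one subtree, encrypt the octets, swap in a placeholder."""
    octets = ET.canonicalize(ET.tostring(child, encoding="unicode")).encode("utf-8")
    placeholder = ET.Element("EncryptedPlaceholder")  # hypothetical name
    placeholder.text = base64.b64encode(toy_cipher(octets, key)).decode("ascii")
    position = list(parent).index(child)
    parent.remove(child)
    parent.insert(position, placeholder)

def decrypt_child(parent: ET.Element, placeholder: ET.Element, key: bytes) -> None:
    """Reverse the process: decrypt to the c14n stream, parse, swap back."""
    octets = toy_cipher(base64.b64decode(placeholder.text), key)
    restored = ET.fromstring(octets.decode("utf-8"))
    position = list(parent).index(placeholder)
    parent.remove(placeholder)
    parent.insert(position, restored)

# Build the envelope in memory; nothing is serialized as a whole.
envelope = ET.Element("Envelope")
body = ET.SubElement(envelope, "Body")
ET.SubElement(body, "Order", {"id": "42"}).text = "widgets"

key = b"demo-key"
encrypt_child(envelope, body, key)
in_transit = ET.tostring(envelope, encoding="unicode")  # the binding's job

decrypt_child(envelope, envelope.find("EncryptedPlaceholder"), key)
recovered = envelope.find("Body/Order").text
```

Only the subtree being protected is ever turned into a stream; the rest 
of the envelope stays a tree until the binding decides how to transmit it.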

Other than the fact that XML Encryption and DSig have "gone to Rec", is 
there a significant problem that I'm still missing?
 
> For completeness (and perhaps also to label myself
> Cassandra :), it should be mentioned that this issue
> was raised [1] back in June, 2001, when the
> decision to "go Infoset" was first made, and in [2]
> February, 2002, I proposed a canonicalization solution.
> I lost the battle for #1 and #2 was fairly quickly
> ruled out of scope.
> 
>         /r$
> 
> [1] http://lists.w3.org/Archives/Public/xml-dist-app/2001Jun/0208.html
> [2] http://lists.w3.org/Archives/Public/xml-dist-app/2002Feb/0266.html

If I'm just being dense, I apologize, but I really think there are 
important use cases for both signatures and encryptions that are passed 
through intermediaries using different transports and serializations.  I 
think there is also value in being able to sign and/or encrypt (parts of) 
SOAP messages that are passed from sender to receiver in memory in the 
form of structures such as DOMs or SAX event streams.  Think of a message 
queuing system:  I can prepare the message using a DOM (for example), but 
its first stop is likely in a transacted database, not directly on the 
wire.  The application does not necessarily trust the MQ system, so the 
app. signs and/or encrypts parts of the message.  I >really< don't want to 
have to serialize the whole DOM to make this happen, especially since my 
MQ system may be implemented on a highly tuned XML database that supplies 
the DOM implementation directly.  I have no problem if, as part of the 
signing or encryption, the specific parts to be signed/encrypted are first 
serialized using a c14n.    Is the distinction clear, and does it resolve 
the concern?  Thank you for your patience with this.
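
A small sketch of the message queuing case, again in Python and again 
under assumptions: the element names are invented, an HMAC stands in for 
a full XML Signature, and the c14n is the standard library's C14N 2.0.  
The sender signs only the Body while it is still an in-memory tree; an 
intermediary adds a header and the message later arrives in a different 
serialization.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

def sign_part(element: ET.Element, key: bytes) -> str:
    """Canonicalize just this subtree and compute an HMAC over the octets."""
    canonical = ET.canonicalize(ET.tostring(element, encoding="unicode"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"shared-demo-key"

# Sender: build the message as a tree and sign only the Body.
envelope = ET.Element("Envelope")
header = ET.SubElement(envelope, "Header")
body = ET.SubElement(envelope, "Body")
ET.SubElement(body, "Payment", {"currency": "USD", "amount": "10"})
tag = sign_part(body, key)

# Intermediary: adds routing information outside the signed part.
ET.SubElement(header, "Hop").text = "queue-1"

# Receiver: sees a different serialization (attribute order, quoting).
wire = ("<Envelope><Header><Hop>queue-1</Hop></Header>"
        "<Body><Payment amount='10' currency='USD'/></Body></Envelope>")
received = ET.fromstring(wire)
verified = hmac.compare_digest(tag, sign_part(received.find("Body"), key))

# A change inside the signed part is detected.
received.find("Body/Payment").set("amount", "1000")
tampered_ok = hmac.compare_digest(tag, sign_part(received.find("Body"), key))
print(verified, tampered_ok)
```

The whole envelope is never serialized by the application in order to 
sign; only the Body passes through the c14n step.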

Noah

[3] http://www.w3.org/TR/xml-c14n
[4] http://www.w3.org/TR/xml-c14n#ProcessingModel
[5] http://www.w3.org/TR/1999/REC-xpath-19991116#infoset
[6] http://www.w3.org/TR/xmlenc-core/

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Monday, 6 January 2003 18:14:56 UTC