RE: XML Schema and the necessity for canonical representations

Donald in his email says ...

	... On the other hand, if you have a transactional/protocol point of
view, where pieces of messages are being signed, data is processed and
forwarded by intermediate parties, and the signature verified by later
recipients, etc., canonicalization is essential ...

This method of working is an *absolute* requirement for the Internet Open
Trading Protocol being developed by the IETF Trade WG. IOTP needs digital
signatures and, in the next version of the IOTP specification, we want to
adopt the results of the dsig working group.

What we need to be able to do, as Don says, is take parts of a signed XML
document and use them to construct a new XML document with additional data
whilst still preserving the signature.
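
To make the canonicalization point concrete, here is a minimal sketch of what goes wrong when an element is copied and re-serialized. It assumes Python 3.8+ and uses ElementTree's canonicalize() function (which implements the later W3C C14N 2.0 rules) purely as a stand-in for whatever canonical form the dsig work settles on. The Payment element and its attributes are hypothetical, not taken from the IOTP specification.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical element as the Merchant serialized it.
original = '<Payment ID="p1" Amount="10.00" Currency="USD"/>'
# The same element after a Consumer's XML toolkit re-serialized it:
# different quoting, attribute order and empty-element style.
copied = "<Payment Currency='USD' Amount='10.00' ID='p1'></Payment>"

def raw_digest(xml_text):
    """Digest of the bytes exactly as serialized."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()

def canonical_digest(xml_text):
    """Digest of the canonical form (C14N 2.0 is an assumption here)."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

raw_match = raw_digest(original) == raw_digest(copied)
canonical_match = canonical_digest(original) == canonical_digest(copied)
print("raw digests match:", raw_match)
print("canonical digests match:", canonical_match)
```

The raw digests differ even though the two strings carry the same information, while the canonical digests agree, so a signature computed over canonical forms can survive the copy.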

I thought it might help to illustrate the problem with a real example, so the
rest of this email describes a small, simplified version of some IOTP
messages, how they use signatures and why they need them.

David

IOTP EXAMPLE
The following is a simplified example of the first two messages in an IOTP
purchase that illustrate how signatures are used. The messages are:
*	an XML document containing an offer from a Merchant to a Consumer
that describes the goods or services the Consumer wants to buy, how much to
pay, etc., and
*	an XML document, largely created from the first, that the Consumer
uses to initiate a payment.

FIRST MESSAGE
The first message is sent from a Merchant to a Consumer. It contains
elements that:
*	identify the overall transaction
*	identify the individual message (document) within the transaction
*	describe who is involved in the transaction including the consumer,
the merchant and who is to accept the payment (the payment handler)
*	describe details of the merchant's offer including:
	*	what is being bought (the order)
	*	how much to pay (the payment) and
	*	how the goods will be delivered
*	digitally sign all the above information for non-repudiation and
to make sure it can't later be changed.

In XML this looks like ...

<?xml version='1.0'?>
<!DOCTYPE IotpMessage >
<IotpMessage>
  <TransRefBlk>
    <TransId ...>
      ... data to uniquely identify and
          describe the set of related messages
          in a transaction ...
      <MsgId ...>
        ... data to describe and uniquely identify
            the message (document) within a transaction
      </MsgId>
    </TransId>
  </TransRefBlk>
  <Signature ...>
    ... digital signature of digests of TransId, MsgId, Org, Order,
        Payment & Delivery elements
  </Signature>
  <Org ...>
    ... data to describe the merchant
  </Org>
  <Org ...>
    ... data to describe the consumer
  </Org>
  <Org ...>
    ... data to describe the payment handler
  </Org>
  <OfferRespBlk>
    <Order ...>
    ... description of what is being bought ...
    </Order>
    <Payment ...>
    ... description of how much to pay
    </Payment>
    <Delivery ...>
    ... description of how the goods will
        be delivered
    </Delivery>
  </OfferRespBlk>
</IotpMessage>

SECOND MESSAGE
Once the consumer receives the first message and checks it's OK, then the
consumer starts the payment by creating a new "Payment Request" message
(document) that contains elements that:
*	identify the overall transaction, copied from the first message
received from the Merchant
*	identify the individual message (document) within the transaction -
New, generated by the Consumer
*	describe the merchant and who is to accept the payment (the payment
handler), copied from the first message
*	describe how much to pay (the payment), copied from the first
message
*	describe the payment instrument (e.g. the credit card) that the
consumer wants to use. New.
*	prove the validity of the merchant generated information, by copying
the signature from the first message

Note that information on the Consumer, what is being bought and how delivery
is to occur is *deliberately* omitted for privacy reasons. The payment
handler doesn't need to know this information to accept a payment.

This then results in an XML document that looks like ...

<?xml version='1.0'?>
<!DOCTYPE IotpMessage >
<IotpMessage>
  <TransRefBlk>
    <TransId ...>
      ... data to uniquely identify and
          describe the set of related messages
          in a transaction. Copied from the 1st message
      <MsgId ...>
        ... data to describe and uniquely identify
            the message (document) within a transaction. This is new.
      </MsgId>
    </TransId>
  </TransRefBlk>
  <Signature ...>
   ... digital signature. Copied from the 1st message 
  </Signature>
  <Org ...>
    ... data to describe the merchant. Copied from the 1st message
  </Org>
  <Org ...>
    ... data to describe the payment handler. Copied from the 1st message
  </Org>
  <PayReqBlk>
    <Payment ...>
    ... description of how much to pay. Copied from the 1st message
    </Payment>
    <PaySchemeData>
    ... details on the payment instrument to be used. New. Created by
        the Consumer
    </PaySchemeData>
  </PayReqBlk>
</IotpMessage>

Once the Payment Handler receives the Payment Request message, they can check
the signature to make sure that the information on who the merchant and
payment handler are, and how much to pay, has not changed.
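
The check described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the IOTP or dsig design: the signature record carries one digest per signed element, so a recipient can verify whichever elements were copied forward even though others were left out. HMAC with a shared demo key stands in for a real public-key signature, ElementTree's canonicalize() (Python 3.8+, C14N 2.0) stands in for the canonical form, and the element contents, the dictionary keys and the function names are all hypothetical.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical shared key; HMAC is only a stand-in for a real signature.
DEMO_KEY = b"not-a-real-key"

def element_digest(element_xml):
    """SHA-256 over the canonical form of one serialized element."""
    canonical = ET.canonicalize(element_xml, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def _manifest_bytes(manifest):
    # Fixed ordering so signer and verifier hash identical bytes.
    lines = (f"{name}={manifest[name]}" for name in sorted(manifest))
    return "\n".join(lines).encode("utf-8")

def sign_elements(elements):
    """Return per-element digests plus a signature value over them."""
    manifest = {name: element_digest(xml) for name, xml in elements.items()}
    value = hmac.new(DEMO_KEY, _manifest_bytes(manifest), hashlib.sha256)
    return {"manifest": manifest, "value": value.hexdigest()}

def verify_elements(signature, present):
    """Check the signature value, then only the elements actually present."""
    expected = hmac.new(
        DEMO_KEY, _manifest_bytes(signature["manifest"]), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, signature["value"]):
        return False
    return all(
        signature["manifest"].get(name) == element_digest(xml)
        for name, xml in present.items()
    )

# First message: the Merchant signs digests of every element.
first_message = {
    "Org.merchant": '<Org ID="merchant"><Name>Example Shop</Name></Org>',
    "Org.consumer": '<Org ID="consumer"><Name>A. Buyer</Name></Org>',
    "Org.handler": '<Org ID="handler"><Name>Example Bank</Name></Org>',
    "Order": '<Order ID="o1">One widget</Order>',
    "Payment": '<Payment ID="p1" Amount="10.00" Currency="USD"/>',
    "Delivery": '<Delivery ID="d1">By post</Delivery>',
}
signature = sign_elements(first_message)

# Second message: a subset of elements, re-serialized by the Consumer.
payment_request = {
    "Org.merchant": '<Org ID="merchant">\n  <Name>Example Shop</Name>\n</Org>',
    "Org.handler": "<Org ID='handler'><Name>Example Bank</Name></Org>",
    "Payment": '<Payment Currency="USD" Amount="10.00" ID="p1"></Payment>',
}
tampered = dict(
    payment_request,
    Payment='<Payment ID="p1" Amount="1.00" Currency="USD"/>',
)

request_ok = verify_elements(signature, payment_request)
tampered_ok = verify_elements(signature, tampered)
print(request_ok, tampered_ok)
```

In this sketch the copied signature verifies against the reduced Payment Request, while a changed amount is rejected. The Consumer, Order and Delivery elements never reach the Payment Handler; only their digests travel inside the signature record.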

Later messages use the same approach with data for a new message being
created out of data from earlier messages.

One thing to note is that the Digital Signature can potentially sign any
element in any message (document) in the same transaction.


> ----------
> From: 	dee3@us.ibm.com[SMTP:dee3@us.ibm.com]
> Sent: 	21 May 1999 13:50
> To: 	www-xml-schema-comments@w3.org; XML-DSig Workshop
> Subject: 	XML Schema and the necessity for canonical representations
> 
> Having a canonical form of an entity is very important for comparison and
> digital signature purposes.
> 
> XML is sufficiently rich that canonicalization needs to be considered at
> several levels.  For example, the character set used in two XML documents
> needs to be converted to a standard if they are to be usefully compared
> for many purposes.  There are also canonicalization considerations related
> to white space, namespace prefixes, etc, which are being considered by the
> XML Syntax WG.  Similarly, I believe that canonicalization of datatype
> representation must be considered and the schema WG seems like the place
> to do it.
> 
> I think the need for datatypes to have a designated canonical lexical form
> should be fairly clear for comparison purposes.  It relieves the comparator
> from the burden of having to be able to parse every form of every datatype
> and convert it to a canonical form the comparator has selected.
> 
> The need may not be as immediately obvious in the digital signature arena,
> depending on your mental picture of the "typical" digital signature
> application.  If your picture is very document/object oriented, you might
> wonder what all the fuss is about since any lump of bits can be signed and,
> if faithfully transmitted, this signature can be verified later on the same
> lump of bits.  On the other hand, if you have a transactional/protocol
> point of view, where pieces of messages are being signed, data is processed
> and forwarded by intermediate parties, and the signature verified by later
> recipients, etc., canonicalization is essential.
> 
> I have been involved with too many systems where people thought that all
> they were doing was verifying signatures on unchanged data being sent
> through multi-party but faithful transmission channels only to find that
> there was some circumstance where a signed object had to be partly or fully
> re-constituted or some transmission channel was not as faithful as they
> thought.  As a result, some incredibly stupid thing like capitalization,
> padding, line ending character sequences, etc., etc., at least temporarily
> derailed their entire effort as, on a crash basis, they designed and
> painfully retrofitted canonicalization into their system.  Also witness the
> diddly little lack of canonicalization in the original ASN.1 time and date
> format: As soon as there was substantial real world use of this, a new,
> almost identical, fundamental data type, had to be added to ASN.1, with
> significant disruption and confusion, just to squeeze out the last case of
> alternative representations of the same date and time.
> 
> There is no problem with the Schema Datatypes document providing multiple
> lexical representations as long as exactly one form is designated as the
> canonical form.
> 
> I believe that the XML Schema Datatypes document should be changed to do
> this and perhaps this should be added to the XML Schema requirements
> document.
> 
> Thanks,
> Donald
> 
> Donald E. Eastlake, 3rd
> 17 Skyline Drive, Hawthorne, NY 10532 USA
> dee3@us.ibm.com   tel: 1-914-784-7913, fax: 1-914-784-3833
> 
> home: 65 Shindegan Hill Road, RR#1, Carmel, NY 10512 USA
> dee3@torque.pothole.com   tel: 1-914-276-2668
> 
> 

Received on Wednesday, 26 May 1999 15:46:15 UTC