RE: Problems with PASWA ... and an alternative from Burdett, David on 2003-05-14 (xml-dist-app@w3.org from May 2003)

From: Burdett, David <david.burdett@commerceone.com>
Date: Wed, 14 May 2003 14:52:09 -0700
To: "'Martin Gudgin'" <mgudgin@microsoft.com>, "XML Dist-App (E-mail)" <xml-dist-app@w3.org>
Message-ID: <C1E0143CD365A445A4417083BF6F42CC053D1AD6@C1plenaexm07.commerceone.com>
Gudge

Thanks for responding. Comments/questions inline below.

David

-----Original Message-----
From: Martin Gudgin [mailto:mgudgin@microsoft.com]
Sent: Wednesday, May 14, 2003 8:39 AM
To: Burdett, David; XML Dist-App (E-mail)
Subject: RE: Problems with PASWA ... and an alternative


 

> -----Original Message-----
> From: xml-dist-app-request@w3.org 
> [mailto:xml-dist-app-request@w3.org] On Behalf Of Burdett, David
> Sent: 08 May 2003 02:03
> To: XML Dist-App (E-mail)
> Cc: DocSOAP Developers (E-mail)
> Subject: Problems with PASWA ... and an alternative
> 
> I've been following the recent thread reviewing the Proposed 
> Infoset Addendum to SOAP Messages with interest particularly 
> the differences between:
> a) Treating attachments as if they were part of the XML 
> Infoset (which PASWA proposes), vs
> b) Treating attachments as first class citizens in their own right.
>  
> I can see benefits in both approaches in that efficiently 
> putting a large "blob" in an attachment in a way that is 
> transparent to the application can make the application 
> processing simpler.
>  
> Alternatively, the idea of an attachment where the 
> application is aware that it is an attachment as a separate item 
> is equally valid, for example an application that is 
> processing an order that just happens to have terms and 
> conditions attached to it as a PDF.
>  
> I also have one concern (and also a question) over the way 
> PASWA works as described below ...
>  
> USING CID: IN XBINC:INCLUDE
> There's a catch 22 here as:
> 1. You can only create the XML when you know the cid values 
> to put in the XML

I don't understand why I need to know the values. I'm writing XML
elements along with their children; other elements, characters. At some
point I want to 'write' the PDF. I would just pass the PDF ( in stream
or byte array form ) to my writer and tell it "it's binary, do the right
thing". If the writer I'm using is NOT PASWA aware, it will either give
up, or hopefully convert to base64. If it is PASWA aware, it will
serialize an xbinc:Include element ( and synthesise a value for the href
) and then serialize the PDF as raw-octets once it's done serializing
the SOAP envelope.
<DB>I think your approach only works if you are writing the content of the
Body and the rest of the SOAP message at the same time. If the XML had been
serialised earlier by some different software, then I don't think it works
without running the risk of having to alter the content of the original XML
which you might not be able to do if the original XML was digitally signed.
OK, you can get around this if the software writing the XML generates the
digital signature as well (which I admitted below), but you might not want
to always allow this.</DB>
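
The writer behaviour Gudge describes can be sketched roughly as below. This is only an illustration under stated assumptions: SketchWriter, serialize_order, the example.invalid content-id domain and the order/terms element names are all hypothetical, the xbinc namespace declaration is omitted, and nothing here is taken from the PASWA document or from any SOAP toolkit.

```python
import base64
import uuid


class SketchWriter:
    """Hypothetical writer; not an API from PASWA or any SOAP toolkit."""

    def __init__(self, paswa_aware):
        self.paswa_aware = paswa_aware
        self.parts = []        # text fragments of the serialized envelope
        self.attachments = {}  # content-id -> raw octets, emitted after the envelope

    def write_text(self, xml_text):
        self.parts.append(xml_text)

    def write_binary(self, octets):
        if self.paswa_aware:
            # The content-id is synthesised here, at serialization time,
            # so the caller never needs to know it.
            cid = f"{uuid.uuid4().hex}@example.invalid"
            self.attachments[cid] = octets
            self.parts.append(f'<xbinc:Include href="cid:{cid}"/>')
        else:
            # A writer that is not PASWA aware falls back to inline base64.
            self.parts.append(base64.b64encode(octets).decode("ascii"))

    def envelope(self):
        return "".join(self.parts)


def serialize_order(writer, pdf_octets):
    """The application hands over the octets and never sees a cid."""
    writer.write_text("<order><terms>")
    writer.write_binary(pdf_octets)
    writer.write_text("</terms></order>")
    return writer.envelope(), writer.attachments
```

Note that the sketch only covers the case where one piece of software writes both the Body content and the rest of the message; it does not address XML that was serialised (and possibly signed) earlier by different software.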
 
> 2. You only know the cid values when you marshall XML into the 
> SOAP Message, but 
> 3. You can't marshall the SOAP message until you have created the XML.
>  
> I know that you can get around this problem IF the generation 
> of the XML and the SOAP Message is done by the same software 
> at the same time. Although this will often be both possible 
> and desirable it is, I think, something that will often not 
> be possible to do.
>  
> Here's some use cases that explain why. They all assume an 
> XML document that uses an xbinc:include element that 
> references an attachment, 

I think the mismatch is in assuming that you would construct XML that
contained xbinc:Include elements in the first instance. I would never do
such a thing.
<DB>I now understand your idea better. You are suggesting that xbinc:Include
is an artifact of the serialization of the XML into the SOAP message rather
than something in the original XML. Question though, what would you do if
you wanted to send two documents at the same time and wanted one to refer to
the other and each had been created by separate software?</DB>

> e.g. an XML order that references a 
> PDF document as described above:
> 1. The order and its attachment are generated by an ERP 
> system and passed to a SOAP processor for forwarding to the 
> supplier. The SOAP processor puts the XML document into the 
> SOAP body. The problem is the ERP system does not know 
> anything about the SOAP Message and therefore can't set the 
> href in the xbinc:include. So, the SOAP processor must alter 
> the XML to include it instead. This means that the SOAP 
> processor can no longer be a general purpose processor as it 
> must be payload aware.
> 2. This is a variation of 1 where the ERP system digitally 
> signs the XML it is generating. This means that the SOAP 
> processor can't even alter the original XML without breaking 
> the signature. The only solution is for the ERP system to 
> tell the SOAP Processor the cid values to use somehow. 
> However the ERP system may not have the functionality that 
> allows this.

You are assuming it is necessary to sign the @href of the xbinc:Include.
I would assert that it is NOT necessary to do that, rather you sign the
parent element and its content ( whether as raw-octets or base64 is a
separate ( answerable ) question ).
<DB>This makes sense given my better understanding of how xbinc:Include
works.</DB>
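
A minimal sketch of why signing the parent element's content, rather than the @href, survives a change of content-id. It assumes the digest is taken over the raw octets (Gudge leaves raw-octets versus base64 open), uses a plain SHA-256 digest in place of a real XML signature, and the function names and content-ids are hypothetical.

```python
import base64
import hashlib


def digest_inline(base64_text):
    """Digest of the content when it is carried inline as base64 text."""
    return hashlib.sha256(base64.b64decode(base64_text)).hexdigest()


def digest_attached(href, attachments):
    """Digest of the content when it is carried as an attachment.

    The href is only used to locate the octets; it is not part of the digest.
    """
    cid = href[len("cid:"):]
    return hashlib.sha256(attachments[cid]).hexdigest()


pdf = b"%PDF-1.4 demo"
inline_form = base64.b64encode(pdf).decode("ascii")
first_send = {"a1@example.invalid": pdf}
later_forward = {"b2@example.invalid": pdf}  # same octets, different cid
```

Under that assumption, digest_inline(inline_form), digest_attached("cid:a1@example.invalid", first_send) and digest_attached("cid:b2@example.invalid", later_forward) all give the same value, so the SOAP layer can pick any cid without touching what was signed.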

> 3. The order and its attachments are sent to its destination. 
> The destination then archives the payload and attachments 
> discarding the original SOAP envelope. Some time later the 
> payload is removed from the archive and forwarded in another 
> SOAP message together with the attachments. The problem is 
> how does the SOAP processor that is doing the forwarding know 
> what to use for the content ids in the MIME message?

I don't understand the relationship between the 'payload' and the
attachment.
<DB>By payload I meant the order, or more generally the content of the SOAP
body. I guess that in this situation you would say that when you store the
content of the SOAP body, you replace the xbinc:Include by the data
referenced by the Include. That's OK, but what if the data being referenced
was 100MB long? It could cause some practical implementation problems. Or
could you use the xbinc:Include idea when storing the contents of a SOAP
body in a database, but this time point to some other database location
instead of using a cid?</DB>

>  
> The point is that I don't think that a tight coupling between 
> the XML and the SOAP message will often work or be practical.

I think the XML looks just as it ever did ( and does NOT contain
xbinc:Include elements ). The SOAP layer MIGHT serialize using
xbinc:Include, but the application layer need never see them.
<DB>I agree, but this doesn't answer my question, which is: what do you do
when, for valid reasons, you want a separation between the software that
creates/manipulates the content of the SOAP Body and the software that
transports that content using SOAP?</DB>

>  
> HOW DOES PASWA WORK WITH WSDL
> This is really more of a question than a concern in that 
> PASWA ignores the idea of Message Parts that is one of the 
> fundamental concepts behind WSDL. What is not clear to me is 
> how you would decide to use separate message parts rather 
> than the transparent message parts that PASWA seems to suggest.

I think you just define messages with XSD types and anything that is of
type xsd:base64Binary is a potential attachment.
<DB>So can you see any benefit in treating attachments as "first class
citizens" - here's a use case. For deliveries of goods within the same
country, just a shipping note is required. However, for international
deliveries you will often need a customs declaration to go along with the
shipping note (ideally in the same message) and the customs declaration must
reference the shipping note.</DB>

>  
> AN ALTERNATIVE APPROACH?
> Finally, Commerce One has developed a soap header spec and 
> an open source, royalty free implementation (called DocSOAP) 
> that uses a Manifest element that is an extension of the 
> ideas on a Manifest from ebXML Messaging. It covers much (but 
> not all) of the same ground covered by PASWA whilst solving 
> (we think) the problems with the use of content id described 
> above as well as tying it in more closely to WSDL.

I'm not convinced there is a problem with cids.
<DB>There isn't if the content of the SOAP Body and the rest of the SOAP
message are handled by the same software. On the other hand, if they are
handled by separate software I think there is.</DB>

>  
> The key difference between the SOAP Manifest and PASWA is 
> that the XML document references the WSDL partname for an 
> attachment and then the Manifest element in the SOAP header 
> ties the partname to the content id of the actual part which 
> can be in the SOAP Body, in an attachment or even externally 
> to the message on the web. This means that the content ids 
> can be changed at any time and only the manifest element 
> needs to change.

With PASWA the content-ids can be changed at any time and only the
corresponding xbinc:Include/@href needs to be changed.
<DB>I think the ideas behind PASWA for handling large binary data (or just
any large data binary or not) are great and very useful when all of the
SOAP message is being handled at the same time. However, I don't think it
works well when you need to handle/process the content separately from the
rest of the message or when you want to handle more than one document at the
same time. Thoughts?</DB>
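
The Manifest indirection described earlier (document refers to a partname, Manifest ties the partname to a content id) can be sketched roughly as follows. The dictionary layout, the partname attribute on the terms element and the function names are assumptions made for illustration; this is not the DocSOAP schema or API.

```python
def resolve_part(part_name, manifest, attachments):
    """Follow partname -> location -> octets.

    Only cid: locations are handled in this sketch; a part held in the
    SOAP Body or on the web would need its own branch.
    """
    location = manifest[part_name]
    if not location.startswith("cid:"):
        raise LookupError(f"unsupported location in this sketch: {location}")
    return attachments[location[len("cid:"):]]


def change_content_id(manifest, attachments, old_cid, new_cid):
    """Re-key an attachment; only the manifest entry is rewritten."""
    attachments[new_cid] = attachments.pop(old_cid)
    for part_name, location in manifest.items():
        if location == f"cid:{old_cid}":
            manifest[part_name] = f"cid:{new_cid}"


# The payload names the part, never the cid, so it can be created
# (and signed) before any SOAP message exists.
order_xml = '<order><terms partname="TermsAndConditions"/></order>'
manifest = {"TermsAndConditions": "cid:a1@example.invalid"}
attachments = {"a1@example.invalid": b"%PDF-1.4 demo"}
```

Calling change_content_id(manifest, attachments, "a1@example.invalid", "b2@example.invalid") leaves order_xml untouched, and resolve_part("TermsAndConditions", manifest, attachments) still returns the same octets.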

Gudge
Received on Wednesday, 14 May 2003 17:52:16 UTC