
RE: Problems with PASWA ... and an alternative (long)

From: Jacek Kopecky <jacek@systinet.com>
Date: 19 May 2003 10:58:56 +0200
To: "Burdett, David" <david.burdett@commerceone.com>
Cc: "'Martin Gudgin'" <mgudgin@microsoft.com>, "XML Dist-App (E-mail)" <xml-dist-app@w3.org>
Message-Id: <1053334735.22997.46.camel@localhost>


> <DB>I now understand your idea better. You are suggesting that xbinc:Include
> is an artifact of the serialization of the XML into the SOAP message rather
> than something in the original XML. Question though, what would you do if
> you wanted to send two documents at the same time and wanted one to refer to
> the other and each had been created by separate software?</DB>

You seem to have in mind the scenario where an external application (or
two) creates two different documents which need to refer to each other
and you want to send this using SOAP.

So first, even without SOAP, you need to somehow solve the matter of the
references. With the customs declaration referencing a shipping note,
I'd envision a way of uniquely identifying shipping notes (e.g. issuer
identification and a unique shipping note number). So the customs
declaration has as a part of it the reference to a shipping note. 

Now we want to transfer both using SOAP. One solution:

  <ns1:ShippingNote issuer="MyCompany" id="1234">
    <ns2:shipping issuer="MyCompany" id="1234"/>

That's the straightforward approach without binary inclusions. If the
note or the declaration were binary data *and this information was
somehow communicated to the SOAP software*, it could look like

    <xbinc:Include href="cid:uuid:9049034-3409-9304-09349043"/>
    <xbinc:Include href="cid:uuid:4930303-0439-3094-49320289"/>
<!-- the data somewhere in other MIME parts -->

Note that the references (hrefs) have nothing to do with the issuer and
id of the shipping note or any other kind of ID for the declaration. The
declaration still uses the same kind of reference (issuer and id) and
not the (now available) href UUID.
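
As a rough illustration of that separation, here is a minimal Python
sketch. It is not part of the PASWA proposal: the function name, the
ns2:CustomsDeclaration element name and the payload bytes are all made up
for the example. It assumes a SOAP layer that synthesizes a fresh cid for
each binary child at serialization time, while the application data,
including its issuer/id reference, passes through untouched:

```python
import uuid


def serialize_binary_children(children):
    """Replace each binary child with an xbinc:Include and collect MIME parts.

    `children` is a list of (element_name, payload_bytes) pairs.
    Returns (xml_lines, attachments); attachments maps Content-ID -> bytes.
    """
    xml_lines = []
    attachments = {}
    for name, payload in children:
        # The Content-ID is invented here, by the SOAP layer (hypothetical scheme).
        content_id = "uuid:" + str(uuid.uuid4())
        attachments[content_id] = payload
        xml_lines.append(
            '<%s><xbinc:Include href="cid:%s"/></%s>' % (name, content_id, name)
        )
    return xml_lines, attachments


# Hypothetical binary documents; the declaration refers to the note by
# issuer and id, never by cid.
note = b"shipping note: issuer=MyCompany id=1234"
decl = b"customs declaration: refers to issuer=MyCompany id=1234"

xml_lines, attachments = serialize_binary_children(
    [("ns1:ShippingNote", note), ("ns2:CustomsDeclaration", decl)]
)
for line in xml_lines:
    print(line)
```

The hrefs differ on every run, but the attached bytes, and with them the
application-level reference, stay exactly as the producing software wrote
them.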

If you want to separate the data-producing software from the SOAP
infrastructure software, the API in between can take one of two forms:

a) sendSOAPMessage(Element[] headers, Element[] bodyChildren)
In this case the SOAP infrastructure may serialize some of the children
of the body elements as PASWA binary attachments, using CID URIs like the
UUIDs above.

b) sendSOAPMessage(Element[] headers, Element[] bodyChildren, Attachment[] attachments)
In this case there are again two options:

b1) the Attachment type has an identifier URI and the caller application
can set it and use it to reference the attachment; in this case I think
swa:Representation should be used to hold the attachments (and
xbinc:Include might be used to optimize the binary-in-XML overheads), or

b2) the Attachment type does not have an identifier and application
references are handled by the applications (like the issuer and ID
referencing above); this case would not benefit from the PASWA proposal.
It is the model used by SOAP with Attachments without Content-location.
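
To make the three shapes concrete, here is a toy Python sketch. Everything
in it is an assumption made for illustration: the snake_case function
names, the Attachment fields, the ns1:Order element, the urn:example URI,
and the simplification that a body child is either XML text or raw bytes.
The functions return what would go on the wire instead of sending it, and
swa:Representation is not modelled at all:

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Attachment:
    data: bytes
    uri: Optional[str] = None  # set by the caller in (b1), left as None in (b2)


BodyChild = Union[str, bytes]  # toy model: XML text, or raw binary content


def send_soap_message_a(headers: List[str], body_children: List[BodyChild]):
    """Variant (a): the infrastructure picks what to attach and invents the CIDs."""
    body, attachments = [], []
    for child in body_children:
        if isinstance(child, bytes):
            uri = "cid:uuid:" + str(uuid.uuid4())
            attachments.append(Attachment(child, uri))
            body.append('<xbinc:Include href="%s"/>' % uri)
        else:
            body.append(child)
    return headers, body, attachments


def send_soap_message_b(headers, body_children, attachments):
    """Variant (b): the caller supplies attachments; the body passes through."""
    return headers, list(body_children), list(attachments)


# (a) the caller never sees a CID
_, body_a, atts_a = send_soap_message_a([], ["<ns1:Order/>", b"%PDF-1.4 ..."])

# (b1) the caller names the attachment and can reference that name itself
terms = Attachment(b"%PDF-1.4 ...", uri="urn:example:terms-1234")
_, body_b1, atts_b1 = send_soap_message_b(
    [], ['<ns1:Order terms="urn:example:terms-1234"/>'], [terms]
)

# (b2) no identifier; any cross-reference lives in the application data
_, body_b2, atts_b2 = send_soap_message_b(
    [], ['<ns2:shipping issuer="MyCompany" id="1234"/>'], [Attachment(b"...")]
)
```

In (a) the only identifier is the one the infrastructure invents, in (b1)
the caller's URI survives and is usable as a reference, and in (b2) the
attachment carries no identifier at all.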

The difference between (a) and (b1) shows the "other" thing that PASWA
addresses: web references, which PASWA solves explicitly in XML,
independently of the binary inclusion mechanism, and which SwA solved
using the Content-location header tied to the MIME representation of
attachments.

Best regards,

                   Jacek Kopecky

                   Senior Architect
                   Systinet Corporation

On Wed, 2003-05-14 at 23:52, Burdett, David wrote:
> Gudge
> Thanks for responding. Comments/questions inline below.
> David
> -----Original Message-----
> From: Martin Gudgin [mailto:mgudgin@microsoft.com]
> Sent: Wednesday, May 14, 2003 8:39 AM
> To: Burdett, David; XML Dist-App (E-mail)
> Subject: RE: Problems with PASWA ... and an alternative
> > -----Original Message-----
> > From: xml-dist-app-request@w3.org 
> > [mailto:xml-dist-app-request@w3.org] On Behalf Of Burdett, David
> > Sent: 08 May 2003 02:03
> > To: XML Dist-App (E-mail)
> > Cc: DocSOAP Developers (E-mail)
> > Subject: Problems with PASWA ... and an alternative
> > 
> > I've been following the recent thread reviewing the Proposed 
> > Infoset Addendum to SOAP Messages with interest particularly 
> > the differences between:
> > a) Treating attachments as if they were part of the XML 
> > Infoset (which PASWA proposes), vs
> > b) Treating attachments as first class citizens in their own right.
> >  
> > I can see benefits in both approaches in that efficiently 
> > putting a large "blob" in an attachment in a way that is 
> > transparent to the application can make the application 
> > processing simpler.
> >  
> > Alternatively, the idea of an attachment where the 
> > application is aware that it is an attachment as a separate item 
> > is equally valid, for example an application that is 
> > processing an order that just happens to have terms and 
> > conditions attached to it as a PDF.
> >  
> > I also have one concern (and also a question) over the way 
> > PASWA works as described below ...
> >  
> > There's a catch 22 here as:
> > 1. You can only create the XML when you know the cid values 
> > to put in the XML
> I don't understand why I need to know the values. I'm writing XML
> elements along with their children; other elements, characters. At some
> point I want to 'write' the PDF. I would just pass the PDF ( in stream
> or byte array form ) to my writer and tell it "it's binary, do the right
> thing". If the writer I'm using is NOT PASWA aware, it will either give
> up, or hopefully convert to base64. If it is PASWA aware, it will
> serialize an xbinc:Include element ( and synthesise a value for the href
> ) and then serialize the PDF as raw-octets once it's done serializing
> the SOAP envelope.
> <DB>I think your approach only works if you are writing the content of the
> Body and the rest of the SOAP message at the same time. If the XML had been
> serialised earlier by some different software, then I don't think it works
> without running the risk of having to alter the content of the original XML
> which you might not be able to do if the original XML was digitally signed.
> OK, you can get around this if the software writing the XML generates the
> digital signature as well (which I admitted below), but you might not want
> to always allow this.</DB>
> > 2. You only know the cid values when you 
> > marshall XML into the SOAP Message, but 
> > 3. You can't marshall 
> > the SOAP message until you have created the XML.
> >  
> > I know that you can get around this problem IF the generation 
> > of the XML and the SOAP Message is done by the same software 
> > at the same time. Although this will often be both possible 
> > and desirable it is, I think, something that will often not 
> > be possible to do.
> >  
> > Here's some use cases that explain why. They all assume an 
> > XML document that uses an xbinc:include element that 
> > references an attachment, 
> I think the mismatch is in assuming that you would construct XML that
> contained xbinc:Include elements in the first instance. I would never do
> such a thing.
> <DB>I now understand your idea better. You are suggesting that xbinc:Include
> is an artifact of the serialization of the XML into the SOAP message rather
> than something in the original XML. Question though, what would you do if
> you wanted to send two documents at the same time and wanted one to refer to
> the other and each had been created by separate software?</DB>
> > e.g. an XML order that references a 
> > PDF document as described above:
> > 1. The order and its attachment are generated by an ERP 
> > system and passed to a SOAP processor for forwarding to the 
> > supplier. The SOAP processor puts the XML document into the 
> > SOAP body. The problem is the ERP system does not know 
> > anything about the SOAP Message and therefore can't set the 
> > href in the xbinc:include. So, the SOAP processor must alter 
> > the XML to include it instead. This means that the SOAP 
> > processor can no longer be a general purpose processor as it 
> > must be payload aware.
> > 2. This is a variation of 1 where the ERP system digitally 
> > signs the XML it is generating. This means that the SOAP 
> > processor can't even alter the original XML without breaking 
> > the signature. The only solution is for the ERP system to 
> > tell the SOAP Processor the cid values to use somehow. 
> > However the ERP system may not have the functionality that 
> > allows this.
> You are assuming it is necessary to sign the @href of the xbinc:Include.
> I would assert that it is NOT necessary to do that, rather you sign the
> parent element and its content ( whether as raw-octets or base64 is a
> separate ( answerable ) question ).
> <DB>This makes sense given my better understanding of how xbinc:Include
> works.</DB>
> > 3. The order and its attachments are sent to its destination. 
> > The destination then archives the payload and attachments 
> > discarding the original SOAP envelope. Some time later the 
> > payload is removed from the archive and forwarded in another 
> > SOAP message together with the attachments. The problem is 
> > how does the SOAP processor that is doing the forwarding know 
> > what to use for the content ids in the MIME message.
> I don't understand the relationship between the 'payload' and the
> attachment.
> <DB>By payload I meant the order, or more generally the content of the SOAP
> body. I guess that in this situation you would say that when you store the
> content of the SOAP body, you replace the xbinc:Include by the data
> referenced by the Include. That's OK but what if the data being referenced
> was 100MB long. It could cause some practical implementation problems, or
> could you use the xbinc:Include idea when storing the contents of a SOAP
> body in a database, but this time you point to some other database location
> instead of using a cid?</DB>
> >  
> > The point is that I don't think that a tight coupling between 
> > the XML and the SOAP message will often work or be practical.
> I think the XML looks just as it ever did ( and does NOT contain
> xbinc:Include elements ). The SOAP layer MIGHT serialize using
> xbinc:Include, but the application layer need never see them.
> <DB>I agree, but this doesn't answer my question which is what do you do
> when, for valid reasons, you want a separation between the software that
> creates/manipulates the content of the SOAP Body and the software that
> transports that content using SOAP.</DB>
> >  
> > This is really more of a question than a concern in that 
> > PASWA ignores the idea of Message Parts that is one of the 
> > fundamental concepts behind WSDL. What is not clear to me is 
> > how you would decide to use separate message parts rather 
> > than the transparent message parts that PASWA seems to suggest.
> I think you just define messages with XSD types and anything that is of
> type xsd:base64Binary is a potential attachment.
> <DB>So can you see any benefit in treating attachments as "first class
> citizens" - here's a use case. For deliveries of goods within the same
> country, just a shipping note is required. However, for international
> deliveries you will often need a customs declaration to go along with the
> shipping note (ideally in the same message) and the customs declaration must
> reference the shipping note.</DB>
> >  
> > Finally, Commerce One, has developed a soap header spec and 
> > an open source, royalty free implementation (called DocSOAP) 
> > that uses a Manifest element that is an extension of the 
> > ideas on a Manifest from ebXML Messaging. It covers much (but 
> > not all) of the same ground covered by PASWA whilst solving 
> > (we think) the problems with the use of content id described 
> > above as well as tying it in more closely to WSDL.
> I'm not convinced there is a problem with cids.
> <DB>There isn't if the content of the SOAP Body and the rest of the SOAP
> message is handled by the same software. On the other hand, if they are
> handled by separate software I think there is.</DB>
> >  
> > The key difference between the SOAP Manifest and PASWA is 
> > that the XML document references the WSDL partname for an 
> > attachment and then the Manifest element in the SOAP header 
> > ties the partname to the content id of the actual part which 
> > can be in the SOAP Body, in an attachment or even externally 
> > to the message on the web. This means that the content ids 
> > can be changed at any time and only the manifest element 
> > needs to change.
> With PASWA the content-ids can be changed at any time and only the
> corresponding xbinc:Include/@href needs to be changed.
> <DB>I think the ideas behind PASWA for handling large binary data (or just
> any large data binary or not) are great and very useful when all of the
> SOAP message is being handled at the same time. However, I don't think it
> works well when you need to handle/process the content separately from the
> rest of the message or when you want to handle more than one document at the
> same time. Thoughts?</DB>
> Gudge
Received on Monday, 19 May 2003 04:59:11 UTC
