- From: Henrik Frystyk Nielsen <frystyk@microsoft.com>
- Date: Wed, 18 Oct 2000 14:57:10 -0700
- To: "Joe Lapp" <jlapp@webmethods.com>, <xml-dist-app@w3.org>
Hi Joe,

Just so that you know, your original posting to the SOAP mailing list is recorded on the SOAP issues list as item #21 at

http://msdn.microsoft.com/xml/general/soapspec_issues.asp

pointing to

http://discuss.develop.com/archives/wa.exe?A2=ind0008&L=soap&F=&S=&P=41789

This also then led to quite a bit of discussion - the full thread is at

http://discuss.develop.com/archives/wa.exe?A1=ind0008&L=soap#46

People might want to catch up on this thread as well.

Henrik

-----Original Message-----
From: Joe Lapp [mailto:jlapp@webmethods.com]
Sent: Tuesday, October 17, 2000 16:27
To: xml-dist-app@w3.org
Subject: Issues with Packaging Application Payloads

On August 18th I posted to the SOAP discussion list a set of issues that I had with how SOAP packages application payloads. Most of these issues apply to the use of an XML envelope, so this group will face these questions in the development of W3C XML Protocol. Since the group is in the requirements phase, I thought it best to make sure the issues are known, so that they can help feed requirements.

By presenting this I don't mean to give any statement of webMethods' position on either SOAP or XML Protocol. webMethods has a habit of supporting whatever protocols our clients need. My job as engineer is to make what it means to support a protocol as painless as possible to both webMethods developers and those clients of ours who are inclined to use the protocol.

You'll find that 8/18/2000 post duplicated below, in the <REPOST> tag. Afterwards, I address some of the common objections.

<REPOST>

I'm partly encountering and partly anticipating a number of issues related to how SOAP packages application payloads in XML documents. These issues primarily apply to the use of SOAP with application-level payloads and do not surface when SOAP is used strictly for RPC.
I'm providing the issues here to make them public and available for discussion, but I think it would be most effective to resolve the issues within a standards body that is generous enough to bring SOAP under its wing.

You'll find the issues listed below. Please feel free to provide corrections to any erroneous understanding I may have and suggestions for how to deal with some of these situations. Here you go:

<ISSUES>

(1) Infrastructure header data and application payload data may be put in the same XML document (the SOAP Envelope). When applications unintentionally dump non-wellformed XML payloads into this document, the entire document is non-wellformed. A robust server must protect itself from client errors and cannot trust clients to deliver only wellformed documents. Should an application or protocol choose to put payloads in the XML envelope, it seems that this would create a number of problems related to error handling:

(1a) Countless XML tools out there use parse-trees (instead of events) and may not be capable of representing just the wellformed portion of a document. Middleware based on such tools will not be able to act on the SOAP headers that would otherwise apply, even for some content errors that are not wellformedness errors. Likewise, recipient applications will not be able to engage in the error management required by those applications for application-level errors.

(1b) Even event-based XML tools suffer from not being able to reliably deliver application payloads that are wellformed but which follow a non-wellformed payload (within the body of the same XML envelope). Errors that may be recoverable at the application level are not given the opportunity to recover. Errors that may be ignored at the application level will not be ignored. For example, should the application payloads be semantically independent, as with an application-level batching mechanism, SOAP would create dependencies between them.
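
To make (1a) and (1b) concrete, here is a minimal Python sketch that is not part of the original post. The envelope, its element names, and the helpers `parse_as_tree` and `parse_as_events` are hypothetical, and the unnamespaced envelope is a simplification of a real SOAP message. It feeds one envelope, holding a non-wellformed payload followed by a wellformed one, to a tree-based parser and to an event-based parser.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Hypothetical envelope with two independent payloads in one body.
# The first payload has a mismatched end tag; the second is wellformed.
ENVELOPE = (
    "<Envelope>"
    "<Header><route>next-hop</route></Header>"
    "<Body>"
    "<order id='1'><item>widget</oops></order>"
    "<order id='2'><item>gadget</item></order>"
    "</Body>"
    "</Envelope>"
)


def parse_as_tree(text):
    """Tree-based parse: return the root element, or None on a fatal error."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


def parse_as_events(text):
    """Event-based parse: return (start tags seen before any error, ok flag)."""
    seen = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    try:
        parser.Parse(text, True)
        return seen, True
    except expat.ExpatError:
        return seen, False


tree = parse_as_tree(ENVELOPE)
events, ok = parse_as_events(ENVELOPE)
print("tree result:", tree)
print("events before the fatal error:", events, "completed:", ok)
```

Under these assumptions the tree-based parse yields nothing at all, so the wellformed header is lost along with the body, while the event-based parse reports the header and the start of the first order but never reaches the second order.
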

Issues (1a) and (1b) are a direct result of the XML 1.0 Specification's requirement that wellformedness errors be fatal errors within an XML processor. In particular, the specification says that "Once a fatal error is detected, however, the processor must not continue normal processing (i.e., it must not continue to pass character data and information about the document's logical structure to the application in the normal way)." To relate this to SOAP, whenever an application payload is placed into the envelope body, an assertion is made that the payload constitutes wellformed XML, and the SOAP envelope ends up creating dependencies among data that did not exist prior to enveloping the data. This violates the clear separation of layers that most protocols have.

(2) Performance is another issue that seems to surface when infrastructure and application data are put into the same XML document. Consider SOAP's requirement that when a header targeted for a given actor reaches that actor, the header must be consumed (usually meaning removed) from the document before the document may proceed. If the SOAP header is 4K in size and the SOAP body has a 1MB payload, you wouldn't want to parse the document, remove one element from the beginning of the document, and then regenerate the 1+MB document to forward to the next destination. Yet most tools will require this approach -- even most event-based tools. It is possible to create a more specialized parser that will provide a portion of the document as XML and still allow concatenating the remainder as text without going through the parsing process, but that's kind of a tall order for the everyday parser we hope will work with SOAP.

(3) SOAP does not sufficiently address the issues associated with mixing MIME-based packaging and XML-based packaging.
I have identified four distinct locations in a SOAP message where application XML-based data could be placed: in the envelope body, in the envelope header, as an immediate child of the envelope root, and as a MIME part following the envelope. Section 4.3.1 of SOAP 1.1 asserts a "semantic" equivalence between payloads and headers under certain circumstances, which is what allows an application payload to appear in the header. Since these two payload areas are semantically equivalent at the SOAP level, SOAP need not provide a distinction between these payloads at the application level. However, the question remains as to whether SOAP provides applications with the ability to distinguish among the remaining payload locations.

Should SOAP allow the application to know or select the envelope in which each payload is packaged? Is it reasonable to expose this packaging detail at the application level? Can an application distinguish between MIME headers and XML attributes (e.g. encodingStyle) when examining or specifying per-payload properties, or would applications also require an awareness of these distinctions?

The specification "SOAP Messages with Attachments," by John J. Barton (Hewlett Packard Labs) and Satish Thatte (Microsoft), does an excellent job of specifying how one puts MIME attachments in a SOAP message along with some of the interactions between payloads, but it does not address these sorts of envelope-transparency issues. (This specification was posted to http://discuss.develop.com/soap.html on July 7, 2000, but the attachment does not seem to be available from the archive.)

(4) Many RFCs and standards have been created to specify how one uses MIME headers to package MIME parts for a particular purpose.
Some specify document types, some specify character encodings, some specify that the data is encrypted, some specify the presence of signatures or certificates needed to interpret the message -- in general any information needed for middleware software to communicate payload-specific handling. If we put data in the body portion of the envelope, we forsake all the benefits available through tools that implement these standards. Is it reasonable to put payloads in one place only when those standards aren't needed and in the other when they are needed? SOAP adds the encodingStyle attribute to each payload. Won't we also sometimes need that with a MIME part?

</ISSUES>

Okay, assuming that my issues are substantially correct (which may be a completely false assumption), what benefits are there to allowing an XML packaging mechanism at all in SOAP for application payloads? Why not just define SOAP in a packaging-independent way and provide a binding for MIME (or its HTTP variant) for now, until the W3C produces a well-thought-out specification for XML packaging?

The only answer I can think of is a good one, but not one that stands up to the requirements of high-end B2B ecommerce: that life is easier for developers if they don't have to learn and work with yet one more technology -- MIME.

Joe Lapp
Principal Architect
webMethods, Inc.

P.S. This email was substantially critical of SOAP, or at least of SOAP's packaging of application payloads in XML, so I want to end by saying that SOAP 1.1 is the most impressive XML specification I've had the pleasure to read. I have referred many people to it to use as a model for their own XML specifications (even including some Microsofties). It is very easy to read, very easy to understand, and very succinct. I'm impressed with the extensibility mechanism created for the SOAP header -- I'm especially impressed that the designers understood that the X in XML does not by itself give them extensibility.
I'm most impressed with how little documentation is required to specify a working messaging protocol. ebXML and RosettaNet RNIF have much to learn from the SOAP and BizTalk specifications.

</REPOST>

Okay, let's look at some of the objections I have received:

(A) I say that these issues apply more to non-RPC uses of the SOAP protocol than to RPC uses, and one objection I've heard is that the SOAP envelope doesn't distinguish between the two, so that if the issues don't exist for one mode they shouldn't exist for the other.

I have a few responses to this "objection":

First, the issues stand by themselves, so evaluate for yourself whether or not they apply to RPC. The next two responses just explain why I bothered to make this assertion.

Second, the SOAP spec defines the RPC behavior but not the other application-specific behavior. To be SOAP-compliant requires conformance with the SOAP spec, so by definition, RPC should behave well between two SOAP-compliant nodes. If the nodes aren't SOAP compliant, then you wouldn't necessarily expect communication anyway.

Finally, the RPC semantics and marshalling can be made independent of the application either by using an introspective language like Java or by using a single tool for generating the stubs of all (or most) applications. There will be far more applications than infrastructure software and stub-generators, so it will be easier to ensure that the infrastructure and stub-generators behave well, but near impossible to ensure that all applications behave well.

(B) Another objection: The software infrastructure will be responsible for transmitting wellformed XML. Most of the XML will therefore be wellformed, and these issues will not arise frequently enough to bother addressing them.

Response: Note that this is only an objection to point (1) and makes no statement regarding points (2) through (4). I believe that this objection makes a false assumption.
First, I doubt that everybody will be using the same robust implementations of SOAP (or XML Protocol). There will always be the hacker types and those who think they have value-add to give. But even if we grant this scenario -- that everybody uses proven implementations -- we still have another problem. Should an application ever create the XML that needs to be delivered, the protocol stack would have to reparse that XML on the client-side before sending it, just to provide the robustness guarantee. That's another performance issue. I seriously doubt that all protocol clients are going to enforce the reliability of the XML.

(C) Another objection: We will have a better world if the server infrastructure enforces wellformedness on behalf of the applications. This reduces the work of the applications, and it pushes error detection and error handling closer to its source.

Response: I agree in principle with this objection, but I disagree that XML allows us to do this properly, and I disagree that all protocol stacks should be required to provide this enforcement.

XML does not allow us to do this properly because the first wellformedness error is officially required to be a fatal error, by the XML 1.0 spec itself. There is no standard handling for fatal errors -- no standard way to identify the kind of error or even a requirement that the kind of error be identifiable. When such an error occurs, many XML parsers will force the recipient to just reject the message. The recipient may not be able to log the message headers, even if they are wellformed (as I assume) or inform the target application, should the target application need to engage in error management. If the message contains a batch of independent commands (I don't mean independent messages), the application won't be able to handle the wellformed ones independently.

(D) Another objection: An XML envelope gives us the advantage of extensibility that we wouldn't otherwise have.

Response: I see this "extensibility" as extensibility of the headers, not of the payloads. MIME is pretty flexible with its payloads, allowing hierarchy and arbitrary content. The issue is attributing the message as a whole and attributing the individual payloads. One could address the extensibility of the message headers by having an XML document represent the headers. This is done in ebXML and RNIF (RosettaNet). One could address the extensibility of the payload headers through an XML manifest or an optional XML attachment that may prefix any given payload. I'm sure other possibilities exist as well.

(E) Another objection: The W3C decided not to do XML packaging, but XML packaging confers enough benefits that it ought to be addressed to some extent, even if only minimally. XML Protocol is the right place for that. Benefits include the ability to apply XML tools to the packaged message; MIME tools do not exist in such diversity.

Response: The tools argument works for me to some degree, but in my mind the negatives outweigh the positives. XML packaging (of the sort being discussed for a protocol) needs a lot of thought and can't be considered without first understanding the negatives.

Also, I don't find it relevant to our discussion that the W3C decided to abandon XML packaging. I don't really know what that working group meant by "XML packaging." The issue is that SOAP (and perhaps XML Protocol) is defining an XML envelope for packaging application payloads, even if only the XML payloads. The term 'packaging' may be overloaded, so let's focus on the semantics rather than the words.

And since, as so far specced, the XML envelope can only handle the XML payloads, it seems that we aren't simply making message packages more amenable for use by XML tools. We are now requiring that both MIME tools and XML tools be present. This is more onerous than requiring the presence and use of just one tool set (or API set).

======

Okay, enough of that.
If you got this far, thank you very much for giving me your time. I'll live with whatever the working group comes up with; I just wanted to make sure that the issues are heard and known and that the decisions made are fully educated ones.

Joe Lapp
Principal Architect
webMethods, Inc.

P.S. Randy Waldrop is our official working group member and will be formally representing webMethods' interests. I don't plan on participating in this discussion (too much else to do), and Randy is free to defend or attack or ignore these issues as he pleases. This group has a huge amount of expertise, and I trust that you will make appropriate use of these points, whatever use that may be.
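
As a rough illustration of the MIME-based alternative raised in the message above, the following Python sketch carries the headers and each application payload as separate MIME parts, so that a wellformedness error in one payload does not prevent the recipient from reading the headers or the other payloads. It is not taken from the post or from "SOAP Messages with Attachments"; the helpers `build_message` and `read_message`, the part names, and the simplified Content-ID values are all hypothetical.

```python
import xml.etree.ElementTree as ET
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_message(header_xml, payloads):
    """Package a header document and named payloads as separate MIME parts."""
    msg = MIMEMultipart("related")
    for name, xml_text in [("headers", header_xml)] + list(payloads):
        part = MIMEApplication(xml_text.encode("utf-8"), "xml")
        # Simplified, hypothetical identifiers; real Content-IDs are usually
        # globally unique values.
        part.add_header("Content-ID", f"<{name}>")
        msg.attach(part)
    return msg.as_bytes()


def read_message(raw):
    """Parse each part on its own; a bad part maps to None, others survive."""
    results = {}
    for part in message_from_bytes(raw).get_payload():
        name = part["Content-ID"].strip("<>")
        try:
            results[name] = ET.fromstring(part.get_payload(decode=True))
        except ET.ParseError:
            results[name] = None
    return results


raw = build_message(
    "<headers><route>next-hop</route></headers>",
    [
        ("order-1", "<order id='1'><item>widget</oops></order>"),  # malformed
        ("order-2", "<order id='2'><item>gadget</item></order>"),  # wellformed
    ],
)
parsed = read_message(raw)
for name, root in parsed.items():
    print(name, "->", "parse error" if root is None else root.tag)
```

In this sketch the error in the first order stays confined to its own part: the headers and the second order still parse, which is the layer separation that issue (1) says is lost when every payload shares one XML envelope.
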
Received on Wednesday, 18 October 2000 17:57:33 UTC