- From: Soltysik, Seumas <Seumas.Soltysik@iona.com>
- Date: Tue, 30 Sep 2003 12:45:41 -0400
- To: <noah_mendelsohn@us.ibm.com>, "J. Barton, John" <John_Barton@hpl.hp.com>
- Cc: "Jacek Kopecky" <jacek.kopecky@systinet.com>, "Mark Nottingham" <mark.nottingham@bea.com>, "XMLP Dist App" <xml-dist-app@w3.org>, <xml-dist-app-request@w3.org>
Hi All,

It seems to me that the whole way of thinking about this streaming issue is somewhat backwards. Rather than trying to figure out how to integrate some kind of streaming solution into the SOAP w/attachments framework, perhaps we should focus on encouraging more of a REST philosophy. In cases where a client wants access to large chunks of binary data, we should encourage a usage pattern whereby the client receives a URI to the data as opposed to the actual data itself. This would provide maximum flexibility to the client and would make the streaming issue somewhat moot, in that it would be up to the client how and when to access the data. One could almost argue that if we think this is the correct paradigm to push when accessing binary data, then there is really no need for the SOAP with Attachments specification.

The one scenario where SOAP w/attachments still seems to make sense is when you have a lightweight client that is trying to push binary data to another SOAP node, possibly another lightweight client or a server for storage. In this scenario it is not realistic for a client such as a digital camera to serve as both client and server by pushing a URI and then serving up the data at some later time.

It seems to me that we should focus the SOAP w/attachments spec on use cases that involve lightweight clients pushing data to other SOAP nodes, and use a REST philosophy when dealing with clients requesting binary data from servers.

Regards,
Seumas

-----Original Message-----
From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
Sent: Sunday, September 28, 2003 12:58 PM
To: J. Barton, John
Cc: Jacek Kopecky; Mark Nottingham; XMLP Dist App; xml-dist-app-request@w3.org
Subject: Re: XMLP-UC-6 reformulation - simple streaming use case

John Barton writes:

>> Thanks for your detailed and thoughtful reply.
>> I'll rearrange what you said and add some
>> stuff...hopefully it will help ;-)

Thank you.
Yes, I think it does help. While I don't necessarily agree with (or in a few cases understand) every nuance of what you've written below, I think it's overall consistent with the sort of analysis I think we have to do to justify any support of streaming...and indeed, that was my main point. There are lots of potentially important use cases, but plenty of users ready to say "surely this is simple: if you just bake in support for my use case we'll be all set." I think we should either skip streaming in this round as not making an 80/20 cut, or we should put some energy into getting consensus on the range of use cases likely to be of interest over time. I think your note below very much contributes to that discussion, as I hope mine did.

Having done such a use case analysis, I think we can decide how much if any support to put into each of the three layers of MTOM. As I said earlier, there may be value in making sure that the abstract model does as little as practical to preclude streaming of various sorts, even if we decide that our initial binding supports a smaller set of scenarios (if any).

Thank you!

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

"John J. Barton" <John_Barton@hpl.hp.com>
Sent by: xml-dist-app-request@w3.org
09/25/2003 12:54 PM

To: Noah Mendelsohn/Cambridge/IBM@Lotus
cc: Jacek Kopecky <jacek.kopecky@systinet.com>, Mark Nottingham <mark.nottingham@bea.com>, XMLP Dist App <xml-dist-app@w3.org>
Subject: Re: XMLP-UC-6 reformulation - simple streaming use case

Noah,

Thanks for your detailed and thoughtful reply. I'll rearrange what you said and add some stuff...hopefully it will help ;-)

There seem to be four related issues:

1) senders that can, can't, or won't count bytes.
2) 0, 1, or more than one binary attachment.
3) incremental vs batch processing of the message
4) spatial relationship between SOAP and the attachments or among the latter.

Can we understand how these interact? Let's look at counting bytes and number of attachments by example:

PrintAPhoto: 1 binary data, can count bytes.
StereoXRay: Multiple binary data, can count bytes.
LazyPrintServer: Multiple binary data, won't count bytes.
Internet Radio: Exactly one binary data, can't count bytes.
Internet Multimedia: >1 binary data, can't count bytes.

Then let's ask "Can we process these incrementally?"

PrintAPhoto: yes, if I know before I get the image bits where they need to be rendered.
StereoXRay: yes, if I know before I get the image bits which database will receive them.
LazyPrintServer: no, I cannot decide if the job is possible until it dies.
Internet Radio: yes, if I know before I get the audio that I am going to decode frames and emit them.
Internet Multimedia: yes, as for radio if the binary is interleaved.

From these examples we observe that incremental processing depends on message structure: if we put the processing commands and sizes in front and allow the server to interleave content, we can cover a lot of ground. I believe that the first bit is what Noah means by putting the envelope first. Once we do that, interleaving content is easy.

There are two more issues that complicate this picture:

5) Digital signatures,
6) embedded XML + validation.

Obviously any operation that must be performed over the entire message before processing prevents incremental processing. In searching out the 80/20 spot, I believe we should avoid solutions that insist on whole-message preprocessing.

John.

At 05:30 PM 9/23/2003 -0400, noah_mendelsohn@us.ibm.com wrote:
>John, let me try and respond to the various sections of your note:
>
>John Barton writes:
>
> > Noah,
> >
> > Unfortunately I am once again confused by the use of
> > the word "streaming". Maybe I missed a clarification
> > sometime back?
> > Mark's formulation might be incomplete
> > but at least I understand its terms ;-).
>
>I use "streaming" to refer to the broad range of scenarios in which a
>sender and/or receiver needs to prepare or process the message
>incrementally. In other words, any alternative to the situation in which
>the entire message can be buffered both before sending and prior to start
>of processing following receipt. In the general case for large messages,
>such streaming allows for overlap of sender and receiver processing of the
>same message, though such overlap is not required and may only be achieved
>in some cases. While there are probably more formal definitions out
>there, I think this is consistent with general usage in the industry.
>
>So, my use of the word involves a potentially broad range of use cases
>including but not limited to situations in which the XML SOAP envelope
>itself is very large, some sort of attachment is very large, where there
>is more than one large attachment (e.g. a video and an audio stream to be
>sent in parallel as generated, though I am not pushing hard on issues of
>isochrony here), situations such as satellite transmission in which there
>is value in overlapping processing at sender and receiver, etc. I believe
>that analogs of each of these scenarios have proven crucial at one time or
>another with earlier messaging systems.
>
> > If I look around eg W3C almost all the uses of the term
> > "streaming" are for audio and video. I did see this
> > however:
> >
> > > SteveS: not having to download the whole package
> > before unpacking part of it--streaming. Is that the
> > meaning of "streaming" in this context? If so, then it
> > is exactly what we need to make some of the use cases
> > feasible.
>
>I was not referring to any particular W3C characterization of streaming,
>but to the broad range of behaviors that people may at least think they
>want to see for SOAP in one context or another.
>We don't have to support
>them all, but I think we have to consider many and choose carefully.
>
> >
> > I also found your second paragraph confusing. Let me
> > try to pick this apart:
> >
> > >* The HTTP binding provided with MTOM either
> > > (a) need not be optimized for
> > >streaming
> >
> > This reads like a non-requirement to me: why list the
> > things the binding is not optimized for? Well maybe the
> > OR case is the one I want...
>
>Mea culpa, it is of course a non-requirement. What I meant was: let me
>offer two alternative formulations for consideration by the workgroup.
>
>(a) While we may agree on the desirability of having an abstract model
>that facilitates streaming when the binding wishes to do so, let's keep
>our initial MTOM HTTP binding simple. It's not clear to me that we
>understand the requirements well enough for streaming to choose well, so
>let's keep it simple, as was done for SOAP 1.2. In other words, let's not
>require ourselves to produce a streaming binding in association with this
>version of MTOM. Of course, MTOM like SOAP allows you to create your own
>bindings, and those might indeed facilitate streaming.
>
>That's option (a) for consideration. The alternative I proposed was (b):
>
> > > or ( b) SHOULD provide for accessibility to
> > > non-optimized envelope information ahead
> > > of the serializations of large binary objects
> >
> > Well I think I understand this one: you are going to
> > tell me the size of stuff before you send it: I like
> > it.
>
>No, that's not what it said, though that is indeed an interesting design
>point for yet another set of use cases. What this one said is: make sure
>that the non-optimized >envelope< comes first. I.e. MTOM allows you to
>optimize parts of the envelope by taking them out of line and replacing
>them with xbinc:include. I was informally referring to the result of that
>as the "unoptimized" (part of) the envelope.
>In other words, you get the
>complete <soap:envelope> and all its children before any of the binary
>parts. That represents a form of streaming, insofar as it allows both
>sender and receiver to deal with the envelope before sending/receiving the
>so-called attachments.
>
>FWIW: requiring a length at the head of a message segment tends to move
>streaming headaches from the receiver to the sender, at least in the case
>where the sender itself does not know the length of the data in advance. I
>think there are 2 or 3 use cases hidden in this area: you want to make
>life easy for the receiver, and the sender happens to know the length; you
>want to make life easy for the receiver even if the sender has to buffer a
>gigabyte to determine the length; you want to make life easy for the
>sender, so you make no requirement to send a length ahead of the data.
>Again, I think that all of these are legitimate design points for one use
>case or another. Indeed, it's the range of such requirements that
>suggests to me that we should go slow on adding streaming features.
>
> > >and SHOULD
> > >further provide for streaming in the case that only one large object has
> > >been optimized
> >
> > Huh? Why one? and anyway what is streaming?
>
>Well, this was an attempt to find an 80/20 point for those who have, say,
>a large XRay file as a GIF or JPEG, and want to stream that as well as the
>envelope. By stream I mean, be able to send out some of the bytes of the
>XRay before all of them are available at the sender and/or to be able to
>begin processing of the first few raster lines at the receiver before the
>whole thing is received (and perhaps before the sender has even sent the
>tail.) Consider, for example, the case where some scanning sensor is
>sending out the raster lines for the XRay as they become available, and we
>are sending them out in a SOAP message in parallel with the scanning of
>additional lines.
>
>Why one object only?
>Because I can see straightforward implementations of
>that. If there are two xrays streaming in parallel off two scanners
>(stereo image?), and you don't want to wait for all of the first one
>before you can make progress on the second, then you are in the business
>of interleaving them. That's going to be important for some use cases
>someday, but I was making the suggestion that interleaving might not make
>an 80/20 cut for a SOAP binding in the next few months.
>
> >If you
> > tell me enough information ahead of the bits, then
> > either I can accept your TCP/IP packets or refuse them.
> > Given that we are in HTTP these are the only two things
> > I can do right? I'd rather read something like:
>
>I think it depends on the level you're thinking about. At some level, all
>of TCP/IP streams (in the sense I mean) because it comes in one packet at
>a time, and you can always try to finish with one before accepting (or
>sending) the next. The question is whether that's realistic at the next
>level up. To be perfectly rigorous, you can't for example process the
>start of a SOAP envelope without seeing the end, because you don't even
>know whether it's well-formed until you see the end tag for
></soap:envelope>. If that doesn't show up in the right place, you've got
>no Infoset, and no Envelope, therefore "no SOAP" (pun intended.) XML
>doesn't stream, in this sense, and SOAP uses XML (modulo the permission to
>use optimistic concurrency and roll back all side effects once you
>discover that the envelope is poorly formed.) Of course, many
>implementations will start work early, and will indeed roll back when the
>message proves to be not well formed. Still, I think you'd be making a
>mistake to do a database commit based on a SOAP message until you'd seen
>the end tags.
>
>Similarly, if I want SOAP to be robust enough to make progress on 2 or 3
>large streaming attachments to the same message in parallel, then I can't
>just argue at the IP level.
>I've got to look to Multipart MIME, DIME, or
>some level that will allow me to express the interleaving of those
>streams. I think that's a very important use case for someday, but I'm
>proposing we not "go there" for now.
>
> >
> > ______________________________________________________
> > John J. Barton email: John_Barton@hpl.hp.com
> > http://www.hpl.hp.com/personal/John_Barton/index.htm
> > MS 1U-17 Hewlett-Packard Labs
> > 1501 Page Mill Road phone: (650)-236-2888
> > Palo Alto CA 94304-1126 FAX: (650)-857-5100
>
>Thanks for your patience. Hope this is helpful.
>
>Noah
>
>------------------------------------------------------------------
>Noah Mendelsohn Voice: 1-617-693-4036
>IBM Corporation Fax: 1-617-693-8676
>One Rogers Street
>Cambridge, MA 02142
>------------------------------------------------------------------

______________________________________________________
John J. Barton email: John_Barton@hpl.hp.com
http://www.hpl.hp.com/personal/John_Barton/index.htm
MS 1U-17 Hewlett-Packard Labs
1501 Page Mill Road phone: (650)-236-2888
Palo Alto CA 94304-1126 FAX: (650)-857-5100
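
The framing ideas discussed in this thread (envelope first, a length ahead of each piece of data, and interleaving of parts whose total size the sender cannot count in advance) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the wire format, the names write_message and read_message, and the END_OF_MESSAGE marker are all hypothetical. It is not MTOM, MIME multipart, or DIME.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, Tuple

# Hypothetical wire format, invented for this sketch only:
#   4-byte big-endian envelope length, then the envelope bytes,
#   then frames of: 2-byte part id, 4-byte chunk length, chunk bytes.
#   A zero-length chunk closes a part; part id 0xFFFF ends the message.
END_OF_MESSAGE = 0xFFFF


def write_message(out: BinaryIO, envelope: bytes,
                  chunks: Iterable[Tuple[int, bytes]]) -> None:
    """Write the envelope first, then chunks in the order they arrive."""
    out.write(struct.pack(">I", len(envelope)))
    out.write(envelope)
    open_parts = set()
    for part_id, data in chunks:
        if not data:
            continue  # an empty chunk is reserved as the end-of-part marker
        open_parts.add(part_id)
        out.write(struct.pack(">HI", part_id, len(data)))
        out.write(data)
    for part_id in sorted(open_parts):
        out.write(struct.pack(">HI", part_id, 0))  # close each part
    out.write(struct.pack(">HI", END_OF_MESSAGE, 0))


def _read_exact(src: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes or fail; a short read means a truncated message."""
    data = src.read(n)
    if len(data) != n:
        raise EOFError("truncated message")
    return data


def read_message(src: BinaryIO) -> Tuple[bytes, Iterator[Tuple[int, bytes]]]:
    """Return the envelope at once, plus a lazy iterator over the frames."""
    (env_len,) = struct.unpack(">I", _read_exact(src, 4))
    envelope = _read_exact(src, env_len)

    def frames() -> Iterator[Tuple[int, bytes]]:
        while True:
            part_id, length = struct.unpack(">HI", _read_exact(src, 6))
            if part_id == END_OF_MESSAGE:
                return
            yield part_id, _read_exact(src, length)

    return envelope, frames()


if __name__ == "__main__":
    buf = io.BytesIO()
    # Two parts (say, the two images of a stereo X-ray) interleaved.
    write_message(buf, b"<soap:Envelope>...</soap:Envelope>",
                  [(1, b"left-0"), (2, b"right-0"), (1, b"left-1")])
    buf.seek(0)
    envelope, frames = read_message(buf)
    print(envelope)
    for part_id, data in frames:
        print(part_id, data)
```

In this sketch a receiver can act on the envelope as soon as read_message returns and then consume chunks lazily. Because each chunk carries its own length, the sender never needs to know the total size of a part in advance, which covers the "can't count bytes" cases at the cost of a small per-chunk header.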
Received on Tuesday, 30 September 2003 12:51:32 UTC