
RE: XMLP-UC-6 reformulation - simple streaming use case

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 13 Oct 2003 14:06:54 -0400
To: Seumas.Soltysik@iona.com
Cc: "Jacek Kopecky" <jacek.kopecky@systinet.com>, "J. Barton, John" <John_Barton@hpl.hp.com>, "Mark Nottingham" <mark.nottingham@bea.com>, "XMLP Dist App" <xml-dist-app@w3.org>
Message-ID: <OF0D8B98EB.BCDD679C-ON85256DBE.0062E850@lotus.com>

I thought I had replied to this note, but apparently not.   My apologies 
for the delay.

I am a fan of REST, and indeed MTOM (like the rest of SOAP...pun intended) 
can be used in a RESTful manner if desired.  For example, one could do a 
SOAP request with WebMethod GET for my picture, and receive back from the 
HTTP GET (presuming the http binding) an MTOM-optimized SOAP envelope with 
a media-type of something along the lines of application/mtom+soap+xml 
(note that we are still debating the use of media types for MTOM, but this 
is where I'd like to land.)
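
A minimal sketch of what such a request might look like from the client side, in Python.  The URL is made up, and the media type is only the candidate name floated above, not anything registered.  The request is built but never sent.

```python
from urllib.request import Request

# Assumptions for illustration only: the URL is hypothetical and the media
# type is the candidate name from the discussion, not a registered type.
PICTURE_URL = "http://example.org/people/noah/picture"
MTOM_MEDIA_TYPE = "application/mtom+soap+xml"


def build_picture_request(url: str = PICTURE_URL) -> Request:
    """Build (but do not send) a GET asking for an MTOM-optimized envelope."""
    return Request(url, headers={"Accept": MTOM_MEDIA_TYPE}, method="GET")


req = build_picture_request()
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Because the request is a plain GET on a URI, caches and intermediaries can treat it like any other retrieval; only the response body is MTOM-specific.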

On the other hand, MTOM is specifically designed for the cases where you 
want the binary objects to travel with the message.  There are a variety 
of reasons for this, including disconnected scenarios (your PDA is going 
out of radio range), scenarios in which messages go to nodes without web 
access (let's say a bank wants to record the image of my signature with a 
transaction...it's not at all clear the bank's database system can afford 
to make a call back to do a GET from some point of sale terminal where my 
signature was logged.) Etc.  Indeed, all of this can be done in a RESTful 
way, insofar as the whole MTOM message is a perfectly reasonable POST to 
update the banking system resource.  I'm not convinced that requiring 
every data source to act as a server that accepts GET callbacks, and that 
persists the data as long as such callbacks might come, is practical. MTOM 
is for the cases where you want the data to be part of the message.
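
As a rough illustration of "the data is part of the message", here is a Python sketch that packages an envelope and a signature image into one multipart/related body suitable for a single POST.  This is not the MTOM wire format: the xbinc namespace URI, the Content-IDs and the image bytes are placeholders, and the standard email package base64-encodes the binary part by default, where a real binding would presumably send raw octets.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

# Placeholder namespace for xbinc; the real URI is not given in this thread.
ENVELOPE = (
    b'<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"'
    b' xmlns:xbinc="urn:example:xbinc">'
    b"<soap:Body><signature>"
    b'<xbinc:include href="cid:signature@example.org"/>'
    b"</signature></soap:Body></soap:Envelope>"
)
SIGNATURE_IMAGE = b"\x89PNG-not-a-real-image"  # stand-in for the scanned signature


def package(envelope_xml: bytes, image: bytes) -> MIMEMultipart:
    """Put the envelope first and the binary object after it, in one message."""
    msg = MIMEMultipart("related")
    root = MIMEApplication(envelope_xml, "soap+xml")
    root.add_header("Content-ID", "<envelope@example.org>")
    blob = MIMEApplication(image, "octet-stream")  # base64 by default
    blob.add_header("Content-ID", "<signature@example.org>")
    msg.attach(root)  # envelope travels ahead of the binary part
    msg.attach(blob)
    return msg


msg = package(ENVELOPE, SIGNATURE_IMAGE)
print([part.get_content_type() for part in msg.get_payload()])
```

The receiver never has to call back to the point of sale terminal: everything the envelope refers to arrives in the same body.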

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------







"Soltysik, Seumas" <Seumas.Soltysik@iona.com>
09/30/2003 12:45 PM

 
        To:     Noah Mendelsohn/Cambridge/IBM@Lotus, "J. Barton, John" 
<John_Barton@hpl.hp.com>
        cc:     "Jacek Kopecky" <jacek.kopecky@systinet.com>, "Mark Nottingham" 
<mark.nottingham@bea.com>, "XMLP Dist App" <xml-dist-app@w3.org>, 
<xml-dist-app-request@w3.org>
        Subject:        RE: XMLP-UC-6 reformulation - simple streaming use case


Hi All,
It seems to me that the whole way of thinking about this streaming issue 
is somewhat backwards. As opposed to trying to figure out how to integrate 
some kind of streaming solution into the SOAP w/attachments framework, 
perhaps we should focus on encouraging more of a REST philosophy.

It would seem that in cases where a client wants access to large chunks 
of binary data, we should encourage a usage pattern whereby a user 
receives a URI to the data as opposed to the actual data itself. This 
would provide maximum flexibility to the client and would make the 
streaming issue somewhat moot in that it would be up to the client how and 
when they wanted to access the data. One could almost argue that if we 
think that this is the correct paradigm to push when accessing binary 
data, then there is really no need for the SOAP with Attachments 
specification. 

The one scenario where SOAP w/attachments still seems to make sense is 
when you have a lightweight client that is trying to push binary data to 
another SOAP node, possibly another lightweight client or a server for 
storage. In this scenario it is not realistic for such a client as a 
digital camera to serve as both client and server by pushing a URI and 
then serving up the data at some later time period. It seems to me that we 
should focus the SOAP w/attachment spec on use cases that involve 
lightweight clients pushing data to other SOAP nodes and use a REST 
philosophy when dealing with clients requesting binary data from servers.

Regards,
Seumas

-----Original Message-----
From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
Sent: Sunday, September 28, 2003 12:58 PM
To: J. Barton, John
Cc: Jacek Kopecky; Mark Nottingham; XMLP Dist App;
xml-dist-app-request@w3.org
Subject: Re: XMLP-UC-6 reformulation - simple streaming use case



John Barton writes:

>>  Thanks for your detailed and thoughtful reply. 
>> I'll rearrange what you said and add some 
>> stuff...hopefully it will help ;-)

Thank you.  Yes, I think it does help.  While I don't necessarily agree 
with (or in a few cases understand) every nuance of what you've written 
below, I think it's overall consistent with the sort of analysis I think 
we have to do to justify any support of streaming...and indeed, that was 
my main point.  There are lots of potentially important use cases, but 
plenty of users ready to say "surely this is simple:  if you just bake in 
support for my use case we'll be all set."  I think we should either skip 
streaming in this round as not making an 80/20 cut, or we should put some 
energy into getting consensus on the range of use cases likely to be of 
interest over time.  I think your note below very much contributes to that 
discussion, as I hope mine did. 
Having done such a use case analysis, I think we can decide how much, if 
any, support to put into each of the three layers of MTOM.  As I said 
earlier, there may be value in making sure that the abstract model does as 
little as practical to preclude streaming of various sorts, even if we 
decide that our initial binding supports a smaller set of scenarios (if 
any). 

Thank you!

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------







"John J. Barton" <John_Barton@hpl.hp.com>
Sent by: xml-dist-app-request@w3.org
09/25/2003 12:54 PM

 
        To:     Noah Mendelsohn/Cambridge/IBM@Lotus
        cc:     Jacek Kopecky <jacek.kopecky@systinet.com>, Mark Nottingham 
<mark.nottingham@bea.com>, XMLP Dist App <xml-dist-app@w3.org>
        Subject:        Re: XMLP-UC-6 reformulation - simple streaming use case



Noah,

   Thanks for your detailed and thoughtful reply.  I'll rearrange what
you said and add some stuff...hopefully it will help ;-)

   There seem to be four related issues:
      1) senders that can, can't, or won't count bytes.
      2) 0, 1, or more than one binary attachment.
      3) incremental vs batch processing of the message
      4) spatial relationship between SOAP and the attachments or
         among the latter.
Can we understand how these interact?

Let's look at counting bytes and number of attachments by example:
   PrintAPhoto: 1 binary data, can count bytes.
   StereoXRay: Multiple binary data, can count bytes.
   LazyPrintServer: Multiple binary data, won't count bytes.
   Internet Radio: Exactly one binary data, can't count bytes.
   Internet Multimedia: >1 binary data, can't count bytes.

Then let's ask "Can we process these incrementally"?
   PrintAPhoto: yes, if I know before I get the image bits where they
        need to be rendered.
   StereoXRay: yes, if I know before I get the image bits which database
        will receive them.
   LazyPrintServer: no, I cannot decide if the job is possible until it
        dies.
   Internet Radio: yes, if I know before I get the audio that I am going
       to decode frames and emit them.
   Internet Multimedia: yes, as for radio if the binary is interleaved.

 From these examples we observe that incremental processing
depends on message structure: if we put the processing commands
and sizes in front and allow the server to interleave content, we can
cover a lot of ground.  I believe that the first bit is what Noah means
by putting the envelope first. Once we do that, interleaving content is
easy.
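
To make "commands and sizes in front, then interleaved content" concrete, here is a toy framing in Python.  It is purely an illustration and not a proposal for the wire format: the frame layout (one-byte part id, four-byte length), the JSON header and the part numbers are all invented for the sketch, and it only covers senders that can count bytes.

```python
import io
import json
import struct
from typing import BinaryIO, Dict, Iterator, Tuple

# Invented frame layout: 1-byte part id (0 = header), 4-byte payload length.
FRAME = struct.Struct(">BI")


def write_frame(out: BinaryIO, part_id: int, payload: bytes) -> None:
    out.write(FRAME.pack(part_id, len(payload)))
    out.write(payload)


def read_frames(src: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    while True:
        head = src.read(FRAME.size)
        if not head:
            return
        part_id, length = FRAME.unpack(head)
        yield part_id, src.read(length)


def receive(src: BinaryIO) -> Dict[int, int]:
    """Read the header first, then handle each chunk as soon as it arrives."""
    frames = read_frames(src)
    part_id, payload = next(frames)
    if part_id != 0:
        raise ValueError("processing commands and sizes must come first")
    expected = {int(k): v for k, v in json.loads(payload)["sizes"].items()}
    seen = {pid: 0 for pid in expected}
    for part_id, chunk in frames:
        seen[part_id] += len(chunk)  # stand-in for render / store / decode
    if seen != expected:
        raise ValueError("received sizes do not match the header")
    return seen


# Two binary parts (say, the two StereoXRay images) interleaved on the wire.
wire = io.BytesIO()
header = {"command": "store", "sizes": {"1": 6, "2": 4}}
write_frame(wire, 0, json.dumps(header).encode())
for pid, chunk in [(1, b"abc"), (2, b"xy"), (1, b"def"), (2, b"zw")]:
    write_frame(wire, pid, chunk)
wire.seek(0)
totals = receive(wire)
print(totals)
```

The receiver never holds more than one chunk at a time, which is only possible because it learned what to do with each part before any binary content arrived.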

There are two more issues that complicate this picture:
    5) Digital signatures,
    6) embedded XML + validation.
Obviously any operation that must be performed over the entire
message before processing prevents incremental processing.
In searching out the 80/20 spot, I believe we should avoid solutions
that insist on whole-message preprocessing.
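
A small Python sketch of why a whole-message check gets in the way.  An HMAC stands in for a real XML digital signature here, and the key and chunk contents are invented.  The digest can be computed chunk by chunk, but the verdict only exists after the last byte, so nothing can be released before then.

```python
import hashlib
import hmac
from typing import Iterable, List

KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def sign(chunks: Iterable[bytes]) -> bytes:
    """Compute a tag over the whole message, one chunk at a time."""
    mac = hmac.new(KEY, digestmod=hashlib.sha256)
    for chunk in chunks:
        mac.update(chunk)
    return mac.digest()


def receive_signed(chunks: Iterable[bytes], tag: bytes) -> List[bytes]:
    """Stage every chunk; release them only if the final tag matches."""
    mac = hmac.new(KEY, digestmod=hashlib.sha256)
    staged: List[bytes] = []
    for chunk in chunks:
        mac.update(chunk)      # the digest itself is incremental
        staged.append(chunk)   # but the data must be held back until the end
    if not hmac.compare_digest(mac.digest(), tag):
        return []              # verification failed: discard everything
    return staged


message = [b"raster-1", b"raster-2"]
tag = sign(message)
print(receive_signed(message, tag))
print(receive_signed([b"raster-1", b"tampered"], tag))
```

The staging list is exactly the buffering that incremental processing was supposed to avoid.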

John.


At 05:30 PM 9/23/2003 -0400, noah_mendelsohn@us.ibm.com wrote:

>John, let me try and respond to the various sections of your note:
>
>John Barton writes:
>
> > Noah,
> >
> > Unfortunately I am once again confused by the use of
> > the word "streaming".  Maybe I missed a clarification
> > sometime back?  Mark's formulation might be incomplete
> > but at least I understand its terms ;-).
>
>I use "streaming"  to refer to the broad range of scenarios in which a
>sender and/or receiver needs to prepare or process the message
>incrementally.  In other words, any alternative to the situation in which
>the entire message can be buffered both before sending and prior to start
>of processing following receipt.  In the general case for large messages,
>such streaming allows for overlap of sender and receiver processing of the
>same message, though such overlap is not required and may only be achieved
>in some cases.  While there are probably more formal definitions out
>there, I think this is consistent with general usage in the industry.
>
>So, my use of the word involves a potentially broad range of use cases
>including but not limited to situations in which the XML SOAP envelope
>itself is very large, some sort of attachment is very large, where there
>is more than one large attachment (e.g. a video and an audio stream to be
>sent in parallel as generated, though I am not pushing hard on issues of
>isochrony here), situations such as satellite transmission in which there
>is value in overlapping processing at sender and receiver, etc.  I believe
>that analogs of each of these scenarios have proven crucial at one time or
>another with earlier messaging systems.
>
> > If I look around eg W3C almost all the uses of the term
> > "streaming" are for audio and video.  I did see this
> > however:
> >
> > > SteveS: not having to download the whole package
> > before unpacking part of it--streaming.  Is that the
> > meaning of "streaming" in this context?  If so, then it
> > is exactly what we need to make some of the use cases
> > feasible.
>
>I was not referring to any particular W3C characterization of streaming,
>but to the broad range of behaviors that people may at least think they
>want to see for SOAP in one context or another.  We don't have to support
>them all, but I think we have to consider many and choose carefully.
>
>
> > I also found your second paragraph confusing. Let me
> > try to pick this apart:
> >
> >  >* The HTTP binding provided with MTOM either
> >  > (a) need not be optimized for
> >  >streaming
> >
> > This reads like a non-requirement to me: why list the
> > things the binding is not optimized for? Well maybe the
> > OR case is the one I want...
>
>Mea culpa, it is of course a non-requirement.  What I meant was:  let me
>offer two alternative formulations for consideration by the workgroup.
>
>(a) While we may agree on the desirability of having an abstract model
>that facilitates streaming when the binding wishes to do so, let's keep
>our initial MTOM HTTP binding simple.  It's not clear to me that we
>understand the requirements well enough for streaming to choose well, so
>let's keep it simple, as was done for SOAP 1.2.  In other words, let's not
>require ourselves to produce a streaming binding in association with this
>version of MTOM.  Of course, MTOM like SOAP allows you to create your own
>bindings, and those might indeed facilitate streaming.
>
>That's option (a) for consideration.  The alternative I proposed was (b):
>
> >  > or (b) SHOULD provide for accessibility to
> >  > non-optimized envelope information ahead
> >  > of the serializations of large binary objects
> >
> > Well I think I understand this one: you are going to
> > tell me the size of stuff before you send it: I like
> > it.
>
>No, that's not what it said, though that is indeed an interesting design
>point for yet another set of use cases.  What this one said is:  make sure
>that the non-optimized >envelope< comes first.  I.e. MTOM allows you to
>optimize parts of the envelope by taking them out of line and replacing
>them with xbinc:include.  I was informally referring to the result of that
>as the "unoptimized" (part of) the envelope.  In other words, you get the
>complete <soap:envelope> and all its children before any of the binary
>parts.  That represents a form of streaming, insofar as it allows both
>sender and receiver to deal with the envelope before sending/receiving the
>so-called attachments.
>
>FWIW: requiring a length at the head of a message segment tends to move
>streaming headaches from the receiver to the sender, at least in the case
>where the sender itself does not know the length of the data in advance.  I
>think there are 2 or 3 use cases hidden in this area:  you want to make
>life easy for the receiver, and the sender happens to know the length; you
>want to make life easy for the receiver even if the sender has to buffer a
>gigabyte to determine the length;  you want to make life easy for the
>sender, so you make no requirement to send a length ahead of the data.
>Again, I think that all of these are legitimate design points for one use
>case or another.  Indeed, it's the range of such requirements that
>suggests to me that we should go slow on adding streaming features.
>
> >  >and SHOULD
> >  >further  provide for streaming in the case that only one large object
> >  >has been optimized
> >
> > Huh? Why one?  and anyway what is streaming?
>
>Well, this was an attempt to find an 80/20 point for those who have, say,
>a large XRay file as a GIF or JPEG, and want to stream that as well as the
>envelope.  By stream I mean, be able to send out some of the bytes of the
>XRay before all of them are available at the sender and/or to be able to
>begin processing of the first few raster lines at the receiver before the
>whole thing is received (and perhaps before the sender has even sent the
>tail.)  Consider, for example, the case where some scanning sensor is
>sending out the raster lines for the XRay as they become available, and we
>are sending them out in a SOAP message in parallel with the scanning of
>additional lines.
>
>Why one object only?  Because I can see straightforward implementations of
>that.  If there are two xrays streaming in parallel off two scanners
>(stereo image?),  and you don't want to wait for all of the first one
>before you can make progress on the second, then you are in the business
>of interleaving them.  That's going to be important for some use cases
>someday, but I was making the suggestion that interleaving might not make
>an 80/20 cut for a SOAP binding in the next few months.
>
> >If you
> > tell me enough information ahead of the bits, then
> > either I can accept your TCP/IP packets or refuse them.
> > Given that we are in HTTP these are the only two things
> > I can do right?  I'd rather read something like:
>
>I think it depends on the level you're thinking about.  At some level, all
>of TCP/IP streams (in the sense I mean) because it comes in one packet at
>a time, and you can always try to finish with one before accepting (or
>sending) the next.  The question is whether that's realistic at the next
>level up.  To be perfectly rigorous, you can't for example process the
>start of a SOAP envelope without seeing the end, because you don't even
>know whether it's well-formed until you see the end tag for
></soap:envelope>.  If that doesn't show up in the right place, you've got
>no Infoset, and no Envelope, therefore "no SOAP" (pun intended.)  XML
>doesn't stream, in this sense, and SOAP uses XML (modulo the permission to
>use optimistic concurrency and roll back all side effects once you
>discover that the envelope is poorly formed.)  Of course, many
>implementations will start work early, and will indeed roll back when the
>message proves to be not well formed.  Still, I think you'd be making a
>mistake to do a database commit based on a SOAP message until you'd seen
>the end tags.
>
>Similarly, if I want SOAP to be robust enough to make progress on 2 or 3
>large streaming attachments to the same message in parallel, then I can't
>just argue at the IP level.  I've got to look to Multipart MIME, DIME, or
>some level that will allow me to express the interleaving of those
>streams.  I think that's a very important use case for someday, but I'm
>proposing we not "go there" for now.
>
> >
> >
> > ______________________________________________________
> > John J. Barton          email:  John_Barton@hpl.hp.com
> > http://www.hpl.hp.com/personal/John_Barton/index.htm
> > MS 1U-17  Hewlett-Packard Labs
> > 1501 Page Mill Road              phone: (650)-236-2888
> > Palo Alto CA  94304-1126         FAX:   (650)-857-5100
>
>Thanks for your patience.  Hope this is helpful.
>
>Noah
>
>------------------------------------------------------------------
>Noah Mendelsohn                              Voice: 1-617-693-4036
>IBM Corporation                                Fax: 1-617-693-8676
>One Rogers Street
>Cambridge, MA 02142
>------------------------------------------------------------------
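
Noah's point above about starting work early and rolling back can be sketched in a few lines of Python with the standard incremental SAX parser.  The element names and the "staged" list are invented stand-ins for real side effects.  Work is staged as chunks arrive, and only handed over once the parser has seen a well-formed end of document.

```python
import xml.sax
from typing import Iterable, List


class StagingHandler(xml.sax.ContentHandler):
    """Collects tentative work; nothing here is committed yet."""

    def __init__(self) -> None:
        super().__init__()
        self.staged: List[str] = []

    def startElement(self, name, attrs):
        self.staged.append(name)  # stand-in for a tentative side effect


def process(chunks: Iterable[bytes]) -> List[str]:
    """Return the staged work if the envelope is well formed, else nothing."""
    handler = StagingHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    try:
        for chunk in chunks:
            parser.feed(chunk)   # optimistic: handle data as it arrives
        parser.close()           # raises if the end tags never showed up
    except xml.sax.SAXParseException:
        return []                # roll back everything staged so far
    return handler.staged        # safe to commit only now


START = b'<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"><soap:Body>'
good = [START, b"<debit/></soap:Body></soap:Envelope>"]
truncated = [START, b"<debit/></soap:Body>"]

print(process(good))
print(process(truncated))
```

In the truncated case the parser has already reported the debit element before the failure is detected, which is why the commit has to wait for the closing tag.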

______________________________________________________
John J. Barton          email:  John_Barton@hpl.hp.com
http://www.hpl.hp.com/personal/John_Barton/index.htm
MS 1U-17  Hewlett-Packard Labs
1501 Page Mill Road              phone: (650)-236-2888
Palo Alto CA  94304-1126         FAX:   (650)-857-5100
Received on Monday, 13 October 2003 14:11:23 GMT
