Re: SOAP breaks HTTP? from Paul Prescod on 2002-03-27 (www-tag@w3.org from March 2002)

From: Paul Prescod <paul@prescod.net>
Date: Tue, 26 Mar 2002 16:42:56 -0800
To: www-tag@w3.org
Message-ID: <3CA11590.502161B3@prescod.net>

Joshua Allen wrote:
> 
>...
> 
> There is more than that -- does the POST actually go to the same URI?
> Why do you POST against URI "y" to modify resource "x"?  If I can GET a
> resource, and I own it, I should also be able to PUT it. 

Theoretical argument: What does it mean to own a resource on Expedia's
site? 

Practical argument: HTML doesn't support PUT (which is, IMO, a weakness)
so practically speaking GET and POST are all you get. But GET is 90% of
the benefit of the Web because with a standardized GET method you get
addressing. This in turn gets you your metadata triples. There's a
reason I wrote an essay specifically on using RPC for GET. I really
don't care very much if it is used for PUT, POST, DELETE etc., except at
a philosophical level.

But the benefits of a single, unified GET method are quite obvious.

> ...
> SOAP is about sending a message to an endpoint, with no semantics about
> whether that message will invoke RPC, change the resource associated
> with that endpoint, etc.

How can you say that there are no semantics:

 * http://www.w3.org/TR/soap12-part2/#encrules
 * http://www.w3.org/TR/soap12-part2/#soapforrpc

> > For the types of things where we have to choose SOAP-RPC versus REST,
> > there will be no "URL bar" and the "peeping toms" will be
> > intermediaries that can snoop on the body as easily as the URL.
> 
> OK, I think you missed my point -- the point is that *most* sites do
> things like assign temporary session keys to their users, and the
> "resources" being represented are dependent on that "session" key.

I have URIs for Amazon purchases I've made in the past, for Expedia
purchases, for Slashdot posts, for Google searches and for Wiki
contributions I've made. Those four services have spent effort moving
information from a private namespace into the public one, and this is
the essence of Webification. It is because of those URIs that these
services are more useful to me than their pre-web equivalents (e.g. on
Compuserve).

To be honest I can't see how PUT would improve it much (though I like it
at a philosophical level and for cache invalidation).

In some cases they could have exposed even more state through URIs --
great, let's go encourage them to do so. I like your ideas for improving
URI-usage and I have some of my own.

By the way, session keys are not inherently un-restful. A so-called
session key is just a part of the URI that identifies a particular
transaction. You cannot and should not do transactions through a GET of
massively long URIs. Once a transaction changes server state (e.g. to
tentatively book a flight using a credit card number) it needs to be
distinguished from every other transaction so it needs its own set of
URIs. These will typically contain a session key rather than being
unrelated to each other.

> Whatever token is embedded in the URL or cookie is used to determine
> user context and manage the "sensitive" data on the other side.  All
> access to resources is controlled through business logic and
> user-specific context.

There is nothing wrong with controlling access to resources through
business logic. I certainly expect that! Some level of user-specific
context is also necessary to establish authorization. In the Expedia
example, this is the *only* user-specific context that is necessary. I
know that because I can move that URI between browsers and the only
state that needs to travel with it is my username and password. With
this URI you don't even need that:

http://www.expedia.ca/pub/agent.dll?qscr=fexp&qsfr=fltw&city1=Vancouver%2C+BC%2C+Canada+%28YVR%2DIntl%2E%29&citd1=Orlando%2C+FL%2C+USA&cAdu=1&date1=16/4/02&time1=361&date2=18/04/02&time2=1081&trpt=2&flag=q&dfmt=1&qryt=8&rfrr=-429

But there are certainly parts of the Expedia service that violate
principles of REST. I don't see the relevance to SOAP.
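
To make that concrete, here is a minimal Python sketch that pulls the
state back out of the Expedia URI above. It only decodes the query
string. It does not contact Expedia, and I am not claiming to know what
each parameter means to their server; the variable names are made up
for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The URI quoted above, split across lines only for readability.
url = (
    "http://www.expedia.ca/pub/agent.dll?qscr=fexp&qsfr=fltw"
    "&city1=Vancouver%2C+BC%2C+Canada+%28YVR%2DIntl%2E%29"
    "&citd1=Orlando%2C+FL%2C+USA&cAdu=1&date1=16/4/02&time1=361"
    "&date2=18/04/02&time2=1081&trpt=2&flag=q&dfmt=1&qryt=8&rfrr=-429"
)

# parse_qs returns a list per key; every key occurs once here,
# so keep the single value.
state = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

# Everything needed to repeat the query travels in the URI itself:
# no cookie, no session, no hidden request body.
for key, value in state.items():
    print(f"{key:6} = {value}")
```

The decoded values (origin, destination, dates and so on) are all
visible to anyone holding the URI, which is why it can be bookmarked,
mailed or linked.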

>...
> And there *is* a difference -- please don't tell me that "exposing
> functionality is just an example of REST, since the return value is a
> representation of a transient state."  I think a "resource" needs to
> meet a higher bar than just "you can do a GET on it", because then a
> resource is not distinguishable from an RPC request.

I don't follow you at all. Anything that you can do a GET on is by
definition a Web resource. Some sites (including, in some cases,
Expedia) hide multiple logical resources behind a single URI. You and I
agree that this is bad. I think you would also agree that this is what
the vast majority of deployed SOAP-based web services do. Perhaps every
single one of them. You seem to think that it is better that we now have
a standard way of doing this. AFAICS, this has something to do with PUT.

> ...  And in fact, this
> is why I think we have a problem today -- most people use GET and POST
> with things like PHP, ASP, etc. as an RPC.  They tell themselves,
> "everything I have done here with GET and POST I could just as easily do
> with RPC".  Precious few people even bother to make their dynamic pages
> addressable through bookmarks, let alone metadata.  And this is not even
> talking about other ways that you *should* be able to use a "resource".

We have very different senses of this issue. The four services I
mention, for the most part, generate unique URIs for the things I want
to reference. They could do better but they overall do a good
job....much better than any SOAP-based service I have ever seen.

> > No argument here. We need to improve education and re-evaluate what in
> > the infrastructure is leading people awry.
> 
> Well, for starters ....

I like all of your ideas. The three killer apps for useful URIs today
are bookmarks, email and hypertext links from web pages. There could and
should be more. I strongly believe that once people grok REST web
services they will be much more diligent than they are with web pages.
If I can't reference an intermediate state of an Expedia transaction
then I just start again. But if this has accounting implications in the
web services world then people will be Very Careful to generate uniquely
identifiable and digitally signable representations and URIs.

>...
> Well, it makes it less likely that people will use the GET querystring
> overloading to pass RPC-like parameters. In other words, I want to have
> some degree of confidence that when I GET a resource, I can also PUT it
> (given appropriate permissions).  

Why??? This is entirely backwards from my thinking. Except for the rare
GET that mutates the server I've NEVER MET A GET URI that I didn't like.
The more of them the better. If I had to define a crude measure of the
Web-i-ness of a server I would ask how many distinct resources I could
GET compared to competitors.

PUT is at best a minor issue. Why should I care that I can't PUT an
itinerary to Expedia's site? How does it hurt me that I have to POST it?
As a wanna-be-purist I'd love to hear a reason I should be hard-assed on
this issue, but other than cache invalidation I really don't see much of
an issue. POST is close enough.

> .... If people stop overloading the
> querystring with RPC parameters, I improve my chances.  Maybe there are
> others...

I would scream out loud if someone told me they moved a resource
from HTTP GET to SOAP because it "didn't support PUT." That
makes no sense to me and it would greatly impoverish the Web.
http://www.IBM.com/ doesn't support PUT (by me, anyhow). It's still
pretty damn useful to me! I don't even know what you mean by
"overloading the query string". As long as the query string returns a
unique resource then there's nothing wrong with it. You start getting
into RPC when you *stop* using the query string and instead use magic
cues like cookies or SOAP parameters to figure out where you are in a
transaction.

> > SOAP demonstrably discourages people from RESTful architectures in
> > that a SOAP (e.g.) getStockQuote is NOT REST-ful and one cannot do a
> > REST-ful GET of a STOCK QUOTE without abandoning SOAP. I'm still
> > waiting for an
> 
> Is it the stockQuote example that bothers you?  

I am bothered every time I see any form of RPC used for "get":

http://www.prescod.net/rest/rpc_for_get.html

> ... I have no strong opinion
> about whether that should be a resource GET or an RPC call. 

If you read the URI above and still have no opinion then I would like to
hear your arguments for the advantage of the RPC version. 
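
As a rough illustration of the difference (a sketch, not anybody's real
service), the Python below builds the two kinds of request without
sending them. The host quotes.example.com, the getStockQuote element,
its namespace and the function names are all hypothetical; only the
SOAP 1.1 envelope namespace is real.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

ENDPOINT = "http://quotes.example.com/soap"     # hypothetical RPC endpoint
QUOTE_BASE = "http://quotes.example.com/quote"  # hypothetical resource base


def rpc_request(symbol):
    """RPC style: every quote is a POST to the same endpoint URI."""
    body = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        '<m:getStockQuote xmlns:m="http://quotes.example.com/ns">'
        f"<m:symbol>{escape(symbol)}</m:symbol>"
        "</m:getStockQuote>"
        "</soap:Body>"
        "</soap:Envelope>"
    )
    return ("POST", ENDPOINT, body)


def rest_request(symbol):
    """Resource style: every quote has its own URI and is fetched with GET."""
    return ("GET", f"{QUOTE_BASE}?{urlencode({'symbol': symbol})}", None)


symbols = ["IBM", "MSFT", "SUNW"]
rpc_uris = {rpc_request(s)[1] for s in symbols}
rest_uris = {rest_request(s)[1] for s in symbols}

print(len(rpc_uris), "URI behind the RPC version:", sorted(rpc_uris))
print(len(rest_uris), "URIs behind the GET version:", sorted(rest_uris))
```

In the RPC version the thing I want to name lives in the message body,
so there is nothing to bookmark, link to or attach metadata to. In the
GET version each quote is addressable on its own.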

> ... But the
> point is, GET of a resource makes sense for some things, and in other
> cases, developers will insist on hiding all of their resources behind an
> endpoint that exposes *functionality*.  Maybe you can argue that some
> places where a person uses RPC or Message-passing should really use GET.
> But you cannot argue that *all* message-passing architectures should use
> GET.  And users wouldn't listen anyway.

The W3C and TAG are responsible for one and only one architecture. And
since, as you can see, I have nothing against session identifiers, I
really don't see how REST conflicts with the goal of hiding functionality
behind a URI. As long as you generate many URIs representing the steps
of a transaction then the client can always get back at the transaction
state (hidden though it is) through URIs. For instance, BabelFish is a
service that would benefit from session IDs. I just did a translation
there but they neither generated a query URI nor gave me back a session
ID. So I have no way of referring to it, even for a single hour. A
session ID that rotted away after a day would have been better.

REST just requires that the state be EITHER explicit in the URI *or*
explicit in a new resource with a new URI (call it a "session URI" or
"transaction URI"). This should include the "state" of where in the
transaction we are. That's the bit that Expedia sometimes falls down on.
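
A minimal sketch of the second option, assuming an in-memory store and
a made-up host (translate.example.com); the class and method names are
hypothetical and this is not how BabelFish or Expedia actually work.
Each POST that changes server state mints a new URI for that state,
including which step of the transaction it represents, and GET on that
URI reads it back without changing anything.

```python
import uuid

BASE = "http://translate.example.com"  # hypothetical host


class TransactionStore:
    """Toy server-side store: one URI per recorded transaction state."""

    def __init__(self):
        self._states = {}

    def post(self, state):
        """Record a new state and return the fresh URI that identifies it."""
        key = uuid.uuid4().hex  # the "session key" part of the URI
        self._states[key] = dict(state)
        return f"{BASE}/session/{key}"

    def get(self, uri):
        """Safe read: look the state up by the key at the end of the URI."""
        key = uri.rsplit("/", 1)[-1]
        return self._states[key]


store = TransactionStore()

# Step 1: the client submits text; the server answers with a URI for it.
step1 = store.post({"step": "submitted", "text": "bonjour"})

# Step 2: the result is a new resource with its own URI, linked to step 1.
step2 = store.post({"step": "translated", "text": "hello", "previous": step1})

print(step1, store.get(step1))
print(step2, store.get(step2))
```

The state itself stays hidden on the server; the client only ever
holds URIs, but it can get back to any step through them.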

> In other words, being able to pass a message to a URI and allowing the
> URI to opaquely act upon that message *is* a legitimate use case.  You
> cannot ignore this use case.

Please present the use case in the terms of a real-world problem. You're
presuming a particular solution to the problem before I even know what
the problem is. Is my session ID solution sufficient?

> > example of a SOAP web service that uses URIs properly. Pointing to
> > flaws elsewhere is really not, in my opinion, relevant.
> 
> Well, can you point me to an example of a POST web service that uses
> URIs properly?  Then take that example and just use SOAP as the envelope
> when POSTing.  

Great. And what does the SOAP envelope buy me in that example? According
to Don Box (and many other people), the payload is better described
through XML Schema than through "section 5" so SOAP isn't usually giving
me the encoding.

But more important, how do I do the inverse of the POST? How do I GET,
PUT or DELETE information WITH SOAP without violating the web axioms
(with the emphasis on GET, of course)? Do I fall back to HTTP? If so,
I'll ask again what SOAP is contributing? Are we really going to tell
people that they should use Microsoft's snazzy tools for doing POSTs but
should fall back to HTTP modules for GET?

By the way, WSDL is as bad as SOAP in this regard. Once I've generated a
new URI in a transaction, WSDL has no way for me to describe the
WSDL-interface of that URI. In this way, WSDL works against web
architecture by forcing me to use a component-model interface rather
than a resource interface.

> ... There are quite likely web services in production that
> meet that criteria 

Nobody has been able to name one today.

> ... (although I dunno, since it depends on what you mean
> by "uses URIs properly").

If you turn off cookies in your browser, Slashdot is a perfect example
of a bidirectional, (somewhat) transactional service that uses URIs
properly. Every view on the site has a unique URI and is connected to
other views through URIs. GET is used for GETs. POST is used for POSTs.
PUT and DELETE are not relevant because messages cannot be deleted or
updated. It goes out of its way to present URIs for views that it deems
useful. It doesn't embed userids in URIs.

It does use cookies for authentication which is an understandable flaw
given the current state of web authentication.

 Paul Prescod
Received on Tuesday, 26 March 2002 19:46:37 UTC