- From: Christopher B Ferris <chrisfer@us.ibm.com>
- Date: Mon, 6 Jan 2003 23:43:50 -0500
- To: www-ws-arch@w3.org
- Message-ID: <OF42E1551B.355C258B-ON85256CA6.0081F04A-85256CA7.0019D49F@rchland.ibm.com>
Mark,

Sure, you can invoke HTTP GET on any URI, but that matters little. If the URI is not an HTTP scheme URI, then the software needs some further clue as to where to dispatch the request. If the URI is JUST an identifier (as in the case of an HTTP scheme namespace URI with nothing at the origin server), then you may get a 404, which tells you nothing other than that there was no one home. It doesn't tell you whether the URI was incorrect, or whether it is an identifier with no representation of the "resource" it identifies. Has the resource been relocated, its controlling authority just lax in saying so by means of a 307? Does the resource even exist? What is the resource if it exists but no representation exists for it?

The fact that you can invoke HTTP GET on any URI does not mean that you'll have the first clue what to make of the representation you receive in response. It could have a media type of application/octet-stream. What are you supposed to make of that? A browser, spider, crawler, etc. DOES have a priori knowledge about the media types of the anticipated representations it might receive in response to invoking a GET on some URI typed into the browser. Even browsers that can upgrade themselves on the fly, downloading and installing plugins to deal with media types they had not previously been configured to handle, are limited to the application domain of the browser itself, which is (mostly) to render, or at best to dispatch to some preconfigured external handler. Sure, pluggable/portable code is a nice feature, but across trust boundaries it is of limited value unless you are overly cavalier about your system's security.

Okay, so let us assume for a moment that what is returned is application/xml or some RFC3023 derivative thereof. Sure, you can parse the received pointy brackets, assuming you have an XML parser built in. You might even be able to validate the pointy brackets against a schema that had not been previously known. So what? Does that give you a clue what to do with the pointy brackets, or what the bits between the pointy brackets are supposed to mean? Nope.

With today's browsers (the predominant client of the Web), some poor underappreciated programmers spent tedious hours with the HTML and other media type specs in front of them as they wrote the software that would eventually process the entity bodies arbitrarily returned on HTTP GET requests to arbitrary URIs. They had to make conscious decisions as to which media types they would incorporate into their browser software. In short, many person-years of a priori coordination (software development) have been poured into wiring in the A PRIORI knowledge needed to make the browser SEEM as if it were in no need of a priori knowledge of what might be at the other end of a URI. The browser software also (typically) has a default "should I save this to disk because I have no idea what to make of it?" prompt it can put to the user (human) when the media type encountered is not in the set the developers had prepared the browser to handle. The fact is that someone had to encode some knowledge and interpretation of the representations at some point in time.
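To put that in concrete terms, here is a rough sketch, purely for illustration (the URI is made up and the handler names are stand-ins for whatever real code a client would have), of the kind of hard-wired media-type dispatch that every GET-issuing client carries around:

    # A purely illustrative dispatch table (hypothetical URI, stand-in
    # handlers).  Someone had to write a handler for every media type this
    # client claims to understand -- that table IS the a priori coordination.
    from urllib.request import urlopen

    def handle_html(body):
        print("render as HTML:", len(body), "bytes")

    def handle_xml(body):
        print("parse the pointy brackets:", len(body), "bytes")

    def handle_unknown(body):
        print("no idea what this is -- save to disk and ask the human?")

    HANDLERS = {
        "text/html": handle_html,
        "application/xml": handle_xml,
    }

    def fetch(uri):
        # Any URI can be GET-ed...
        with urlopen(uri) as resp:
            media_type = resp.headers.get_content_type()
            body = resp.read()
        # ...but making sense of the representation takes wired-in knowledge.
        HANDLERS.get(media_type, handle_unknown)(body)

    fetch("http://example.org/some-arbitrary-uri")   # hypothetical URI

Nothing in the GET itself populates that table; the developers did, ahead of time, one media type spec at a time.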
It has largely been the case that there have been relatively few media types, and these have (for the most part) had generic or standalone handlers written for them, such that the browser can simply dispatch the handling/processing of the retrieved representation to the registered application for the media type of the entity body of the HTTP response message. These handlers are not arbitrarily integrated with disparate back-end software, as is likely the typical case for such a handler in the Web services space. Rather, they have been of the standalone variety, as is the case when dispatching an application/vnd.ms-excel entity to Microsoft's Excel spreadsheet application. There has been a priori coordination... trust me. Just because it has been fairly invisible, or has occurred gradually over time, doesn't make it any less real or any less required.

Now we are attempting to open up the space to orders of magnitude more "types" than we have been dealing with to date. Over time, we can only hope that these will become fewer and more standardized. However, that standardization will take considerable time and effort. The problem is very different than it was when we standardized on HTML, because we have roughly 20-30 years of previously deployed and entrenched systems, implemented by thousands of different vendors and/or enterprises, each with its own similar but often incompatible notion of how its data is represented and what it means; yet we are highly motivated to get these entrenched systems to talk with one another, across trust boundaries in many cases. Beyond that, we're attempting to move beyond HTML forms, which carry all of their semantics in the natural language and prose that surrounds and is embedded among the <INPUT/> elements and that (typically) requires human intelligence to decipher, to something that can be more readily processed by automata programmed to a specific purpose that may vary widely from one deployed instance to another.

We can no longer rely on the preconfigured availability of standardized, standalone content handlers to which we can dispatch the entity body of arbitrarily retrieved resource representations. We need some means of conveying/describing the details of the complete interface (beyond the fact that "HTTP GET someuri" will (likely) return a bag of octets, as described in RFC2616). We need a description that includes, at a minimum, the types (and hopefully some hint as to the semantics of those types) of the messages that are exchanged.

This brave new world is not one that REST *alone* prepares us for. Without a doubt, there is significant value in the architectural constraints defined by REST. I have little doubt that from a *runtime* perspective there will be significant value add for applications that adopt this architectural style in the long run. However, REST does not aid one in deploying a service that has a prayer of being used by a consumer that has not been written by the author of the service. We still have need of design-time aids to enable independent and interoperable authoring of consumers and providers of services and/or resource representations. For that, I am afraid that some manner of a priori coordination is a requirement, especially given the level of sophistication of both our software and our development resources.
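Just to be concrete about what "a description that includes at a minimum the types of messages" might buy us, here is a deliberately toy sketch. This is NOT WSDL or any real description language; the structure, names, and endpoint are invented for illustration only:

    # An invented, minimal design-time description: the URI, the method, the
    # media type, and the message elements a consumer should expect.
    STOCK_QUOTE_DESCRIPTION = {
        "uri": "http://example.org/quote",   # hypothetical endpoint
        "method": "GET",
        "media_type": "application/xml",
        "response_elements": {"company", "value", "kind", "time"},
    }

    def consumer_understands(description, known_elements):
        # Decide, before ever sending a request, whether the replies the
        # service says it will return are ones this consumer can interpret.
        return description["response_elements"] <= known_elements

    # The consumer's own vocabulary, wired in by its developer a priori:
    MY_VOCABULARY = {"company", "value", "kind", "time", "currency"}

    if consumer_understands(STOCK_QUOTE_DESCRIPTION, MY_VOCABULARY):
        print("safe to invoke GET on", STOCK_QUOTE_DESCRIPTION["uri"])
    else:
        print("don't bother -- we would not understand the reply")

The point is not the particular syntax; it is that the consumer's author gets to make this decision at design time, before a single octet is retrieved.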
We need some standardized manner of conveying the fact that HTTP GET http://www.markbaker.ca/9ajp23q9rj89aweruwer will return the current share price of IBM's stock in an XML representation (one would assume, because that is what it looks like, although the content-type is given as text/plain) that is apparently not defined by any schema or DTD, that does not belong to any namespace, but that probably looks something like this when you do an HTTP GET on it:

    <stockquote>
      <company>http://www.ibm.com/</company>
      <value>xx.yy</value>
      <kind>http://stockstandards.org/types/realtime</kind>
      <time>[assume the current time is here]</time>
    </stockquote>

Many if not most interesting applications will want to know this sort of information BEFORE they willy-nilly invoke the HTTP GET, because the only reason they would ever do so is to get IBM's current share price. Many systems will have need of this (or at least some of this) information before they are even written. (Your mission, should you accept it, is to write an application that queries IBM's current share price from the apparent authority on the matter, Mark Baker. No other information is available beyond the fact that we found this URI on the side of a bus in the Greyhound lot in Park Square. This tape will self-destruct in 5 seconds....)

Or perhaps, had you a heritage similar to mine, the representation might resemble this:

    <AufLageranführungsstrich>
      <Firma>http://www.ibm.com/</Firma>
      <Wert>80.20533</Wert>
      <Freundlich>http://stockstandards.org/types/realtime</Freundlich>
      <ActuelleUhrzeit>Januar 6, 2003 22 Stunden</ActuelleUhrzeit>
    </AufLageranführungsstrich>

In which case you might be scratching your head for a while trying to figure out which way was up. Of course, the fact that an HTTP GET on http://stockstandards.org/types/realtime returns a 404 leads me to wonder whether I really know what the devil Freundlich this set of pointy brackets really is in the first place. Doing an HTTP GET on http://www.ibm.com/ automatically redirects me to http://www.ibm.com/us/, which has me very confused. Is IBM now just a U.S. enterprise? Was I redirected there because my browser preferences indicate that my preferred language is en-us, or is this the stock price of a Web page? Was I supposed to be doing an HTTP GET on these URIs? How did I know they were URIs to begin with? And the Wert seems to conflict with IBM's closing price as listed on my Yahoo Web page... Gee, I wonder why that is? Boy, this IS an impossible mission!

Possibly someday we will have inference engines that can reason for, and reprogram, themselves to adapt to arbitrary semantics retrieved by HTTP GETs on arbitrary URIs scraped off the side of a bus or a billboard. Perhaps someday we will have enough deployed metadata to effect that reasoning, such that we will have little need of any a priori coordination. Perhaps we will have RDF graphs up the wazoo that we can leverage to give us some clue as to what resource is identified by http://www.markbaker.ca/9ajp23q9rj89aweruwer. I think that future is a long, long way off. Maybe one day further out in that future, the software will even perform with the efficiency and accuracy required to make such systems a realistic, viable, and cost-effective substitute for the drones to which we are limited today.
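Before signing off, one more throwaway sketch. The element names are lifted from the two examples above; the parsing code and the concrete numbers are my own invention, purely to show where the knowledge actually lives:

    # Element names from the examples above; parsing code and numbers are
    # invented for illustration.
    import xml.etree.ElementTree as ET

    ENGLISH_QUOTE = """<stockquote>
      <company>http://www.ibm.com/</company>
      <value>80.20</value>
      <kind>http://stockstandards.org/types/realtime</kind>
      <time>2003-01-06T22:00:00</time>
    </stockquote>"""

    GERMAN_QUOTE = """<AufLageranführungsstrich>
      <Firma>http://www.ibm.com/</Firma>
      <Wert>80.20533</Wert>
    </AufLageranführungsstrich>"""

    def share_price(xml_text):
        root = ET.fromstring(xml_text)
        value = root.findtext("value")   # the hard-wired element name
        if value is None:
            raise ValueError("vocabulary this consumer was never taught")
        return float(value)

    print(share_price(ENGLISH_QUOTE))    # works: 80.2
    try:
        print(share_price(GERMAN_QUOTE))
    except ValueError as err:
        print("parse failed:", err)      # same data, different names

Same resource, same data, but the consumer's hard-wired element names ARE the contract, and nothing retrieved at runtime told it so.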
Cheers,

Christopher Ferris
Architect, Emerging e-business Industry Architecture
email: chrisfer@us.ibm.com
phone: +1 508 234 3624

Mark Baker wrote on 01/06/2003 04:15:01 PM:

> On Mon, Jan 06, 2003 at 12:35:08PM -0500, Geoff Arnold wrote:
> > To avoid the apples vs. oranges problem, we need to start with the same
> > initial conditions and end up with the same final conditions.
>
> Absolutely.
>
> > It is *not* legitimate to assert that the client possesses different
> > information in the REST and non-REST cases, which is what Mark seems
> > to be doing.
>
> The client *DOES* possess different information. In addition to
> knowing the structure of the request, it also possesses the knowledge
> that it can invoke the GET method on any URI, the same way that somebody
> seeing an nfs:// URI knows they can invoke READ on it, or an FTP URI can
> have RETR invoked on it (of course, they can invoke *any* method of the
> associated application protocol, but "retrieve"-like methods are the
> obvious ones to mention in an example).
>
> And not to suggest that FTP and NFS are all REST systems; they're not.
> But they all associate an application protocol with an identifier, and
> then export that identifier into URI space by associating the
> protocol (or more generally its coordination semantics) with the URI
> scheme.
>
> Please(!), think about that for a sec.
>
> > At the conclusion of the interaction (either RESTfully or non-RESTfully),
> > we have exactly the same postconditions in both cases.
> > The client has all of its original information, plus the share price of
> > "IBM". The server has all of its original information, plus (if it cares)
> > the fact that the client has been provided with the information.
>
> Right.
>
> > Over to you, Mark. Or not.
>
> And now for something completely different. 8-)
>
> MB
> --
> Mark Baker.  Ottawa, Ontario, CANADA.  http://www.markbaker.ca
> Web architecture consulting, technical reports, evaluation & analysis
Received on Monday, 6 January 2003 23:44:27 UTC