Re: Binding

Mark,

Sure, you can invoke HTTP GET on any URI, but that matters little. If the
URI is not an HTTP scheme URI, then the software needs some further clue as
to where to dispatch the request. If the URI is JUST an identifier (as in
the case of an HTTP scheme namespace URI with nothing at the origin
server), then you may get a 404, which tells you nothing other than that
there was no one home. It doesn't tell you whether the URI was incorrect or
whether it is an identifier with no representation of the "resource" it
identifies. Has the resource been relocated and its controlling authority
just been lax in saying so by means of a 307? Does the resource even exist?
What is the resource if it exists but there exists no representation for
it?

The fact that you can invoke HTTP GET on any URI does not mean that you'll 
have the first
clue as to what to make of the representation you receive in response. It 
could have a media type
of application/octet-stream. What are you supposed to make of that?

A browser, spider, crawler, etc. DOES have a priori knowledge about the 
media types of the anticipated representations
that it might receive in response to invoking a GET on some URI typed into 
the browser. Even the
browsers that have built-in capabilities of upgrading themselves on the 
fly by downloading and installing
plugins to deal with media types that they had previously not been 
configured to handle are
limited to dealing with the application domain of the browser itself which 
is to render (mostly) or at best
dispatch to some preconfigured external handler.

Sure, pluggable/portable code is a nice feature, but across trust
boundaries it is of limited value unless you are overly cavalier about your
system's security.

Okay, so let us assume for a moment that what is returned is
application/xml or some RFC3023 derivative thereof. Sure, you can parse the
received pointy brackets, assuming you have an XML parser built-in. You
might even be able to validate the pointy brackets against a schema that
had not been previously known. So what? Does that give you a clue what to
do with the pointy brackets or what the bits between the pointy brackets
are supposed to mean?

Nope.

With today's browsers (the predominant client of the Web), some poor 
underappreciated programmers
spent tedious hours with the HTML and other media type specs in front of 
them as they wrote the software that would
eventually process the entity bodies that were arbitrarily returned on 
HTTP GET requests to arbitrary
URIs. They had to make conscious decisions as to which of the media types 
they would incorporate
into their browser software. In short, many person years of a priori 
coordination (software development)
have been poured into wiring the A PRIORI knowledge needed to make the 
browser SEEM as if it were in no 
need of a priori knowledge of what might be at the other end of a URI. And 
the browser software also (typically) has a 
default "should I save this to disk because I have no idea what to make of 
it?" prompt that it can ask the user (human) 
when the media type encountered was not in the set that the developers had 
prepared the browser to handle.

The fact is that someone had to encode some knowledge and interpretation 
of the representations at some 
point in time. It has largely been the case that there have been 
relatively few media types and these
have had generic or standalone handlers (for the most part) written for 
them such that the browser can simply 
dispatch the handling/processing of the retrieved representation to the 
registered application for the media 
type of the entity body of the HTTP response message. These handlers are 
not arbitrarily integrated
with disparate back-end software as is likely the typical case for such a 
handler in the Web services space. Rather,
they have been of the standalone variety as is the case for dispatching an 
application/vnd.ms-excel to
Microsoft's Excel spreadsheet application.
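
To make the point about preconfigured handlers concrete, here is a minimal
sketch in Python of that kind of dispatch. The handler names and the
registry are hypothetical, not taken from any real browser; the only thing
the sketch is meant to show is that every entry in the table was put there
by a developer ahead of time, and that anything else falls through to the
"save it to disk?" prompt.

```python
# Hypothetical handlers: each one encodes a developer's prior reading of a
# media type specification.
def render_html(body: bytes) -> str:
    return f"rendered {len(body)} bytes of HTML"


def open_in_spreadsheet(body: bytes) -> str:
    return f"handed {len(body)} bytes to the spreadsheet application"


# The a priori knowledge, written down as a table (illustrative only).
HANDLERS = {
    "text/html": render_html,
    "application/vnd.ms-excel": open_in_spreadsheet,
}


def dispatch(content_type: str, body: bytes) -> str:
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        # Nothing was wired in for this type, so ask the human.
        return f"no handler for {media_type}; prompt user to save to disk"
    return handler(body)
```

Calling dispatch("application/octet-stream", ...) lands in the fallback
branch, which is about all a generic client can do with a type nobody
prepared it for.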

There has been a priori coordination... trust me. Just because it has been 
fairly invisible, or has occurred gradually
over time doesn't make it any less real or any less required.

Now we are attempting to open up the space to orders of magnitude more
"types" than we have been dealing with to date. Over time, we can only hope
that these will become fewer and more standardized. However, that
standardization will take considerable time and effort. The problem is very
different than it was when we standardized on HTML because we have roughly
20-30 years of previously deployed and entrenched systems, implemented by
thousands of different vendors and/or enterprises that each have their own
similar, but often incompatible, notion of how their data is represented
and what it means, yet we are highly motivated to get these entrenched
systems to talk with one another, across trust boundaries in many cases.

Beyond that, we're attempting to move beyond HTML forms, which carry all of
their semantics in the natural language and prose which surrounds and is
embedded with the <INPUT/> elements and that requires (typically) human
intelligence to decipher, to something that can be more readily processed
by automata that have been programmed to a specific purpose that may vary
widely from one deployed instance to another. We can no longer rely on
preconfigured availability of standardized and standalone content handlers
to which we can dispatch the entity body of arbitrarily retrieved resource
representations. We need some means of being able to convey/describe the
details of the complete interface (beyond the fact that "HTTP GET someuri"
will (likely) return a bag of octets as is described in RFC2616). We need a
description that includes at a minimum the types (and hopefully some hint
as to the semantics of those types) of messages that are exchanged.

This brave new world is not one that REST *alone* prepares us for. Without
a doubt, there is significant value in the architectural constraints
defined by REST. I have little doubt that, from a *runtime* perspective,
there will be significant value add for applications that adopt this
architectural style in the long run.

However, REST does not aid in the ability for one to deploy a service that
has a prayer of being used by a consumer that has not been written by the
author of the service. We still have need of design-time aids to enable
independent and interoperable authoring of consumers and providers of
service and/or resource representation. For that, I am afraid that some
manner of a priori coordination is a requirement, especially given the
level of sophistication of both our software and development resources. We
need some standardized manner of conveying the fact that

        HTTP GET http://www.markbaker.ca/9ajp23q9rj89aweruwer 

will return the current share price of IBM's stock in an XML (one would 
assume because that is what it 
looks like although the content-type is given as text/plain) 
representation that is apparently not defined by 
any schema or DTD and that does not belong to any namespace but probably 
looks something like this
when you do an HTTP GET on it:

<stockquote>
  <company>http://www.ibm.com/</company>
  <value>xx.yy</value>
  <kind>http://stockstandards.org/types/realtime</kind>
  <time>[assume the current time is here]</time>
</stockquote>

Many if not most interesting applications will want to know this sort of
information BEFORE they willy-nilly invoke the HTTP GET, because the only
reason that they would ever do so is to get IBM's current share price.
Many systems will have need of this (or at least some of this) information
before they are even written. (Your mission, should you accept it, is to
write an application that queries IBM's current share price from the
apparent authority on the matter, Mark Baker. No other information is
available beyond the fact that we found this URI on the side of a bus in
the Greyhound lot in Park Square. This tape will self destruct in 5
seconds....)
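
As a rough illustration of what "knowing this BEFORE the GET" might look
like, here is a minimal Python sketch of a design-time description that a
consumer could inspect without touching the network. The
OperationDescription structure, the field meanings, and the
application/xml media type are all assumptions made up for this example;
this is not WSDL or any other real description language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationDescription:
    """Hypothetical design-time description of one retrievable message."""
    method: str
    uri: str
    response_media_type: str
    response_fields: dict  # element name -> agreed-upon meaning


# Field meanings below are assumed for illustration; nothing at the URI
# actually states them.
QUOTE_SERVICE = OperationDescription(
    method="GET",
    uri="http://www.markbaker.ca/9ajp23q9rj89aweruwer",
    response_media_type="application/xml",
    response_fields={
        "company": "URI identifying the company quoted",
        "value": "share price as a decimal number",
        "kind": "URI naming the kind of quote",
        "time": "time at which the quote was taken",
    },
)


def can_answer(description: OperationDescription, needed_fields) -> bool:
    # Decide at design time, without any HTTP request, whether this
    # operation carries everything the consumer needs.
    return set(needed_fields) <= set(description.response_fields)
```

With something of this sort in hand, can_answer(QUOTE_SERVICE, ["company",
"value"]) can be evaluated while the consumer is being written, which is
exactly when the information is needed.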

Or perhaps, had you similar heritage as I, the representation might 
resemble this:

<AufLageranführungsstrich>
  <Firma>http://www.ibm.com/</Firma>
  <Wert>80.20533</Wert>
  <Freundlich>http://stockstandards.org/types/realtime</Freundlich>
  <ActuelleUhrzeit>Januar 6, 2003 22 Stunden</ActuelleUhrzeit>
</AufLageranführungsstrich>

In which case you might be scratching your head for a while trying to 
figure out which way was up.
Of course, the fact that an HTTP GET on
http://stockstandards.org/types/realtime returns a 404 leads me to wonder
whether I really know what the devil Freundlich this set of pointy brackets
really is in the first place. Doing an HTTP GET on http://www.ibm.com/
automatically redirects me to http://www.ibm.com/us/, which has me very
confused. Is IBM now just a U.S. enterprise? Was I redirected there because
my browser preferences indicate that my preferred language is en-us, or is
this stock price for a Web page? Was I supposed to be doing an HTTP GET on
these URIs? How did I know they were URIs to begin with?

And the wert seems to conflict with what the IBM closing price was as 
listed on my Yahoo Web page... 
Gee, I wonder why that is? Boy, this IS an impossible mission!
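
The head-scratching over the two representations above can be sketched in
a few lines of Python. Both documents parse as well-formed XML, but a
consumer written against the element name "value" (an assumption baked in
by its author) gets nothing out of the second one. The documents are
trimmed copies of the examples above, and the price in the English one is
borrowed from the German sample purely for illustration.

```python
import xml.etree.ElementTree as ET

ENGLISH = (
    "<stockquote>"
    "<company>http://www.ibm.com/</company>"
    "<value>80.20533</value>"
    "</stockquote>"
)
GERMAN = (
    "<AufLageranführungsstrich>"
    "<Firma>http://www.ibm.com/</Firma>"
    "<Wert>80.20533</Wert>"
    "</AufLageranführungsstrich>"
)


def share_price(xml_text: str):
    """Return the price, or None if the expected element is absent.

    The element name 'value' is a priori knowledge held by whoever wrote
    this function; the parser contributes none of it.
    """
    root = ET.fromstring(xml_text)  # succeeds for any well-formed document
    node = root.find("value")
    return float(node.text) if node is not None else None


def element_names(xml_text: str):
    # All a generic parser can report: names and nesting, not meaning.
    return [child.tag for child in ET.fromstring(xml_text)]
```

Here share_price(GERMAN) comes back as None even though the number is
sitting right there in <Wert>; element_names(GERMAN) will happily list
"Firma" and "Wert" without saying what either one means.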

Possibly someday we will have inference engines that can reason for, and 
reprogram, themselves to adapt to 
arbitrary semantics that are retrieved from HTTP GETs on arbitrary URIs 
scraped off of the side of a bus or
a billboard. Perhaps someday we will have enough deployed metadata that 
can be used to effect that 
reasoning such that we may have little need of any a priori coordination. 
Perhaps we will have RDF graphs
up the wazoo that we can leverage to give us some clue as to what resource 
is identified by 
http://www.markbaker.ca/9ajp23q9rj89aweruwer 

I think that that future is a long, long ways off. 

Maybe one day further out in that future, the software will even perform 
with the efficiency and accuracy 
required to make such systems a realistic, viable, and cost effective 
substitute for the drones to which we are limited 
today.

Cheers,

Christopher Ferris
Architect, Emerging e-business Industry Architecture
email: chrisfer@us.ibm.com
phone: +1 508 234 3624

Mark Baker wrote on 01/06/2003 04:15:01 PM:

> 
> On Mon, Jan 06, 2003 at 12:35:08PM -0500, Geoff Arnold wrote:
> > To avoid the apples vs. oranges problem, we need to start with the same
> > initial conditions and end up with the same final conditions.
> 
> Absolutely.
> 
> > It is *not*
> > legitimate to assert that the client possesses different information in
> > the REST and non-REST cases, which is what Mark seems to be doing.
> 
> The client *DOES* possess different information.  In addition to
> knowing the structure of the request, it also possesses the knowledge
> that it can invoke the GET method on any URI, the same way that somebody
> seeing an nfs:// URI knows they can invoke READ on it, or an FTP URI can
> have RETR invoked on it (of course, they can invoke *any* method of the
> associated application protocol, but "retrieve"-like methods are the
> obvious one to mention in an example).
> 
> And not to suggest that FTP and NFS are all REST systems; they're not. 
> But they all associate an application protocol with an identifier, and
> then export that identifier into URI space by associating the
> protocol (or more generally its coordination semantics) with the URI
> scheme.
> 
> Please(!), think about that for a sec.
> 
> > At the conclusion of the interaction (either RESTfully or non-RESTfully),
> > we have exactly the same postconditions in both cases.
> > The client has all of its original information, plus the share
> > price of "IBM". The server has all of its original information, plus
> > (if it cares) the fact that the client has been provided with the
> > information.
> 
> Right.
> 
> > Over to you, Mark. Or not.
> 
> And now for something completely different. 8-)
> 
> MB
> -- 
> Mark Baker.   Ottawa, Ontario, CANADA.        http://www.markbaker.ca
> Web architecture consulting, technical reports, evaluation & analysis
> 

Received on Monday, 6 January 2003 23:44:27 UTC