Genericity, strong typing, SOAP-RPC, GRASP, and defining the Web

A couple of the topics being discussed lately --- "Defining the Web" and
"SOAP breaks HTTP?" --- touch on some issues that I've been researching
and contemplating for most of the last year.  Convinced and inspired by
Roy's and Mark's pro REST arguments last summer, I've since been trying
to come to grips with what qualities of the Web give rise to its
desirable qualities and how that happens.  Publishable results of that
research aren't nearly ready for prime time, but I thought it might be
useful to offer a few thoughts, hypotheses, and observations.

Genericity and Compositional Complexity
------------------------------------------

NB:  "compositional style" and "architectural style" are often used
interchangeably by researchers in e.g. component architectures, and it's
instructive to consider architectural styles ala REST and the Web as
defining a "compositional algebra" for interaction with (and --- maybe
eventually --- among) resources.

The thing that most convinced me of the "rightness" of the REST POV (or
more appropriately, the "wrongness" of SOAP-RPC over HTTP) was a
hypothesis about the compositional complexity of systems of objects with
generic vs. type-specific interfaces.  I've suggested elsewhere that
the complexity of systems of interacting objects which have generic
interfaces grows as O(N) in the number of objects involved, while for
systems with type-specific interfaces it grows as O(N^2).  This
probably isn't accurate --- it's very difficult to define and quantify
what we're really measuring in that kind of statement.  Digging into
this topic over the last several months has led to some pretty weird
(alt., "fascinating" ;-) questions about "what is an interface," etc.,
and has to date resulted in an incomplete attempt to define and compare
formal compositional algebras for different compositional /
architectural styles.  However, it should be intuitive that the
compositional complexity of a system of different components in general
rises non-linearly with some notion of the aggregate complexity of all
the distinct single-component interfaces.
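
To make the intuition concrete, here's a minimal sketch (TypeScript;
all interface names are hypothetical, invented for illustration) of the
two styles.  With type-specific interfaces, every pair of component
types that interacts needs its own bespoke glue, so a fully connected
system of N types needs on the order of N^2 adapters; with one generic
interface, each component implements it once and a single piece of
client code composes with all N:

    // Type-specific style: each component exposes its own interface,
    // so a client that talks to K different component types needs K
    // distinct pieces of adapter code --- ~N^2 in a connected system.
    interface PrinterService { print(doc: string): void; }
    interface MailerService  { send(to: string, body: string): void; }
    // ...one bespoke interface (and bespoke glue) per component type.

    // Generic style: every component implements the same small,
    // uniform interface, so total glue grows as ~O(N).
    interface Resource {
      get(): string;
      put(state: string): void;
    }

    // One generic client (think: cache, proxy, crawler) composes with
    // *any* Resource, with no per-type knowledge.
    function replicate(src: Resource, dst: Resource): void {
      dst.put(src.get());
    }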

If this (or a similar) hypothesis about compositional complexity is true,
then any definition of the Web and any further elaboration of Web
architecture should explicitly capture the notion of interface
genericity for any given class ("type") of resources as central to the
Web.

Strong Typing
--------------

A thorough discussion of compositional complexity requires getting
into some pretty tricky questions of what constitutes a type, an
operation, an interface, etc.  It may, however, be instructive here to
introduce an informal notion of what constitutes a type:  a type is a
set of objects which support a given operation (interface) or set of
operations (interfaces).

The Web is already subtly but strongly typed!  It seems reasonable to
consider a URI protocol scheme as declaring the type of the object being
referred to;  protocol definitions describe the interface to that type
and application protocol implementations therefore implement the type's
interface(s).  HTTP (http://... addressed) resources belong to an Http
type, FTP resources are of type Ftp, etc.  In practice it's a bit
stickier than that for three reasons:  (1) HTTP (or any protocol where
all methods are not mandatory on all associated resources) is really an
arbitrary family of generalized / specialized types defined by the
permutations of methods supported by any given resource; (2) more
stateful protocols such as FTP, SMTP, etc. are more difficult to reason
about; and (3) the relationship between resources and representations
-wrt- typing is rather vague, though Jeff Mogul's recent paper is a
great start at helping clarify and define that relationship.  In
general, though, it's useful to consider URI schemes as defining
distinct types (or type families) of objects.
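
As a rough sketch of this "scheme declares the type" reading
(TypeScript again; the names are mine, not from any spec), one might
model each scheme's protocol as the interface that every resource
addressed under that scheme is expected to honor:

    // The http: scheme implies the HTTP interface; per point (1)
    // above, real resources support some subset of these methods, so
    // HttpResource is really a family of generalized / specialized
    // types rather than one monolithic type.
    interface HttpResource {
      get(): string;               // GET: retrieve a representation
      put(rep: string): void;      // PUT: replace the representation
      post(body: string): string;  // POST: submit data for processing
      delete(): void;              // DELETE: remove the resource
    }

    // The ftp: scheme implies a different (and more stateful) interface.
    interface FtpResource {
      retr(path: string): Uint8Array;              // RETR
      stor(path: string, data: Uint8Array): void;  // STOR
    }

    // The scheme alone tells a client which interface it may assume.
    function typeOf(uri: string): "HttpResource" | "FtpResource" | "unknown" {
      if (uri.startsWith("http:")) return "HttpResource";
      if (uri.startsWith("ftp:")) return "FtpResource";
      return "unknown";
    }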

SOAP-RPC
-------------

SOAP-RPC breaks HTTP / the Web because it breaks the strong typing
implied by HTTP.  It's analogous to taking an object of a given type and
somehow casting it to a completely different type.  The Web (in theory
--- particularly intermediaries) works because different components can
count on a consistent interface with consistent semantics for all
objects of type Http.  The motivation for SOAP-RPC seems reasonable at
first glance:  the typing system of HTTP is so general (particularly
-wrt- representations) and coarse-grained that it's not useful for
building traditional fine-grained, tightly coupled objects and types;
it's not clearly useful in modeling application domains the way we do
today.  However, this ignores the fact that there's "power in large
values" and that going to smaller, fine-grained interfaces with more
specific types produces tighter coupling and non-linear growth of
compositional complexity.
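
A sketch of the "cast" analogy (TypeScript; the service interface is
invented for illustration): to an HTTP intermediary every SOAP-RPC
call is the same opaque POST, while the interface actually being
invoked --- fine-grained and type-specific --- is hidden in the body:

    // What HTTP components (caches, proxies) are entitled to assume:
    interface HttpResource {
      get(): string;               // safe and cacheable, by definition
      post(body: string): string;  // opaque submission
    }

    // What SOAP-RPC actually invokes: an arbitrary application-specific
    // type that the HttpResource has effectively been "cast" to.
    interface StockQuoteService {
      getQuote(ticker: string): number;
    }

    // The tunnel: a read-only, cache-friendly operation travels as
    // POST, so its real semantics are invisible to every HTTP
    // intermediary along the way.
    function callGetQuote(endpoint: HttpResource, ticker: string): number {
      const envelope = `<Envelope><getQuote>${ticker}</getQuote></Envelope>`;
      return Number(endpoint.post(envelope));
    }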

REST and / vs. GRASP
------------------------

It's been said that "hypermedia is the engine of application state" for
the Web.  Graphs may play the same central paradigmatic role in building
Web software that e.g. arrays play in APL, lists play in Lisp, text
streams play in UNIX, etc.  We've had some heated discussions elsewhere
on how central "hypermedia" is or should be to REST and / or to the Web
in general;  regardless, it seems useful to distinguish between the
values and qualities that the Web derives from URI, from HTTP or other
generic applications protocols, and from the data models that are
communicated / exposed / implied by any application protocol.  Perhaps
this implied structural and processing model --- call it "GRASP," or
GRAph Structure and Processing --- is in some sense orthogonal to the
considerations of naming and generic interfaces inherent in a purely
abstract notion of representational state transfer.  (I'll leave it to
Roy to decide the definitional question. :-)
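
Read as graph structure and processing, the idea might be sketched
like this (TypeScript; the Representation shape and helper are
hypothetical): application state advances only by following links
discovered in the current representation, i.e., by traversing the
graph rather than by invoking hardcoded, type-specific entry points:

    // A representation is a node in the graph; its links are the edges.
    interface Representation {
      body: string;
      links: Map<string, string>;  // link relation -> target URI
    }

    type Fetch = (uri: string) => Representation;

    // Drive "application state" purely by link traversal.
    function follow(fetch: Fetch, start: string, rels: string[]): Representation {
      let rep = fetch(start);
      for (const rel of rels) {
        const next = rep.links.get(rel);
        if (next === undefined) throw new Error(`no "${rel}" link here`);
        rep = fetch(next);
      }
      return rep;
    }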

It's clear that any attempt to define the Web *needs* to consider
whether such an explicit paradigm is required;  does the Web in fact
have an underlying data model separate and distinct from the "typing"
implied by different protocol schemes in URI?  I would suggest that it
is important *not* to attempt to address this in the definition of the
Web;  it's an application protocol issue, if that.

Defining the Web
------------------

IMO, URI are the central abstraction of the Web.  The subtle but
important implication of this is that URI protocol schemes define
families of types and imply generic consistency of interfaces across
all resources in a given family.  Another implication is that the Web
must conceptually encompass all such URI types which define such generic
interfaces.  Any definition of the Web should work at this level of
abstraction;  it should avoid reference to any particular protocol
(HTTP) or representational structure / abstraction (hypertext, graphs)
while excluding systems and protocol schemes for which generic
consistency of interfaces is non-existent.  Hence, as mentioned before,
I think Mark Baker's definition (with minor tweaks; note v2 gets rid
of the icky word "manipulable") does the best job of any that have been
offered at capturing the grand vision of the Web:

   "The World Wide Web ("Web") is a networked information space which
   encompasses all things with identity ("resources").  Resources are
accessed
   and manipulated via generic interfaces whose semantics are applicable
to all
   resources in a given address scheme and application protocol."

----

Researching and thinking about these topics over the last year has been
a tremendously rewarding intellectual endeavor for me, so thanks to
those of you who inspired me down these paths. :-)  I don't know that
I've got any really strong or novel *conclusions* in any of this, but
hopefully some of these half-baked intuitions and opinions might be
useful or at least evocative for somebody.  ;-)

jb
