- From: Jeff Bone <jbone@jump.net>
- Date: Tue, 26 Mar 2002 16:53:48 -0600
- To: www-tag@w3.org
A couple of the topics being discussed lately --- "Defining the Web" and "SOAP breaks HTTP?" --- touch on some issues that I've been researching and contemplating for most of the last year. Convinced and inspired by Roy's and Mark's pro-REST arguments last summer, I've since been trying to come to grips with which characteristics of the Web give rise to its desirable qualities, and how that happens. Publishable results of that research aren't nearly ready for prime time, but I thought it might be useful to offer a few thoughts, hypotheses, and observations.

Genericity and Compositional Complexity
------------------------------------------

NB: "compositional style" and "architectural style" are often used interchangeably by researchers in e.g. component architectures, and it's instructive to consider architectural styles a la REST and the Web as defining a "compositional algebra" for interaction with (and --- maybe eventually --- among) resources.

The thing that most convinced me of the "rightness" of the REST POV (or, more appropriately, the "wrongness" of SOAP-RPC over HTTP) was a hypothesis about the compositional complexity of systems of objects with generic vs. type-specific interfaces. I've suggested elsewhere that the complexity of a system of interacting objects with generic interfaces grows as O(N) in the number of objects involved, while for a system with type-specific interfaces it grows as O(N^2). This probably isn't accurate --- it's very difficult to define and quantify what we're really measuring in that kind of statement. Digging into this topic over the last several months has led to some pretty weird (alt., "fascinating" ;-) questions about "what is an interface," etc., and has to date resulted in an incomplete attempt to define and compare formal compositional algebras for different compositional / architectural styles. However, it should be intuitive that the compositional complexity of a system of different components in general rises non-linearly with some notion of the combined complexity of all the distinct single-component interfaces.

If this (or a similar) hypothesis about compositional complexity is true, then any definition of the Web and any further elaboration of Web architecture should explicitly capture the notion of interface genericity for any given class ("type") of resources as central to the Web.
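For whatever it's worth, here's a back-of-the-envelope way to see the shape of that claim. It's purely illustrative --- the "glue count" below is my own stand-in for whatever we'd really be measuring, not a formal result:

    # Rough sketch of the O(N) vs. O(N^2) intuition, in Python.
    # Assume a system of N distinct component types and count how many
    # pieces of interface "glue" must exist for every component to be
    # able to interact with every other.

    def glue_type_specific(n: int) -> int:
        # Each component exposes its own bespoke interface, so every
        # ordered pair of distinct components needs its own adapter.
        return n * (n - 1)

    def glue_generic(n: int) -> int:
        # Each component implements the same generic interface, so it
        # needs exactly one mapping onto that shared contract.
        return n

    for n in (5, 10, 50, 100):
        print(f"N={n:4d}  type-specific={glue_type_specific(n):6d}  generic={glue_generic(n):4d}")

The absolute numbers are meaningless; the point is only that one curve is linear and the other quadratic in the number of distinct interfaces in play.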
Strong Typing
--------------

A thorough discussion of compositional complexity requires getting into some pretty tricky discussions of what constitutes a type, an operation, an interface, etc. It may, however, be instructive in this discussion to introduce an informal notion of what constitutes a type: a type is a set of objects which support a given operation (interface) or set of operations (interfaces.)

The Web is already subtly but strongly typed! It seems reasonable to consider a URI protocol scheme as declaring the type of the object being referred to; protocol definitions describe the interface to that type, and application protocol implementations therefore implement the type's interface(s). HTTP (http://... addressed) resources belong to an Http type, FTP resources are of type Ftp, etc.

In practice it's a bit stickier than that, for three reasons: (1) HTTP (or any protocol where all methods are not mandatory on all associated resources) is really an arbitrary family of generalized / specialized types, defined by the permutations of methods supported by any given resource; (2) more stateful protocols such as FTP, SMTP, etc. are more difficult to reason about; and (3) the relationship between resources and representations -wrt- typing is rather vague, though Jeff Mogul's recent paper is a great start at helping clarify and define that relationship. In general, though, it's useful to consider URI schemes as defining distinct types (or type families) of objects.

SOAP-RPC
-------------

SOAP-RPC breaks HTTP / the Web because it breaks the strong typing implied by HTTP. It's analogous to taking an object of a given type and somehow casting it to a completely different type. The Web (in theory --- particularly intermediaries) works because different components can count on a consistent interface with consistent semantics for all objects of type Http. The motivation for SOAP-RPC seems reasonable at first glance: the typing system of HTTP is so general (particularly -wrt- representations) and so coarse-grained that it's not useful for building traditional fine-grained, tightly coupled objects and types; it's not clearly useful for modeling application domains the way we do today. However, this ignores the fact that there's "power in large values," and that moving to smaller, fine-grained interfaces with more specific types produces tighter coupling and non-linear growth of compositional complexity.
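To make the last two sections a bit more concrete, here's a minimal sketch of the two interface styles side by side. The names (HttpResource, OrderService, getOrderStatus, etc.) are mine and purely hypothetical --- the point is the shape of the interfaces, not any particular implementation:

    from typing import Protocol

    # The informal typing notion above: "Http" is the set of all things
    # that support the generic HTTP interface. An intermediary (cache,
    # proxy, link checker) can work against this one contract without
    # knowing anything about the application domain behind a given URI.
    class HttpResource(Protocol):
        def GET(self) -> bytes: ...             # safe, cacheable retrieval
        def PUT(self, rep: bytes) -> None: ...  # idempotent replacement
        def DELETE(self) -> None: ...
        def POST(self, rep: bytes) -> bytes: ...

    # SOAP-RPC over HTTP instead tunnels an application-specific,
    # fine-grained interface through a single generic method: everything
    # arrives at an intermediary as "POST to one URI," with the method
    # that actually matters buried in the entity body, so none of the
    # generic Http semantics (safety, idempotence, cacheability) can be
    # relied on any longer.
    class OrderService(Protocol):
        def getOrderStatus(self, order_id: str) -> str: ...
        def cancelOrder(self, order_id: str) -> None: ...

Nothing in the first interface mentions orders at all, which is exactly the point: genericity is what lets N very different kinds of resources share one contract instead of requiring N different ones, and exposing the second interface through the first is the "cast to a completely different type" described above.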
REST and / vs. GRASP
------------------------

It's been said that "hypermedia is the engine of application state" for the Web. Graphs may play the same central paradigmatic role in building Web software that e.g. arrays play in APL, lists play in Lisp, text streams play in UNIX, etc. We've had some heated discussions elsewhere about how central "hypermedia" is or should be to REST and / or to the Web in general; regardless, it seems useful to distinguish between the values and qualities that the Web derives from URI, from HTTP or other generic application protocols, and from the data models that are communicated / exposed / implied by any application protocol. Perhaps this implied structural and processing model --- call it "GRASP," or GRAph Structure and Processing --- is in some sense orthogonal to the considerations of naming and generic interfaces inherent in a purely abstract notion of representational state transfer. (I'll leave it to Roy to decide the definitional question. :-)

It's clear that any attempt to define the Web *needs* to consider whether such an explicit paradigm is required: does the Web in fact have an underlying data model separate and distinct from the "typing" implied by the different protocol schemes in URI? I would suggest that it is important *not* to attempt to address this in the definition of the Web; it's an application protocol issue, if that.

Defining the Web
------------------

IMO, URI are the central abstraction of the Web. The subtle but important implication of this is that URI protocol schemes define families of types, and imply generic consistency of interfaces across all resources in a given family. Another implication is that the Web must conceptually encompass all such URI types which define such generic interfaces. Any definition of the Web should work at this level of abstraction; it should avoid reference to any particular protocol (HTTP) or representational structure / abstraction (hypertext, graphs), while excluding systems and protocol schemes for which generic consistency of interfaces is non-existent.

Hence, as mentioned before, I think Mark Baker's definition (with minor tweaks; note v2 to get rid of the icky word "manipulable") does the best job of any that have been offered at capturing the grand vision of the Web:

"The World Wide Web ("Web") is a networked information space which encompasses all things with identity ("resources"). Resources are accessed and manipulated via generic interfaces whose semantics are applicable to all resources in a given address scheme and application protocol."

----

Researching and thinking about these topics over the last year has been a tremendously rewarding intellectual endeavor for me, so thanks to those of you who inspired me down these paths. :-) I don't know that I've got any really strong or novel *conclusions* in any of this, but hopefully some of these half-baked intuitions and opinions might be useful or at least evocative for somebody. ;-)

jb