One of the oft-repeated requirements stated for XML Protocol is the ability to interpose intermediaries into the message chain. This document discusses the motivations for this requirement, explores the nature of intermediaries in XML Protocol, and relates it to the requirements stated for XML Protocol as well as those which may be satisfied by XP Modules.
Application-layer protocols might be thought of as falling into one of two groups in relation to intermediaries: those which expressly include an intermediary model, and those which have one retrofitted at a later date.
For example, SMTP[XX] uses intermediaries to provide reliable service: the client hands a message to a local intermediary, which stores it and then takes responsibility for forwarding it to the intermediary nominated to deliver the message to the final recipient. This store-and-forward model allows reliable and highly-available message delivery. SMTP intermediaries also provide access into networks, by acting as gateways. These functions are tightly defined, enhancing SMTP service transparency, availability and reliability.
On the other hand, intermediaries were retrofitted into HTTP to allow the Web to scale more efficiently. Originally, the protocol required clients to contact servers directly to satisfy each request. When the Web experienced unprecedented growth, servers and the network infrastructure could not scale quickly enough to satisfy demand. As a result, HTTP/1.0[XX] introduced intermediaries (proxies and gateways), which could take advantage of locality in requests to cache the responses. Although more intermediary-related functionality was added in HTTP/1.1[XX], the continued growth of the Web sparked the development of further measures: surrogates[XX] (informally known as "reverse proxies"), often deployed in "Content Delivery Networks."[XX]
The contrast between these examples bears examination. Because an intermediary model was designed into SMTP, its intermediaries perform in a well-defined manner, and are easy to interpose into the message path. On the other hand, HTTP has ongoing problems caused by the interposition of intermediaries; intermediaries do not always have a clear understanding of message semantics[XX], and location of an appropriate intermediary is problematic[XX].
Other (non-exhaustive) examples of intermediary models designed into protocols include NTP[XX], NIS[XX] and DNS[XX], all of which use intermediaries to both scale a service and provide reliability. Network filesystems such as CODA[XX] and AFS[XX], as well as the UDP-based AAA protocol RADIUS[XX] also explicitly support intermediaries.
Conversely, FTP[XX] did not include an intermediary model, resulting in a clumsy retrofitting of intermediaries as gateways at a later date[XX]. NNTP is an interesting case, in that it was designed with a broadcast intermediary model[XX]; because of the scaling issue of this for individual nodes, caching intermediaries were later developed to offer an alternative to subscribing to a full newsfeed[XX].
Finally, SOCKS[XX] is the extreme solution for protocols which do not accommodate intermediaries - it allows the configuration of a proxy where networks need to control routing of applications. To incorporate it, the SOCKS library must be inserted as a shim between the application and the network stack, redirecting client connections to the intermediary.[XX]
Overall, experience shows that design of intermediaries into a protocol is preferable to retrofitting them at a later date. Although the most successful intermediary models tend to be in application-specific protocols (such as DNS, NTP, etc.), it is possible to do so in a transfer protocol as well. Addressing a wide range of application-level requirements whilst remaining transport-independent will be the greatest challenge for this task in XML Protocol. In this respect, HTTP may be most interesting as a model, as it has a variety of intermediary roles, it is the default transport binding for XML Protocol, and attempts have been made at introducing a processing model into HTTP already [XX].
Intermediaries are often used within protocols to allow message routing and control, as well as to obtain application performance and scaling improvements. By interposing intermediaries, an application can become more distributed, allowing the layering of services (e.g., message processing), as well as performance improvement (e.g., caching) and reliability (e.g., store-and-forward).
It is important to separate and classify these motivations, as they will determine the characteristics of a particular intermediary deployment. Often, there are multiple motivations for deploying an intermediary; for example, it may simultaneously provide a routing function whilst implementing a cache to improve performance.
The most basic function of an intermediary is routing, where the intermediary acts as a choke point for purposes of access or traffic control, logging or relay. Typically, routing intermediaries are deployed at administrative boundaries in the network, in order to both control the traffic flowing across them, and simplify administrative tasks such as client configuration.
For example, HTTP defines the Proxy role to enable local request routing through firewalls, or to enable shaping traffic across specific links. In a different manner, DNS uses intermediaries to recursively resolve requests to the correct server, allowing aggregation of a number of requests for authoritative data and exposing a single resolution authority to clients.
Often, intermediaries may offer multiple paths to a resource, increasing the reliability of that service. Additionally, some forms of caching may increase service reliability, if authority for a resource can be transferred to the intermediary.
For example, NTP incorporates reliability through the specification of multiple intermediaries, while SMTP uses store-and-forward as a means of caching writes when the authority (MX server) is not available.
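The "multiple paths" idea can be illustrated with a minimal sketch. It assumes each intermediary is represented by a plain callable that either returns a reply or raises ConnectionError; the function name and that representation are hypothetical, chosen only for illustration, and are not part of any of the protocols named above.

```python
from typing import Callable, Iterable, List

# Hypothetical representation: an intermediary is a callable that forwards
# a message and returns the reply, or raises ConnectionError if unreachable.
Forwarder = Callable[[str], str]


def send_via_first_available(message: str, intermediaries: Iterable[Forwarder]) -> str:
    """Try each intermediary in order and return the first successful reply."""
    errors: List[Exception] = []
    for forward in intermediaries:
        try:
            return forward(message)
        except ConnectionError as exc:
            # This path is down; remember why and try the next one.
            errors.append(exc)
    raise ConnectionError(f"all {len(errors)} intermediaries failed")
```

A sender configured with several intermediaries only fails when every path fails, which is the sense in which extra paths add reliability.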
Message processing, in which intermediaries are allowed to modify messages, potentially offers the richest possibilities for intermediaries.
For example, servers can offload processing by including processing instructions to intermediaries in their messages. Additionally, message processing intermediaries may be able to incorporate local information unavailable at the server, to provide localized, value-added services.
While protocols often define semantics to allow limited processing by intermediaries (for such things as message caching, timestamping and routing), they generally are either very application-specific and well-defined behaviors (SMTP routing), or weak, advisory controls (HTTP caching). Recently, there has been work to retrofit a more capable processing model into HTTP [XX][XX]. Unfortunately, it faces a number of problems due to the fact that HTTP is already widely deployed.
Message processing by intermediaries that do not act on behalf of either the message sender or receiver may introduce privacy, security and integrity concerns, as they are capable of examining and modifying the message without the knowledge of either party.
Caching may be thought of as a limited form of message processing; based on message semantics, an intermediary may choose to store a copy of a message to use again.
Caching is primarily used to reduce perceived latency, flatten network bandwidth use, and provide message reliability functions.
For example, DNS uses caching to take advantage of temporal and network locality in requests, enabling a hierarchy of servers to distribute load and provide a worldwide, highly reliable service based on a very small number of authoritative servers.
Typically, techniques such as validation, invalidation, and freshness controls are used to indicate how a particular message should be cached. Caching requires knowledge of the application and integration into the message semantics to determine what is appropriate to cache.
For example, both DNS and HTTP define message freshness semantics for use by clients, including intermediaries. HTTP also provides validation techniques. Invalidation techniques for HTTP are currently under consideration [XX].
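As a rough illustration of freshness and invalidation, the following sketch keeps a copy of a message for a stated lifetime and stops serving it once that lifetime has elapsed. The class and method names are hypothetical, and the model is deliberately simpler than the freshness rules of DNS or HTTP; validation with the origin is only noted in a comment.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    message: str
    stored_at: float  # seconds, as reported by the injected clock
    ttl: float        # freshness lifetime in seconds


class FreshnessCache:
    """Toy intermediary cache; all names here are hypothetical."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, CacheEntry] = {}

    def store(self, key: str, message: str, ttl: float) -> None:
        self._entries[key] = CacheEntry(message, self._clock(), ttl)

    def lookup(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        age = self._clock() - entry.stored_at
        if age >= entry.ttl:
            # Stale. A fuller implementation might validate the copy with
            # the origin here instead of simply discarding it.
            del self._entries[key]
            return None
        return entry.message

    def invalidate(self, key: str) -> None:
        # Invalidation: the authority signals that the copy must not be reused.
        self._entries.pop(key, None)
```

The clock is injected so that the behaviour can be exercised without waiting for real time to pass.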
XML Protocol is a general, rather than application-specific, protocol; it is being designed to carry a variety of payloads for applications with differing characteristics and requirements. In this aspect, it joins other "transfer" protocols such as FTP, SMTP and HTTP. Such protocols are somewhat more challenging to design by nature, both because of their scope and the lack of a tight binding between the application semantics and the protocol. In effect, such transfer protocols are layered underneath the application, adding further abstraction.
Because the nature of XML Protocol payloads is so diverse, and specific application semantics are necessarily opaque to the protocol, there must be an extensible intermediary model that allows new functionality to be targeted specifically at a particular intermediary. This is achieved through the 'actor' attribute on XP Modules.
XML Protocol is also somewhat unique in that it is an explicitly layered solution, and may either be an application-layer protocol in itself, or may be used in conjunction with another transfer protocol. For example, the default binding is HTTP;
... -> TCP (transport) -> HTTP (application / transfer) -> XP (application)
while a pure TCP binding has also been discussed;
... -> TCP (transport) -> XP
One of the main goals of XML Protocol is to introduce a processing model to messages; that is, define rules for how XML Protocol Processors should handle messages and message extensions. This enables the protocol to be extended naturally, and also allows intermediaries to easily interpose high-level services onto messages.
XML Protocol introduces the 'actor' attribute to allow messages to request (or require) the application of services by an intermediary. In addition to this explicit model, services may be applied by the intermediary itself, without explicit targeting in the message.
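How an intermediary might pick out the parts of a message targeted at it can be sketched as follows. Since the XML Protocol syntax is not fixed here, the sketch assumes the SOAP 1.1 envelope namespace, its 'actor' attribute and its 'next' actor URI as stand-ins; the function name, the example header blocks and the example URIs are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: SOAP 1.1 names are used as a stand-in for XML Protocol syntax.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_NEXT = "http://schemas.xmlsoap.org/soap/actor/next"

EXAMPLE = """<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header>
    <t:Timestamp xmlns:t="urn:example:timestamp"
                 env:actor="http://proxy.example.org/">please stamp</t:Timestamp>
    <c:Cache xmlns:c="urn:example:cache"
             env:actor="http://schemas.xmlsoap.org/soap/actor/next">ttl=60</c:Cache>
    <a:Auth xmlns:a="urn:example:auth">for the endpoint</a:Auth>
  </env:Header>
  <env:Body/>
</env:Envelope>"""


def blocks_for(envelope_xml: str, my_uri: str) -> List[ET.Element]:
    """Return the header blocks targeted at this intermediary.

    A block is selected if its actor attribute names this intermediary or
    the 'next' actor; blocks without an actor are left for the endpoint.
    """
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    actor_attr = f"{{{SOAP_ENV}}}actor"
    return [block for block in header if block.get(actor_attr) in (my_uri, ACTOR_NEXT)]
```

With the example envelope, an intermediary identified as http://proxy.example.org/ would select the Timestamp and Cache blocks, while any other intermediary on the path would select only the Cache block.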
Intermediaries have a number of characteristics which define the nature of their deployment. While not an exhaustive listing, these design dimensions attempt to capture the range of salient characteristics of an XML Protocol intermediary deployment.
On its own, "intermediary" is a vague term; because both the OSI stack[XX] and XML Protocol take a layered approach, there are several places in which an intermediary may be usefully interposed. In the XML Protocol Requirements document[XX], there is already reference to "transport intermediaries", which are XP transport-layer intermediaries used only for routing (the XP transport layer may correspond to either the transport or application layer in the OSI stack), and "processing intermediaries", which act as message processors.
For example, an HTTP Proxy deployed in a private network to provide access to the wider Internet is a transport intermediary; it exists only as a routing mechanism. On the other hand, an intermediary which timestamps messages on the way out of the network needs to understand and process the XML Protocol message, and is a processing intermediary.
For the purposes of XML Protocol, it may be most useful to disregard the extremes; exclusively low-level (such as physical and network transport) and high-level (such as business logic) intermediaries do not add substantial meaning to the XML Protocol model.
It should be noted that often, intermediaries at different layers coincide in the same device. For example, a proxy may be deployed as an XP transport-layer routing mechanism (e.g., to get through a firewall), while it also acts as a processing intermediary to provide value-added services (e.g., to digitally sign all outgoing messages).
Intermediaries are usually deployed on behalf of one of three parties: the end-user, the access provider, or the content provider. The party on whose behalf an intermediary is deployed is its sponsor. For example, an ISP may deploy a proxy to flatten bandwidth, in which case it is the sponsor. However, if the same ISP deploys intermediaries and contracts with content providers to provide services through them, the binding nature of the contract makes the content providers the sponsors.
The sponsor of an intermediary is the largest influence over what kind of services it will execute. For example, an intermediary deployed on behalf of a service provider will probably not be trusted by either the end user or the content provider. However, this may change if a robust trust model can be developed for such services [XX].
Additionally, work is underway which may enable sponsorship to be determined on the granularity of an intermediary service, rather than a device; see the proposed OPES[XX] and CDI[XX] IETF working groups.
Messages must be routed to an intermediary in some fashion, in order for it to be interposed into the message path. For most addressable services, there are three ways to accomplish this: through a proxy, through a surrogate, or by interception.
XML Protocol requires that intermediaries be targetable, so that services may be explicitly activated. Proxies are easily targetable, as they are known to the client. For an application to address a surrogate, the sender needs to be given information to allow it to target the intermediary. Because interception intermediaries are seldom known to the sender or receiver of a message, they are not usually addressable.
If intermediaries are to be fully functional, they will need to be able to be used as transport binding gateways. That is, it should be possible for an intermediary to accept a message using one protocol binding (for example, HTTP) and forward it with another (for example, SMTP).
One of the natural evolutions of XML Protocol capabilities will be to allow XML Protocol Modules access into the transport binding, to add such functionality as Quality of Service, encryption services, and routing functions.
For these functionalities to coexist, there must be some way to map transport-related services in a manner independent from the transport binding, so that the services may cross these boundaries.
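One way to picture a binding-independent mapping is a gateway that lifts transport-related services out of the inbound binding into a neutral form, then expresses them again in the outbound binding. The sketch below is purely illustrative: the NeutralMessage structure, the "priority" service and both header field names are hypothetical, and no such mapping is defined by XML Protocol, HTTP or SMTP.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class NeutralMessage:
    """Binding-independent view of a message (hypothetical structure)."""
    envelope: str  # the XML Protocol envelope, passed through untouched
    services: Dict[str, str] = field(default_factory=dict)  # e.g. {"priority": "high"}


# Hypothetical per-binding field names for the same abstract service.
HTTP_FIELDS = {"priority": "X-Example-Priority"}
SMTP_FIELDS = {"priority": "X-Example-Precedence"}


def from_http(headers: Dict[str, str], body: str) -> NeutralMessage:
    """Lift transport-related services out of an HTTP-bound message."""
    services = {name: headers[field_name]
                for name, field_name in HTTP_FIELDS.items()
                if field_name in headers}
    return NeutralMessage(envelope=body, services=services)


def to_smtp(message: NeutralMessage) -> Tuple[Dict[str, str], str]:
    """Express the same services using the SMTP binding's field names."""
    headers = {SMTP_FIELDS[name]: value
               for name, value in message.services.items()
               if name in SMTP_FIELDS}
    return headers, message.envelope
```

The envelope itself is never rewritten; only the transport-level expression of each service changes as the message crosses the gateway.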
The ability to target XML Protocol Modules to specific intermediaries brings about the need to find a way to nominate them. Additionally, the status and error reporting requirements need a mechanism to identify the intermediary which generates such a message. There is no URI scheme specified for identifying an intermediary; schemes such as HTTP are meant to identify endpoints.
HTTP does provide for the identification of intermediaries in the Via header, but does not tightly specify a naming convention. As a result, definition of a new URI scheme may be required to accommodate intermediaries, depending on whether or not it is judged important to distinguish them from protocol endpoints.
Additionally, indirect means of nominating intermediaries may be useful. SOAP defines a relative URI for addressing the 'next-hop' intermediary. Along with additional relative URIs, class-based nomination of intermediaries may prove valuable.
For example, it may be desirable to target an XML Protocol Module (such as a caching module) at a particular vendor's intermediary products, because the message sender knows that those products will handle the processing instructions correctly. A mechanism to allow message senders to richly identify intermediaries by class would enable this functionality.
Although intermediaries are explicitly defined and accommodated by XML Protocol, they can only be functional if there are application semantics defined to take advantage of them. XML Protocol's Modules offer an excellent opportunity to standardize common intermediary functions.
The determination of where an intermediary should forward a message can be affected by a number of factors. It may be determined at the transport layer, either by resolving the service URI (if the intermediary is a proxy or an interception device) or through device configuration (if it is a surrogate). Conversely, it may be determined in the application layer, either by explicit message semantics or interpretation of the message semantics, combined with device configuration.
Routing considerations have a great influence on intermediary deployment characteristics as well as the nature of the services which they provide. Routing also can be considered an intermediary service itself, either as simple forwarding or through use of more complex patterns such as application-layer multicast, recursion, message aggregation and gossip protocols.
Some XML Protocol applications may wish to make caching possible for latency, bandwidth use or other gains in efficiency. To enable this, it should be possible to assign cacheability in a variety of circumstances.
For example, "read" caching might be used to store messages at intermediaries for reuse in the response phase of the request/response message exchange pattern. Such caching might be on the scope of an entire message, an XML Protocol module, or scoped to individual XML Protocol module elements, including body elements.
Similarly, "write" caching may be useful in situations when a request message in a request/response message exchange pattern (as well as similar messages in other message exchange patterns) does not need to be immediately forwarded or responded to. Such cacheability might be scoped by different methods, as outlined above.
Cacheability scoped by different elements might be associated by an attribute to the target element, through use of XML Query or XPath to describe the target elements in a header, or implied by the document schema, for example.
Cacheability mechanisms applied to messages, bodies or elements might include time-to-live (delta time), expiry (absolute time), entity validation, temporal validation, subscription to invalidation services, and object update/purge.
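Element-scoped cacheability expressed through an attribute on the target element might look like the following sketch. The namespace urn:example:cacheability, its 'ttl' attribute and the example document are all hypothetical; only the time-to-live mechanism from the list above is shown.

```python
import xml.etree.ElementTree as ET
from typing import Dict

CACHE_NS = "urn:example:cacheability"  # hypothetical namespace
TTL_ATTR = f"{{{CACHE_NS}}}ttl"        # hypothetical time-to-live attribute, in seconds

EXAMPLE = """<quotes xmlns:c="urn:example:cacheability">
  <price c:ttl="30">42.10</price>
  <company c:ttl="86400">Example Corp</company>
  <session>abc123</session>
</quotes>"""


def cacheable_elements(xml_text: str) -> Dict[str, int]:
    """Map each element that carries a ttl attribute to its lifetime in seconds.

    Elements without the attribute are treated as uncacheable. Keys are
    element tags, so this toy version assumes tags are unique in the message.
    """
    root = ET.fromstring(xml_text)
    lifetimes: Dict[str, int] = {}
    for element in root.iter():
        ttl = element.get(TTL_ATTR)
        if ttl is not None:
            lifetimes[element.tag] = int(ttl)
    return lifetimes
```

In the example, the price and company elements may be reused for different lengths of time, while the session element is never cached.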
Finally, some applications may be capable of describing the dependencies and relationships between message elements. For example, a response element may be applicable to a wide range of requests; it would be beneficial to describe this element's relationship with request elements, so that it may satisfy a wide range of requests in an economical fashion. Similarly, the presence of a particular element may be a trigger for a cacheability mechanism to be applied to another element, such as validation or invalidation.
The possibility of a processing interception intermediary - one that is interposed without knowledge of the content provider or end user - introduces a number of possible problems to XML Protocol applications.
The most obvious danger is the compromise of message privacy; for example, financial or other highly-sensitive information may be intercepted and misused. A more subtle danger is that of message modification, which introduces uncertainty about the content of messages. For example, in a request/response message exchange pattern, either message can be modified in transit, resulting in misrepresentation of either the client's intent in making the request, or the server's response in fulfilling it.
These concerns are especially relevant, considering efforts [XX][XX] to standardize processing models for many transfer protocols which XML Protocol intends to use in transport bindings (including HTTP and SMTP). Interception intermediaries may imperfectly interpret message semantics, interfering with applications which use XML Protocol.
There are a number of possible remedies which, at the least, allow XML Protocol users to detect or avoid message modification:
Application-Layer Message Integrity - to detect casual or accidental message modification, a service may require a Module to carry a hash of the message. This method is vulnerable to modification of the hash itself, but exhibits low overhead.
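A minimal sketch of the hash-based check follows. It assumes SHA-256 over the exact bytes of the message body; the choice of algorithm, the function names, and the absence of any XML canonicalization step are all assumptions made for illustration.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Hex digest that the sender would carry in a Module alongside the body."""
    return hashlib.sha256(body).hexdigest()


def body_unmodified(body: bytes, carried_digest: str) -> bool:
    """Recompute the digest on receipt and compare it with the carried one.

    This only detects casual or accidental change: a party able to rewrite
    the body can also rewrite the carried digest.
    """
    return body_digest(body) == carried_digest
```

Because the comparison is over raw bytes, an intermediary that merely re-serializes the XML would also be reported as a modification.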
While these mechanisms have been discussed as necessary extensions to define in XML Protocol, the possibility of modification of any XML Protocol message brings the need to use them for all messages, not only those which contain sensitive content.
The IETF WREC - Web Replication and Caching[XX] - Working Group was chartered to consider those issues which are tightly bound to intermediaries. It has produced a taxonomy for the area[XX] and a list of known problems in the field [XX], and is currently awaiting publication of these documents before closing down. The WEBI - Web Intermediaries[XX] - Working Group is attempting to address some of the problems identified and discussed in WREC, including communication and coherence between object authorities (origin servers) and intermediaries, as well as mechanisms to improve intermediary discovery.
The MIDCOM - Middlebox Communication[XX] - Working Group is working on frameworks for interacting with network-layer intermediaries, such as NAT boxes[XX] and firewalls. The group's future work may extend to other types of intermediaries.
At a more theoretical but no less important scale, intermediaries are part of a larger discussion on the role of the end-to-end principle[XX] in application and transport design on the Internet. Recent work by the original authors has focused on balancing the end-to-end arguments with the changing requirements of the Internet [XX].
[RFC2775]
DRAFT version 0.1 - February 5, 2001