Re: Intermediaries and XML Protocol from Mark Baker on 2001-02-08 (xml-dist-app@w3.org from February 2001)

From: Mark Baker <mark.baker@canada.sun.com>
Date: Thu, 08 Feb 2001 14:35:46 -0500
To: Mark Nottingham <mnot@akamai.com>
CC: XML Distributed Applications List <xml-dist-app@w3.org>
Message-ID: <3A82F512.38893BF5@canada.sun.com>

Mark,

Thanks for this.  It was a good read.  I like the discussion about
protocols other than just the usual suspects of HTTP & SMTP.

(BTW, the attachment was text/plain even though the content was XHTML.)

>On the other hand, intermediaries were retrofitted into HTTP to allow the
>Web to scale more efficiently. Originally, the protocol required clients to
>contact servers directly to satisfy each request. When the Web experienced
>unprecedented growth, servers and the network infrastructure could not scale
>quickly enough to satisfy demand. As a result, HTTP/1.0[XX] introduced
>intermediaries (proxies and gateways), which could take advantage of locality
>in requests to cache the responses. Although more intermediary-related
>functionality was added in HTTP/1.1[XX], the continued growth of the Web
>sparked the development of further measures; surrogates[XX] (informally known
>as "reverse proxies"), often deployed in "Content Delivery Networks."[XX]

I would suggest that it's hard to claim that any feature could be
retrofitted into a 1.0 version of anything.  Versions before 1.0 are
generally considered incomplete.

Also, though proxy support wasn't explicit in 0.9, nothing prevented an
HTTP processor from accepting GETs for content not directly under its
control.

>The contrast between these examples bears examination. Because an
>intermediary model was designed into SMTP, its intermediaries perform in a
>well-defined manner, and are easy to interpose into the message path. On the
>other hand, HTTP has ongoing problems caused by the interposition of
>intermediaries; intermediaries do not always have a clear understanding of
>message semantics[XX],

How so?  I can understand this for RPC-over-HTTP solutions, but not for
normal uses of HTTP.  What is that missing reference?

> and location of an appropriate intermediary is
>problematic[XX].

Right, though I don't believe that's such a big deal.

>Overall, experience shows that design of intermediaries into a protocol is
>preferable to retrofitting them at a later date.

I'd agree with that.

> Although the most successful
>intermediary models tend to be in application-specific protocols (such as DNS,
>NTP, etc.), it is possible to do so in a transfer protocol as well.

I'm unclear what you mean here.  HTTP is both an application-specific
protocol and a transfer protocol.

>While protocols often define semantics to allow limited processing by
>intermediaries (for such things as message caching, timestamping and routing),
>they generally are either very application-specific and well-defined behaviors
>(SMTP routing), or weak, advisory controls (HTTP caching). Recently, there has
>been work to retrofit a more capable processing model into HTTP [XX][XX].
>Unfortunately, it faces a number of problems due to the fact that HTTP is
>already widely deployed.
>
>Message processing by intermediaries that do not act on behalf of either
>the message sender or receiver may introduce privacy, security and integrity
>concerns, as they are capable of examining and modifying the message without
>the knowledge of either party.

Right, but that's the point.  HTTP's trust model is one where explicit
trust is necessary for composing the chain.  The term "weak" above
appears to suggest that this is necessarily something that needs
improving, but I don't believe that's the case.  Or more precisely, I
don't think it needs improving to solve the problems HTTP was designed
to solve.

>XML Protocol is also somewhat unique in that it is an explicitly layered
>solution, and may either be an application-layer protocol in itself, or may be
>used in conjunction with another transfer protocol. For example, the default
>binding is HTTP;

I disagree that XML Protocol may be an application-layer protocol.  We
are chartered not to define any application semantics of our own, and
though an out exists for us to do so, I don't think we should assume
that we will take it.  The RPC use of XP defines no application
semantics either; that can only be done by agreeing on some APIs or
some parameters.

>For the purposes of XML Protocol, it may be most useful to disregard the
>extremes; exclusively low-level (such as physical and network transport) and
>high-level (such as business logic) intermediaries do not add substantial
>meaning to the XML Protocol model.

I'm unclear what you mean here, wrt the "business logic" comment.  Can
XP intermediaries not be used for composing processors of business
logic?

>The ability to target XML Protocol Modules to specific intermediaries
>brings about the need to find a way to nominate them. Additionally, the status
>and error reporting requirements need a mechanism to identify the intermediary
>which generates such a message. There is no URI scheme specified for
>identifying an intermediary; schemes such as HTTP are meant to identify
>endpoints.

I agree, but it's a bit early to say that.

>The HTTP does provide for the identification of intermediaries in the Via
>header, but does not tightly specify a naming convention. As a result,
>definition of a new URI scheme may be required to accommodate intermediaries,
>depending one whether or not it is judged important to distinguish them from
>protocol endpoints.

Editorial: no need for "The" at the start, and "one" should be "on".
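
To illustrate the naming point, here is a rough sketch in Python of
pulling the intermediaries out of a Via header value.  The names
ViaEntry and parse_via are hypothetical, and the parsing is simplified
(it assumes comments contain no commas).  The sample value follows the
style of the example in the HTTP/1.1 specification, where one hop is
named by a pseudonym and the other by a host name:

```python
import re
from typing import List, NamedTuple, Optional


class ViaEntry(NamedTuple):
    """One hop recorded in a Via header (hypothetical helper type)."""
    protocol: str            # e.g. "1.1" or "HTTP/1.1"
    received_by: str         # host[:port] or an arbitrary pseudonym
    comment: Optional[str]   # optional free-text comment, without parentheses


# protocol, then received-by, then an optional "(comment)".
_VIA_ENTRY = re.compile(r"^\s*(\S+)\s+([^\s(]+)\s*(?:\((.*)\))?\s*$")


def parse_via(value: str) -> List[ViaEntry]:
    """Split a Via header value into its hops.

    Simplification: entries are split on commas, so a comment that
    itself contains a comma is not handled.
    """
    entries = []
    for part in value.split(","):
        match = _VIA_ENTRY.match(part)
        if match:
            entries.append(ViaEntry(match.group(1), match.group(2), match.group(3)))
    return entries


hops = parse_via("1.0 fred, 1.1 nowhere.com (Apache/1.1)")
for hop in hops:
    print(hop.received_by)
```

Nothing in the header tells a recipient whether "fred" or
"nowhere.com" can be dereferenced, which is the sense in which the
naming convention is loose.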

>Although intermediaries are explicitly defined and accommodated by XML
>Protocol, they can only be functional if there are application semantics
>defined to take advantage of them.

Amen!

>XML Protocol's Modules offer an excellent
>opportunity to standardize common intermediary functions.

I suggest that a far more suitable place would be in the application
protocol that XP will be used on top of; that's why we call them
application protocols 8-).

We really need to focus on reusing the established application semantics
that exist in deployed application protocols today.  For new semantics,
people will have a choice: extend the application protocol via its
documented extension mechanisms, or use the XML envelope of XP.  Each
has its pros and cons.

>Some XML Protocol applications may wish to make caching possible for
>latency, bandwidth use or other gains in efficiency. To enable this, it should
>be possible to assign cacheability in a variety of circumstances.

How can a cache model be defined at this layer when there are no
application semantics?  How messages are cached depends entirely on how
they are transferred.  I'd like to be shown otherwise, but I cannot see
how a useful caching model could be created independently of any
specific application protocol.

I believe the furthest that we could and should go in this space is to
define some metadata that can be used to describe the
application-neutral cacheability properties of XP messages.  Then, in
the protocol bindings, we could define how they bind to the cache models
of the application protocols.
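
As a rough sketch of that split, assuming an HTTP binding: the
Cacheability class and bind_to_http function below are hypothetical
names of my own, not anything defined by XP.  The metadata says nothing
about HTTP; only the binding function knows about the Cache-Control
directives of HTTP/1.1.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cacheability:
    """Hypothetical application-neutral cacheability metadata for a message."""
    cacheable: bool
    max_age_seconds: Optional[int] = None  # None: no freshness lifetime stated
    shared: bool = True                    # False: intermediaries must not store it


def bind_to_http(meta: Cacheability) -> Dict[str, str]:
    """Map the neutral metadata onto HTTP/1.1 Cache-Control directives."""
    if not meta.cacheable:
        return {"Cache-Control": "no-store"}
    directives = ["public" if meta.shared else "private"]
    if meta.max_age_seconds is not None:
        directives.append(f"max-age={meta.max_age_seconds}")
    return {"Cache-Control": ", ".join(directives)}


print(bind_to_http(Cacheability(cacheable=True, max_age_seconds=60)))
```

A binding to a protocol with no cache model would simply ignore the
metadata, which is why the mapping belongs in the binding and not in
the envelope.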

[re message integrity]
>While these mechanisms have been discussed as necessary extensions to
>define in XML Protocol, the possibility of modification of any XML Protocol
>message brings the need to use them for all messages, not only those which
>contain sensitive content.

I don't see the need to use them for all messages.  Plus, some
application protocols that have identified a need for maintaining
message integrity support it already.  So I'm not sure what requiring
it at the XP level would achieve (though I could see integrity being
important at the body level, rather than for the envelope).
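
A minimal sketch of what I mean by body-level integrity, assuming the
sender and the final receiver share a key out of band.  The Message
class and the function names are hypothetical, and HMAC-SHA256 is just
one possible choice of mechanism.  Only the body is covered, so
intermediaries remain free to add envelope entries:

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Message:
    """Hypothetical message: envelope-level headers plus an opaque body."""
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""


def sign_body(body: bytes, key: bytes) -> str:
    """Compute a keyed digest over the body only; headers are not covered."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_body(msg: Message, key: bytes, expected_mac: str) -> bool:
    """Check the body against the digest computed by the sender."""
    return hmac.compare_digest(sign_body(msg.body, key), expected_mac)


key = b"shared-secret"                 # assumption: agreed out of band
msg = Message(body=b"<order qty='1'/>")
mac = sign_body(msg.body, key)

msg.headers["Via"] = "1.1 some-hop"    # an intermediary annotates the envelope
print(verify_body(msg, key, mac))      # body untouched, so this still verifies

msg.body = b"<order qty='2'/>"         # an intermediary alters the body
print(verify_body(msg, key, mac))      # the change is detected
```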

MB
Received on Thursday, 8 February 2001 14:34:40 UTC