Re: [Pub-Sub-Task] initial half-baked proposal from me from Amelia A. Lewis on 2002-11-05 (www-ws-desc@w3.org from November 2002)

From: Amelia A. Lewis <alewis@tibco.com>
Date: Tue, 5 Nov 2002 12:57:38 -0500
To: "Sanjiva Weerawarana" <sanjiva@watson.ibm.com>
Cc: www-ws-desc@w3.org
Message-Id: <20021105125738.0f0ff0df.alewis@tibco.com>

Okay.  Specific criticisms.

On Fri, 1 Nov 2002 05:34:28 +0600
"Sanjiva Weerawarana" <sanjiva@watson.ibm.com> wrote:
>     <portType name="pt1">
>         <operation name="normal-op1"> ... </operation>*
>         <event name="event-1">
>             <subscription message="subscription-message"/>
>             <notification message="notification-message"/>
>         </event>
>     </portType>
> 
> So event definitions say that they are an event and indicate
> what subscription message needs to be sent and what 
> notification needs to be sent. A port for this portType would
> have an address and that address is where the subscription
> message would have to be sent. How that message flows on 
> the wire is of course (as usual) a matter of the binding.
> So we need to define a binding .. probably something that 
> wraps the subscription message in a wrapper element named 
> event-1, but we can work that out later.
> 
> Now the question is what's in the subscription message? 
> Basically, it needs to indicate a service reference to a 
> service that has a portType with the operation via which
> the service will deliver the notification message to the
> subscribers. The service reference in the subscription 
> message is of course a reference to the subscriber. The
> notification message itself is just a regular data 
> message - contents are up to the service designer.
> 
> What's a service reference? In BPEL we published a service
> ref format which can be the basis of a more thorough 
> service reference. In fact we're finalizing such a beast
> now and hope to make that available to the WG shortly. I 
> believe such a mechanism will be sufficient to capture 
> this scenario (details TBD ;-)).

Summary:
1) "publish/subscribe" is defined as an event mechanism.
2) the subscription happens at the application level; the service is always aware of what subscribers exist.  Subscription appears to be required, and required at the application level.
3) responses to server-initiated actions are not modelled (presumably this is left for a higher layer).
4) other administrative messages are not mentioned in this proposal.
5) the administrative address is presumably the "service" address.
6) each client is actually a service, with a defined portType.

This is perfectly adequate for the subset of pub/sub that it addresses, but it only addresses a subset.  In particular:

item 2) suggests that protocols that contain built-in subscription mechanics cannot be supported, particularly if the mechanics do not reach the application level.

item 3) and item 1) imply that something like solicit/response is not to be modelled in WSDL; when a service-initiated operation soliciting response is modelled, it must be split into two pieces, which a higher-level language will reunite.

item 4) merely points out a lacuna: absence of an unsubscription mechanism is certain to provoke, from the user, "HOW THE @!#$%^&* DO I TURN THIS THING OFF?".  Many protocols (or applications/services, at the level at which this proposal seems to be aimed) include additional administrative messages/operations as well.

item 5) and item 2), taken together, seem to indicate that this is intended to support point-to-multi-point communications, rather than multicast.

item 6) limits the universe of participants to those capable of description as a server.  Processors capable of acting as clients (only) are almost certainly less expensive to build.

Some examples follow, with exercises suggested.

The following is a casual list of interesting or significant or better-known publish/subscribe protocols (or protocols supporting publish/subscribe):

IP multicast
email mailing lists
usenet news
IRC
JMS topics (an API, rather than a protocol, but amenable to description)
TIBCO rv (and all of its competitors)

I'll try to avoid mention of TIBCO rendezvous, since the technology is proprietary.  I'll also avoid mention of rather obscure publish/subscribe models such as NTP, and things that might be modelled as pub/sub or as something else, such as SNMP.

Problem number one.  Not all protocols used for publish/subscribe expose something that can be labelled as a subscribe operation to the application level.

Example 1a: IP multicast uses IGMP (the internet group management protocol) to register, with the nearest router, a particular host's interest in receiving packets directed toward a particular multicast address (an address in the range 224.0.0.0 - 239.255.255.255; a class D address).  That router propagates the message, in a protocol-specified pattern, to other routers.  Sending hosts never see these messages.  For a processor supporting IP multicast, the publication address (the IP multicast address) is enough information to allow subscription or unsubscription to happen.  In WSDL context, it is not reasonable to expect each processor to act as a multicast router.
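
A rough Python sketch of Example 1a, offered as illustration only.  The function names are hypothetical and the port is whatever the caller chooses.  The point is that "subscribing" is a socket option set on the receiving host, which prompts the local IP stack to announce group membership; nothing is ever addressed to the publisher.

```python
import ipaddress
import socket


def is_class_d(address: str) -> bool:
    """True if the address lies in 224.0.0.0 - 239.255.255.255."""
    return ipaddress.IPv4Address(address).is_multicast


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address, then local interface."""
    if not is_class_d(group):
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def subscribe(group: str, port: int) -> socket.socket:
    """Join the group; the publication address is all that is needed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def unsubscribe(sock: socket.socket, group: str) -> None:
    """Leave the group; again, no message goes to the sending host."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group))
    sock.close()
```

Note that neither subscribe() nor unsubscribe() takes an administrative address as a parameter; there is none to give.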

Example 1b: Usenet news requires each client to manage its own subscription.  Client to server interactions are equivalent to: "what's the highest message number in this group?" followed by "give me synopses of the headers for messages in this list ...".  It is, by design, very similar to what one can get by direct access to the news spool.  In WSDL context, it is most reasonable to leave news propagation to news servers, and to implement services as news clients.  However, there is no equivalent to the subscription message modelled in the proposal above (or of unsubscription).
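
A rough Python sketch of Example 1b.  It does not speak NNTP; Spool is a hypothetical in-memory stand-in for a news server, and NewsClient is a hypothetical client.  It is meant only to show that the subscription list and the "last article read" marker live entirely on the client.

```python
class Spool:
    """Stand-in for a news server: group name -> list of subject lines."""

    def __init__(self):
        self.groups = {}

    def post(self, group, subject):
        self.groups.setdefault(group, []).append(subject)

    def highest(self, group):
        # Article numbers start at 1, so the count is the highest number.
        return len(self.groups.get(group, []))

    def overview(self, group, first, last):
        articles = self.groups[group]
        return [(n, articles[n - 1]) for n in range(first, last + 1)]


class NewsClient:
    """Keeps its own subscription state; the server never learns of it."""

    def __init__(self, spool):
        self.spool = spool
        self.last_read = {}  # group -> highest article number already seen

    def subscribe(self, group):
        self.last_read.setdefault(group, 0)  # purely local, nothing is sent

    def unsubscribe(self, group):
        self.last_read.pop(group, None)  # likewise purely local

    def poll(self, group):
        if group not in self.last_read:
            return []
        high = self.spool.highest(group)      # "what's the highest number?"
        first = self.last_read[group] + 1
        new = self.spool.overview(group, first, high) if high >= first else []
        self.last_read[group] = high          # "give me headers for ..."
        return new
```

The only traffic is in poll(), and it looks the same to the server whether or not the client considers itself subscribed.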

Example 1c: JMS topics are looked up via directory services, as a recommended practice.  Subscription to a JMS topic, if supported, is likely to be via a JMS message to an administrative address associated with a JMS server, not an administrative address associated with the publishing service.  Some implementations do not send any network messages at all for a subscription, because each active node already receives all messages.  A subscription method in the API simply changes internal state of the JMS library on the local node, so that it stops discarding messages with a certain topic, and instead passes them to the application layer.
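
A rough Python sketch of the last case in Example 1c.  This is not the JMS API; LocalTopicFilter and its methods are hypothetical names, used only to show a "subscription" that changes local state and sends no network message at all.

```python
class LocalTopicFilter:
    """Every node receives every message; subscribing only stops the discard."""

    def __init__(self):
        self.topics = set()   # topics the application has asked for
        self.delivered = []   # messages passed up to the application layer

    def subscribe(self, topic):
        self.topics.add(topic)       # local state change only

    def unsubscribe(self, topic):
        self.topics.discard(topic)   # local state change only

    def on_network_message(self, topic, body):
        """Called for every message seen on the wire, wanted or not."""
        if topic in self.topics:
            self.delivered.append((topic, body))
            return True
        return False                 # silently discarded
```

A publisher in such a system has no way to enumerate its subscribers, which is the difficulty with item 2) of the summary.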

Exercise 1: Using the proposal, provide a sample service over IP multicast, net news, or JMS topics.  Provide example wsdl:message outlines for description of protocol-level messages, if desired.  Indicate what address appears in the wsdl:service/wsdl:port/soap:address/@location attribute for the service, or what alternate mechanisms are to be used to identify the logical multicast address (alt.advocacy.soap.soap.soap, 224.31.0.2, favorite address format for a JMS implementation), and what mechanisms are to be used to identify the associated administrative address.  For IP multicast and network news, define what "administrative address" means in this context.

Example 2a: email fits the proposal fairly well.  Since the protocol does not provide a mechanism for multicast propagation, subscription, or administration, all of these needs have been fulfilled at the application level, and a set of conventions has grown up to describe "the usual mailing list".  Among these conventions, mailing lists almost invariably use separate administrative and discussion addresses.  The proposal does not touch on areas outside portType, but it is worth asking: what addresses are to be exposed?  Must a service conforming to the proposal use the same address for administration that it uses for "logical multicast"?  Note that, in this case, the email model matches the proposal quite well: email *is* point-to-multi-point, and so long as the service is defined to be the operator of the mailing list, it can always be aware of each and every subscriber.  As the operator of the mailing list, it can require XML format for list-administrative functions.  Nonetheless, the proposal seems to model *controlled* (single-publisher, "announcement" style) lists more than open lists in which each subscriber is also a publisher.

Example 2b: Most readers of the mailing list will be familiar with at least the client interface of IRC, which is sufficiently thin a layer that most can guess at the structure of the underlying protocol.  IRC bears a strong resemblance to usenet news (distributed servers; each client contacts only one but sees everything from all servers) in some respects (and many differences, of course, not least being its immediacy).  It is possible to imagine a WSDL-described IRC system in which the service is the IRC server.  It is also possible to imagine a system in which the service is just another client (or multiple services operate as clients).  In order for the subscription process to be defined in WSDL, however, one must either translate the wsdl:message definitions into IRC protocol, or modify the servers to understand an IRC+XML format that transfers administrative messages related to the IRC protocol itself in XML.

In sum:

1) the portType should *not* contain information about administrative operations, because
   a) administration is built in to some protocols; a processor supporting the protocol must of necessity support its subscription/administration mechanism;
   b) the nature of subscription differs from protocol to protocol, so that introducing subscription into the portType forces concreteness into what should be abstract (one cannot reuse the subscription messages designed for email in a service based on IP multicast).
In other words, the administrative interface, when it exists visibly to the client and the service, should be modelled as a separate portType and associated with the publishing portType in the wsdl:binding, or possibly even in the wsdl:service definition.

2) therefore, there is no reason to call out a specific "wsdl:event" child of portType, and particularly not one that excludes solicit/response operations.

3) it should be possible to define the logical multicast address of a pub/sub service in WSDL, since this is the address *most* important to clients.

4) it should be possible to define a WSDL in which the administrative and the publishing addresses are the same, without requiring that they be the same.
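
Points 3) and 4) can be pictured with a small Python data model.  This is only a sketch, not proposed WSDL syntax: PubSubEndpoint and its field names are hypothetical, and the example.org addresses are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PubSubEndpoint:
    publish_address: str                 # the logical multicast address
    admin_address: Optional[str] = None  # None: the protocol handles it

    def shares_address(self) -> bool:
        """True when administration and publication use one address."""
        return self.admin_address == self.publish_address


# IP multicast: no administrative address visible to the application.
multicast = PubSubEndpoint("224.31.0.2")

# Mailing list: separate administrative and discussion addresses.
mailing_list = PubSubEndpoint("list@example.org", "list-request@example.org")

# The shape the original proposal assumes: one address for both.
single = PubSubEndpoint("http://example.org/svc", "http://example.org/svc")
```

All three are valid descriptions; only the last requires the two addresses to coincide.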

I hope that this clarifies the reasons that we approached the problem from a quite different angle, as well.

Amy!
-- 
Amelia A. Lewis
Architect, TIBCO/Extensibility, Inc.
alewis@tibco.com
Received on Tuesday, 5 November 2002 12:57:59 UTC