Proposed (alternative) Internet Email Binding

Status

This document has no official standing within the XMLP, WSD, or WSA working groups at W3C. It is not an official document of Tibco Software, Inc., but is submitted by the Tibco Software, Inc. representative to these working groups. The document reflects the experiences and concerns of developers involved in enterprise messaging, and attempts to characterize such messaging as web services.

[Author's note: as a proposal without formal status, the URIs assigned throughout this document are under the administrative control of the author's employer, except where existing features, specified in other documents, are referenced (some external references are also in local administration). All such namespaces can be identified by the initial pattern: "http://www.tibco.com/xmlns/"... It is expected that these URIs will be updated to point into spaces under the administrative control of appropriate W3C committees.]

Motivation

At least two previous proposals have been put forward for binding SOAP to internet email. This proposal differs primarily in focusing on list-oriented, publish/subscribe models. Insofar as the request-response exchange pattern is treated, it is significantly less prominent than in earlier proposals. That pattern requires only minor modifications for use within a binding which is, by default, asynchronous.

The primary purpose of this binding is to illustrate SOAP bound to a different paradigm, specifically the publish/subscribe model, with asynchronous delivery. If current discussion of SOAP transport bindings may be said to be too focussed on the problems of the HTTP binding, to the detriment of SOAP, of HTTP, and of XML as a data exchange format, then this proposal seeks to widen the scope of application of SOAP solutions.

The current model of service description cannot handle publish/subscribe models, except at the cost of promoting clients into published services, a price too high for many participants to pay. Description can only be fleshed out with an adequate underlying set of abstractions.

This binding composes several proposed features and message exchange patterns into a single, complete description of the publish/subscribe model as implemented via internet email. The design of the proposal, by composition rather than monolithically, is intended to allow and even promote similar compositions for other bindings. Internet email was chosen for illustration not because it is the best example of the publish/subscribe model, but because it is by far the most familiar and accessible, which (it is hoped) will promote both understanding and implementation.

Table of Contents

Status

Motivation

Table of Contents

Introduction

Definitions

Supported Message Exchange Patterns

Supported Features

Message Exchange Operation (informal)

Issues

Security

Appendix: Example

Introduction

This SOAP binding specification adheres to the SOAP 1.2 binding framework, and as such uses abstract properties as a descriptive tool to define the functionality of certain features. This binding imports properties from separately-defined features (including message exchange pattern features), which it then specifies in greater detail, and on occasion further constrains.

The SOAP binding to internet email is not a "protocol" binding, as it effectively binds to SMTP (RFC 2821), the Message Format Standard (RFC 2822), MIME, IMAP, POP, and others. That is, it is expected that messages transmitted using this binding will travel over multiple protocols, and a significant portion of the binding is concerned with defining the interaction of data formats (Message Format Standard, Multipurpose Internet Mail Extensions, encoding issues, and the like). However, the concept of "internet email" is relatively well understood as the transmission of messages conforming to the MFS, at least partially via SMTP, and characterized by the use of the mailto URI scheme.

1. Definitions

1.1 Binding Name

The binding described here is identified with the URI

http://www.tibco.com/xmlns/soap/bindings/distEmail/

This will be referred to hereafter as email or email binding.

The binding described here is provided as an alternative to two currently existing proposals for binding SOAP over internet email.

1.2 Prefix Mapping Definitions

Due to its design via composition, this binding makes use of a fairly large number of namespaces. In general, each namespace supplies properties which are used within the binding.

Table 1: Namespaces and Prefixes Used
Prefix    Namespace
context   http://www.w3.org/2002/06/soap/bindingFramework/ExchangeContext/
email     http://tibco.com/2002/soap/bindings/distEmail/
mep       http://www.w3.org/2002/06/soap/mep/
fail      http://www.w3.org/2002/06/soap/mep/FailureReasons/
arr       http://www.tibco.com/xmlns/soap/mep/async-request-response/
confirm   http://www.tibco.com/xmlns/soap/mep/confirmation/
solicit   http://www.tibco.com/xmlns/soap/mep/solicit-response/
notify    http://www.tibco.com/xmlns/soap/mep/notification/
address   http://www.tibco.com/xmlns/soap/message-address/
corr      http://www.tibco.com/xmlns/soap/message-correlation/
faildest  http://www.tibco.com/xmlns/soap/failure-destination/
mimecont  http://www.tibco.com/xmlns/soap/mime-content/
mimecomp  http://www.tibco.com/xmlns/soap/mime-composite/

1.3 Property Definitions

This specification makes use of the properties listed in table 2.

Table 2: Property Names, Types, and Constraints
Name                          Type             Constraints
context:State                 enum
context:ExchangePatternName   anyURI
context:FailureReason         enum             defined in the fail namespace
context:CurrentMessage        message
context:TriggerMessage        message
context:ImmediateSource       anyURI
context:ImmediateDestination  anyURI
arr:Role                      enum
confirm:Role                  enum
solicit:Role                  enum
solicit:TermCondition         enum
solicit:Synchronous           boolean
solicit:NumRespondents        int
solicit:Deadline              date
notify:Role                   enum
address:original-source       anyURI
address:final-destination     anyURI
address:response-address      anyURI
corr:message-id               string
corr:references               array of string
faildest:failure-destination  anyURI
mimecont:content-type         string
mimecont:transfer-encoding    string
mimecont:*                    any
mimecomp:content-type         string
mimecomp:content-id           string
mimecomp:content-location     anyURI
mimecomp:current-part         message part

[[Commentary: notice that MEPs tend always to define Role. Role should probably be considered a candidate for generalization, or for inclusion in context with per-MEP specialization. Note that although correlation and addressing properties are required by several different MEPs, they appear here (and are bound and constrained) only once.]]

1.4 Property Bindings

This section discusses how each of the properties present in this binding is bound into the environment, and how each property is transmitted (if necessary) from node to node. In the case of the email binding, most properties are bound into headers compliant with RFC 2822 (note encoding issues, however).

context:State, context:CurrentMessage, context:TriggerMessage
These properties, defined in the context namespace, are the result of processing. The state property may not be communicated from node to node; it is an internal property local to each node's processing of a particular message. CurrentMessage refers to the message being processed. For exchange patterns that involve response semantics, a context:CurrentMessage will at some point become context:TriggerMessage, which is then used for the construction and elaboration of a new context:CurrentMessage (the response).
context:ExchangePatternName
The context:ExchangePatternName property is derived by each node from an examination of its role in a given exchange. In some cases, it may be hardcoded. In general, a node initiating an exchange chooses the name (but need not embed it in the message). A node receiving a message should be able to determine, from the message and the fact that it was received, which exchange pattern it is participating in. Again, the information is not transmitted from node to node.
context:FailureReason
The context:FailureReason property is transmitted from a receiving node, when necessary, to the location defined for the delivery of failure notifications. It is bound into the SOAP fault message, per other specifications.
context:ImmediateSource
For the email binding, context:ImmediateSource is problematic. context:ImmediateSource should probably refer to the immediately preceding SOAP node. It is recommended that custom headers be bound to record the path through intermediaries.
context:ImmediateDestination
If a message is delivered or received via SMTP, the context:ImmediateDestination property should, in general, be bound to the content of the envelope RCPT TO: directive. Unfortunately, the path of a message through the email system is likely to involve portions that are not travelling over SMTP (mailbox delivery, mailbox retrieval; email is a store-and-forward technology). Where the message is retrieved via a retrieval protocol, such as POP or IMAP (or direct mailbox access), the ImmediateDestination property should instead be the name of the mailbox being accessed. In general, email headers are not useful for binding this property; the property is emergent from, rather than contained in, the message. Also see the discussion of addressing issues in section 5.1, below.
arr:Role, solicit:Role, notify:Role
Each message exchange pattern defines a set of roles (always at least two, and rarely more than two, and a generalization might suggest an initiator and a respondent as generic roles instead). These roles are not bound to any information contained in the message. Rather, each node should be able to determine, from the way in which it participates in the pattern and the definition of known patterns available to it, what role it is currently playing in a particular exchange pattern. This information does not need to be transmitted from node to node (except, possibly, implicitly).
address:original-source
In most implementations, this property should be bound to the content of the From header. However, it is permitted that implementations bind to the Sender header, if so specified in the service description. Sophisticated implementations might wish to specify an order of binding to the available headers (but binding to the MAIL FROM envelope directive does not appear to be practical), if such seems to have utility. Also see the discussion of addressing issues in section 5.1, below.
address:final-destination
The final-destination property should generally be bound to the To: address. FinalDestination is intended to represent the target address, not necessarily the name of the mailbox to which the item is to be delivered. The context:ImmediateDestination property will contain the mailbox name of the target address. Note that the implication here is that a To: address may contain a list name, while the individual list members are contained (invisibly to the recipient) in a Bcc: field. Also see the discussion of addressing issues in section 5.1, below.
address:response-address
Normal binding semantic for the response address property is, in order: the Reply-To: header, then the From: header, then the Sender: header, then the Return-Path: header. A message with a deliberately empty Return-Path: header (Return-Path: <>) should always be recognized as requiring no reply (which is not the same as not requiring a reply). Service descriptions may specify a different header order, or alternate (application-defined) headers, but in the absence of such specification, the above order is used. Also see the discussion of addressing issues in section 5.1, below.
solicit:TermCondition
A solicitation message in the solicit/response exchange pattern (q.v.) must include a binding for the solicit:TermCondition property. An application may bind this property in some other fashion (for instance, in a SOAP header), but if a service description does not otherwise specify the binding, an implementation should bind solicit:TermCondition to a custom header, X-TerminationCondition. This header should contain one of the possible values of the enumerated type for termination condition. If the property cannot be found, or is invalid or corrupt, then it defaults to the termination condition value "Notification".
solicit:NumRespondents
When a solicitation message in the solicit/response exchange pattern has the property solicit:TermCondition equal to Number, then the property solicit:NumRespondents must also be bound. Unless the service description for the application sets an alternative binding, this property should be bound to the custom header X-MaxNumberRespondents. If this header is missing, or less than one, then only the first response will be accepted.
solicit:Deadline
When a solicitation message in the solicit/response exchange pattern has the property solicit:TermCondition equal to Time, then the property solicit:Deadline must also be bound. Unless the service description for the application sets an alternative binding, this property should be bound to the custom header X-ResponseDeadline. The value there should be chronologically after the value of the Date header (except for a "tease" service, presumably). If the value is invalid, or nonexistent, then the termination condition changes to Notification rules instead.
solicit:Synchronous
The email binding does not support the synchronous variant of solicit/response (or synchronous anything, for that matter). Implementations MUST NOT provide a binding. Messages with this property set (according to the rules of some other binding, since no binding is provided herein) must be handled otherwise, which may invite a fault.
corr:message-id
This binding does not recommend the use of the Message-Id header for binding of the corr:message-id property. See section 5.2, below, for discussion. Applications are required to specify, in the service description, how this property is to be bound. It is recommended that the message id property be composed by concatenating the content of the email address (without user-friendly display name) in the From: field, the content of the Date: header with spaces changed to underscores, and the content of the Subject: header, again with spaces changed to underscores. Applications that prefer to use the Message-Id field should generally require the server to send itself a copy of every message, in order to assign the message-id property from a Message-Id header created by a fully fledged SMTP server. It is permitted, but not recommended, that each client be required to implement the algorithm for generation of the Message-Id header. Applications may also use alternative bindings, including concatenation of some other combination of existing header fields, or specification of an application-specific field (X-SOAP-Message-Id is recommended, in the latter case).
corr:references
Because of the difficulties in obtaining a copy of the outgoing Message-Id header (see the discussion in corr:message-id above and in section 5.2, below), this binding does not recommend use of the In-Reply-To: or References: headers for message correlation. If the application has bound the corr:message-id property to the Message-Id header (either by requiring a self-copy of every message, or by requiring implementation of the algorithm for every client, or for the very foolish and arrogant by implementing an SMTP server as a SOAP node), then it would follow that the application may bind the corr:references property to the References header (with the most recent message ID in the In-Reply-To header). When some other binding is used for corr:message-id, it is recommended that the corr:references property be bound to a custom header, such as X-SOAP-References. However, applications may create alternative bindings for this property.
faildest:failure-destination
The preferred means of binding faildest:failure-destination is to a header (X-Failure-Destination is suggested). Applications may specify alternate bindings, or may provide no support for the property. Note that applications may bind to a series of destinations, listed in preferred order, which may include (for instance) the response address or the origination address.
mimecont:version
The mimecont:version property is mapped to the MIME-Version header. It SHOULD always appear, but if it does not, defaults to the string "1.0".
mimecont:content-type, mimecomp:content-type
In the email binding, the mimecont:content-type or mimecomp:content-type property MUST be bound to the Content-Type: header. No other binding is permissible. An application may repeat this information elsewhere, if desired, but it MUST bind the property to this header. If the header contains a composite primary type (multipart or message), then the property bound is mimecomp:content-type. If it is any other primary type, then the property bound is mimecont:content-type. Applications that do not support composite types may generate a fault if they encounter composite types in the header (in effect, the property cannot be bound).
mimecont:transfer-encoding
In the email binding, the mimecont:transfer-encoding property MUST be bound to the Content-Transfer-Encoding: header. No other binding is permissible. Note that there are significant limits on the type of encoding that may be used to represent XML; XML cannot be represented using a 7bit encoding (which is the default). See section 6.5 for a discussion of encoding issues as applied to the email binding. Note that since the default value for Content-Transfer-Encoding, if the header is missing, is 7bit, then absence of the header may be considered grounds for generating an error (encoding or decoding failure).
mimecomp:content-id
In general, the mimecomp:content-id property should only appear and be bound in multipart messages, and it should generally be used solely for linkages between the parts of a message. If present, it MUST be bound to the Content-Id header of each message part. No other binding is permissible.
mimecomp:content-location
In general, the mimecomp:content-location property should only appear and be bound in multipart messages, and it should generally be used solely for linkages between the parts of a message. If present, it MUST be bound to the Content-Location header of each message part. No other binding is permissible.
mimecomp:current-part
The mimecomp:current-part property is bound to each of the MIME parts of a composite message in turn (order specified by the processor, possibly). It has no direct binding, but is emergent from processing.
mimecont:*
Other properties corresponding to MIME headers are bound to the corresponding header.
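
The default header order for address:response-address and the choice between mimecont:content-type and mimecomp:content-type can be sketched as follows. This is a minimal illustration using Python's standard email package, not a normative algorithm; the function names response_address and content_type_property are hypothetical, and returning None both for "no reply may be sent" and for "no address found" is a simplifying assumption.

```python
from email import message_from_string
from email.utils import parseaddr

# Default lookup order for address:response-address, as listed above.
DEFAULT_RESPONSE_ORDER = ("Reply-To", "From", "Sender", "Return-Path")


def response_address(msg, order=DEFAULT_RESPONSE_ORDER):
    """Return the bare reply address, or None when no reply should be sent.

    None is also returned when none of the headers holds an address
    (a simplification made for this sketch).
    """
    return_path = msg.get("Return-Path")
    if return_path is not None and return_path.strip() == "<>":
        return None  # deliberately empty Return-Path: no reply
    for name in order:
        _display, addr = parseaddr(msg.get(name, ""))
        if addr:
            return addr
    return None


def content_type_property(msg):
    """Name the property that the Content-Type header binds to."""
    if msg.get_content_maintype() in ("multipart", "message"):
        return "mimecomp:content-type"  # composite primary type
    return "mimecont:content-type"


raw = (
    "From: Alice <alice@example.org>\n"
    "Reply-To: quotes-list@example.org\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/xml\n"
    "\n"
    "<placeholder/>\n"
)
msg = message_from_string(raw)
print(response_address(msg))       # quotes-list@example.org
print(content_type_property(msg))  # mimecont:content-type
```

A service description that specifies a different header order would pass its own tuple as the order argument.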

2. Supported Message Exchange Patterns

An instance of a transport binding to internet email and conforming to this specification MUST support the following transport message exchange patterns:

Confirmation Message Exchange Pattern
The confirmation pattern is required for service registration, where services are provided via mailing lists, and can be used for the "confirmed command" pattern as well (notifications from the client to the server, with client confirmation). For details on property bindings, and additional constraints on those properties, see the property descriptions (above).
Solicit/Response Message Exchange Pattern
The solicit/response pattern is required for service provision via mailing lists, when a single solicitation is expected to bring multiple responses, and may also be used for singly-targeted notification services requiring responses. For details on property bindings, and additional constraints on those properties, see the property descriptions (above).
Notification Message Exchange Pattern
The notification pattern is required for service provision via mailing list, where the targets of notification are not expected to respond. For details on property bindings, and additional constraints on those properties, see the property descriptions (above).
Asynchronous Request/Response Message Exchange Pattern
The (asynchronous) request/response pattern is required for service provision via mailbox. That is, this pattern supports servers that wait for commands, and respond to them (rather than initiating operations themselves). For details on property bindings, and additional constraints on those properties, see the property descriptions (above).

3. Supported Features

Several known features are directly supported by the email binding; three are required, two optional. Most properties defined by these features are bound to headers in the internet message format.

Required Features

An instance of a transport binding to internet email and conforming to this specification MUST support the following features. Note that several of these features are indirectly required by message exchange patterns which require them.

Message Correlation
Message correlation is a required feature for any service using the email binding. The required property, corr:message-id, must be bound (the service description should define the binding; see suggestions above). The optional property corr:references may also be bound (it is recommended that this be done, unless there is reason not to).
Message Addressing
Message addressing is a required feature for any service using the email binding. Addresses are bound to mailbox names, in RFC2822 format; the headers bound depend upon the particular service used (see above). The property address:response-address is only bound when an exchange includes multiple messages.
MIME Content
The MIME Content feature is required for any service using the email binding. Without the required property bindings (which as a rule MUST NOT be bound in any fashion other than that described above), problems of interoperation due to encoding issues, at the very least, make the email binding less than useful. Applications MAY bind additional MIME-related features (the set of properties dubbed mimecont:*), but MUST bind them to the corresponding MIME headers, if so.
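
As an illustration of the required MIME Content bindings, the sketch below builds a message whose Content-Type, Content-Transfer-Encoding, and MIME-Version headers are all explicit, and reads them back as properties. It is only a sketch under stated assumptions: the helper names are hypothetical, the media type application/soap+xml and the base64 encoding are choices made for the example rather than requirements of this binding, and treating a missing Content-Transfer-Encoding header as an error is one of the behaviours the text permits, not the only one.

```python
from email.message import EmailMessage


def build_soap_mail(envelope_xml, sender, recipient):
    """Wrap a SOAP envelope in a mail message with explicit MIME headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Passing bytes lets us pick the transfer encoding explicitly;
    # set_content also adds MIME-Version: 1.0 when it is absent.
    msg.set_content(
        envelope_xml.encode("utf-8"),
        maintype="application",  # assumed media type for this sketch
        subtype="soap+xml",
        cte="base64",
    )
    return msg


def mime_content_properties(msg):
    """Read the mimecont properties back from their required headers."""
    cte = msg.get("Content-Transfer-Encoding")
    if cte is None:
        # The implied default (7bit) is treated here as an encoding failure.
        raise ValueError("Content-Transfer-Encoding header is missing")
    return {
        "mimecont:version": str(msg.get("MIME-Version", "1.0")),
        "mimecont:content-type": msg.get_content_type(),
        "mimecont:transfer-encoding": str(cte),
    }


mail = build_soap_mail("<placeholder/>", "svc@example.org", "list@example.org")
print(mime_content_properties(mail))
```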

Optional Features

An instance of a transport binding to internet email and conforming to this specification MAY or SHOULD support the following features:

Failure Destination
Services using the email binding are strongly encouraged to implement the Failure Destination feature, possibly binding to a list of properties that indicate an appropriate address. The binding of the property should be specified in the service description.
MIME Composite
The MIME Composite feature provides minimal semantics for the support of SOAP with attachments. It is optional. Services must be able to distinguish between the Content-Type header values associated with the MIME Content feature and those associated with MIME Composite, even if they do not support the composite types (they must be able to provide useful errors). As with other MIME-related properties, the required bindings are specified here, and must not be bound otherwise by the application.

4. Message Exchange Operations

The Transport Binding Framework, Message Exchange Pattern Specifications, and Feature Specifications each describe the properties they expect to be present in a message exchange context when control of that context is passed between a local SOAP Node and a local Binding instance, and vice versa. This specification adds constraints to some of the supported features and MEPs, but leaves some options available to the particular service (or, if the description language supports it, allows communication of certain properties on a per-exchange basis).

4.2 Confirmation Exchanges

The email binding typically uses the confirmation exchange pattern for operations related to the administration of a mailing list. Services which are not distribution oriented are unlikely to need this operation type.

Confirmation-pattern operations may use addresses which are different from the addresses which they administer (indeed, this is the common case). A description or flow language will generally associate the administrative addresses and operations with the addresses and operations which are administered.

4.3 Solicit/Response Exchanges

The email binding implements the solicit/response exchange pattern in its asynchronous mode only. The solicit:Synchronous property may not be set to true. A form of operation similar to synchronous operation in most respects may be achieved by restricting the solicitation to a single subscriber, and setting solicit:TermCondition to First (or to Number, with solicit:NumRespondents set to 1 or less).

It is anticipated that the more common use of the solicit/response pattern is to deliver the solicitation message to multiple subscribers, receiving zero or more responses. Subscribers (clients) may wish to vary their behavior based on the termination condition. A separate operation may be defined for solicit/response patterns terminated via notification.

The distribution list (subscriber list) for a solicitation message is not defined in the service description, but may be hard coded. More often, it is established and dynamically updated using an additional, or related operation (typically using the confirmation message exchange pattern). The subscriber list is always represented by a single address, which should be specified in the service description.
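
The default bindings and fallbacks for solicit:TermCondition, solicit:NumRespondents, and solicit:Deadline described in section 1.4 might be read from a solicitation message as sketched below. The function name is hypothetical, the enumeration values are taken from the text (First, Number, Time, Notification) and matched case-sensitively by assumption, and the deadline is assumed to use the same date syntax as the Date header, which this specification does not state.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

TERM_CONDITIONS = {"First", "Number", "Time", "Notification"}


def termination_rule(msg):
    """Return (condition, argument) for a solicitation message."""
    cond = (msg.get("X-TerminationCondition") or "").strip()
    if cond not in TERM_CONDITIONS:
        return ("Notification", None)  # missing, invalid, or corrupt
    if cond == "Number":
        try:
            count = int(msg.get("X-MaxNumberRespondents", ""))
        except ValueError:
            count = 0
        # Missing or less than one: only the first response is accepted.
        return ("Number", max(count, 1))
    if cond == "Time":
        try:
            deadline = parsedate_to_datetime(msg.get("X-ResponseDeadline", ""))
        except (TypeError, ValueError):
            return ("Notification", None)  # invalid or nonexistent deadline
        return ("Time", deadline)
    return (cond, None)


solicitation = message_from_string(
    "X-TerminationCondition: Number\n"
    "X-MaxNumberRespondents: 3\n"
    "\n"
)
print(termination_rule(solicitation))  # ('Number', 3)
```

A service description that binds these properties elsewhere (for instance, in a SOAP header) would replace the header lookups but could keep the same fallback rules.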

4.4 Notification Exchanges

The email binding is highly suitable for the notification exchange pattern. It is anticipated that the common use of the pattern will deliver notifications to multiple subscribers, but it may also be used to deliver notifications to single recipients. Since the notification message is fire and forget, the service is agnostic.

The distribution list (subscriber list) for a notification message is not defined in the service description, but may be hard coded. More often, it is established and dynamically updated using an additional, or related operation (typically using the confirmation message exchange pattern). The subscriber list is always represented by a single address in the email binding, which should be specified in the service description.

4.5 (Asynchronous) Request/Response Exchanges

The email binding also supplies support for the common request/response exchange pattern semantic. In request/response, it is expected that the requesting node will be a single node; the pattern does not involve multiple responses. The request/response pattern defined in part two of the XML Protocol specification is inadequate for support of the email binding; this binding requires at least the message correlation feature, and requires the ability to determine an appropriate return address.

The service description for an asynchronous request/response exchange pattern specifies the incoming mail address for the service, as well as the format of request and response messages.

5. Issues

The SOAP binding to internet email is complex (not only by virtue of binding loosely to multiple protocol and format specifications), and differs strongly from existing binding specifications in a number of ways. It is worthwhile, therefore, to highlight those elements which this binding brings to the fore, in order to suggest where issues that have arisen here might need consideration for other new protocol bindings, or even for existing protocol bindings.

5.1 Addressing

Binding of address-related properties for email is problematic. The scope of the problem can be sensed by listing the headers related to target addressing, to source addressing, and to route recording.

Target addressing headers and fields

Source addressing headers and fields

Route recording and response headers and fields

From the above, it is clear that a sophisticated binding could make extensive use of various different headers, apart from the specification's invitation to applications to use application-specific extension headers (with the X- prefix) for application-specific usages. It is not entirely clear what the best binding of the various address related properties should be.

To add to the complexity, current bindings do not suggest that addressing is a generic kind of property, except via the ImmediateSource and ImmediateDestination properties, which seem deliberately designed to promote synchronous protocols over asynchronous. It is easy enough to know the ImmediateSource if it is a currently-open socket; if that isn't the case, then the message must be made to contain that information. This creates a greater burden for binding of other properties (such as OriginalSource and FinalDestination), because in a world without intermediaries, these properties directly correspond to ImmediateSource and ImmediateDestination. In order for the distinction to have meaning, the binding effectively requires an originator to know whether the message is going to pass through intermediaries or not, which is an unnecessary burden.

5.2 Message identification

The email protocol specifications contain three headers which appear, at first glance, to be ideal for use in message correlation: Message-Id:, In-Reply-To:, and References:. Unfortunately, security considerations lead to a very strong recommendation that SOAP nodes not attempt to implement an SMTP server (see section 7 introduction, below). This makes the use of these header fields problematic, at best.

Specifically, clients are encouraged not to attempt to set a Message-Id header. Instead, the algorithm (which has good characteristics for creating unique messages, if properly implemented) to generate a message id is supposed to be applied by the first server to receive a message that does not have a message id already set. The consequence of this is that a client, as a rule, does not know the message id of any messages sent (unless the client sends itself a copy), and therefore cannot use this useful and unique identifier for correlation. The server does not supply this information to the client (in part because the client supplying the message could, in theory, be a server (a badly written server, mind)).

This means that some other means of identifying messages becomes necessary, and that there is a greater chance that the chosen identifiers will collide. Therefore, this binding recommends a concatenation of informational fields that should lead to relatively unique IDs, and also permits the application to define other sorts of identifications.
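
The recommended concatenation can be sketched as below. The function name is hypothetical, and because the text names no separator between the three parts, the sketch joins them directly; an application could equally define one in its service description.

```python
from email import message_from_string
from email.utils import parseaddr


def composed_message_id(msg):
    """Compose corr:message-id from the From, Date, and Subject headers."""
    # Bare address only, without the user-friendly display name.
    _display, address = parseaddr(msg.get("From", ""))
    date = (msg.get("Date") or "").strip().replace(" ", "_")
    subject = (msg.get("Subject") or "").strip().replace(" ", "_")
    return address + date + subject


msg = message_from_string(
    "From: Alice Example <alice@example.org>\n"
    "Date: Mon, 1 Jul 2002 10:00:00 +0000\n"
    "Subject: Quote request\n"
    "\n"
)
print(composed_message_id(msg))
# alice@example.orgMon,_1_Jul_2002_10:00:00_+0000Quote_request
```

Both sender and receiver can compute this value from the delivered headers, which is the property that a server-assigned Message-Id lacks. Two messages from the same address with the same subject in the same second would still collide.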

5.3 Multiple recipients

Among the problems illuminated by the internet email binding, the issues of addressing and correlation are probably the most significant. Addressing issues have already been alluded to, and are further developed in the discussion of fault routing, below. The question of correlation very strongly arises in the context of multiple recipients.

Internet email is commonly delivered to aliases, which may be individuals or groups of individuals. Mailing list software may establish a mapping from a list address to a set of mailboxes, or the mail system may do so itself (in which case it may be accessible via the SMTP EXPN command). When such messages are delivered, they should be identifiable, and when messages related to them are generated, there should be some way to establish this.

The obvious means of establishing such identity and correlation is to use the headers built into the internet message format for that purpose: Message-Id, In-Reply-To, and References. Unfortunately, as noted in the discussion of the corr:message-id property binding, the requirements of identification for SOAP are not the same as the requirements for SMTP. Put simply, SMTP identifies messages so that servers can avoid mail loops. Internet message format defines additional headers (In-Reply-To and References) that build on this very restricted message identification functionality to provide message associations. Unless a developer is willing to implement in a client functionality that is normally delegated to a server, however, internet message format identifiers are poorly suited to support the requirements of SOAP correlation.

SOAP correlation requires that the sender, as well as the receiver, be able to identify a message, and that both endpoints be able to identify messages in the same way. Because the Message-Id header is expected to be constructed by an SMTP server, after it has passed out of the view of the client, it does not meet this requirement. The sending client is not aware of the id attached to the sent message. Since In-Reply-To and References build on Message-Id, they share this deficiency. One solution is to always copy the sender, on any message, but this is less than ideal (loss of a message is a particular problem, and email is not known for its delivery guarantees).

As a result, this specification suggests using other identifiers for identification, and building on these alternate identifiers for correlation. Applications should examine the issue before specifying a binding.
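
The following Python fragment is a minimal sketch of one way to build such an identifier on the sending side, so that the sender knows the ID before the message is handed to an SMTP server. The choice of fields (sender address, UTC timestamp, per-process counter, random suffix), the separator, and the function names are all assumptions made for illustration; this binding does not mandate them.

```python
import itertools
import secrets
from datetime import datetime, timezone

# Hypothetical helpers: none of these names come from the binding itself.
_counter = itertools.count(1)

def make_correlation_id(sender, now=None):
    """Concatenate informational fields into an ID the sender already knows."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%S.%fZ")
    serial = next(_counter)          # distinguishes messages in one process
    nonce = secrets.token_hex(4)     # lowers the chance of cross-process collisions
    return f"{sender}/{stamp}/{serial}/{nonce}"

def is_reply_to(correlates_to, sent_ids):
    """Correlate a response by the application-level ID, not by Message-Id."""
    return correlates_to in sent_ids

sent_ids = set()
first = make_correlation_id("rfp-list@example.org")
second = make_correlation_id("rfp-list@example.org")
sent_ids.update({first, second})
print(first)
print(is_reply_to(first, sent_ids))
```

Because both endpoints see the same identifier in the SOAP message, neither depends on a Message-Id header assigned after the message has left the client.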

5.4 Asynchronicity

The fact of asynchronicity in the email binding presents some interesting issues, some of which simply expose possible weaknesses in current definitions. For instance, asynchronous delivery to potentially many (but possibly zero) subscribers leads to a much more complex interpretation of the completion state, which is elsewhere described simply as "success" or "failure." For exchanges with multiple recipients, and especially exchanges with multiple respondents, "completion" may be successful in some sense, unsuccessful in some sense, and both successful and unsuccessful in some sense.

A further issue in this regard is that email is a store-and-forward technology. This means, in effect, that delivery is not equal to receipt, although this is usually only relevant when the delivery status notification extension is available (when the sender gets notification of a delivery). This stands in strong contrast to synchronous protocols, in which one can generate an error if receipt is not equivalent to delivery.

A number of features have been proposed to support asynchronous communication and the complexities that arise therefrom. Some of the problems lead to a different description of the state machine (a terminal state called "completion" rather than two terminal states, "success" and "failure").
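
As an illustration of a single terminal state, the following Python sketch tracks a per-recipient outcome and reports one "completion" state together with counts, instead of forcing the exchange into "success" or "failure". The class names, the set of outcomes, and the shape of the summary are assumptions for illustration, not definitions taken from any proposed feature.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum

class Outcome(Enum):
    PENDING = "pending"      # sent, nothing heard yet
    DELIVERED = "delivered"  # delivery notification seen (delivery, not receipt)
    RESPONDED = "responded"  # a correlated response arrived
    FAILED = "failed"        # a failure was reported for this recipient

@dataclass
class Exchange:
    outcomes: dict = field(default_factory=dict)
    closed: bool = False

    def record(self, recipient, outcome):
        self.outcomes[recipient] = outcome

    def close(self):
        # The exchange ends in one terminal state, whatever the mix of outcomes.
        self.closed = True

    def summary(self):
        counts = Counter(o.value for o in self.outcomes.values())
        return {"state": "completion" if self.closed else "open", **counts}

ex = Exchange()
ex.record("a@example.org", Outcome.RESPONDED)
ex.record("b@example.org", Outcome.FAILED)
ex.record("c@example.org", Outcome.PENDING)
ex.close()
print(ex.summary())
```

The application, rather than the state machine, then decides whether a given mix of outcomes counts as acceptable.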

5.5 Encoding

The issue of encoding presents severe compatibility problems when services are defined to use the internet email binding. The basic problem is that XML, the data exchange format, is defined in terms of Unicode. Internet email is defined in terms of the Network Virtual Terminal, which requires no more than 7bit ASCII (and even then does not guarantee transmission of control characters).

It is possible, in a specification, to require that services use extensions that permit cleaner support of 8bit character sets. Unfortunately, it is not possible, in the current state of the TCP/IP network, to actually implement this restriction unless all of the SMTP servers are known and controlled (this would involve requiring the 8BITMIME extension, which remains incompletely supported). It is far easier to require MIME support for simple types (which this specification does), for instance.

Further problems arise in the presence of the XML declaration, which may supply an encoding (which corresponds to a charset in MIME). The interaction of potentially unsynchronized properties must be a concern of service and client developers. The issue of character sets is further complicated because of the possible transfer encodings (7bit, 8bit, binary, quoted-printable, base64), which interact with the character set definition--in a rather unpleasant fashion--if the character set is anything other than ASCII. HTTP, developed ten years and more after SMTP and able to rely upon the advances of those years, specifies an eight-bit-clean message path. SMTP and supporting protocols emphatically do not, which means that the problem surfaces at the application level.

This specification attempts to provide the tools to solve the problem, by requiring the MIME content feature and recommending the MIME composite feature. Applications should give careful consideration to the potential problems. The binding specification cannot provide full resolution of the consequences of this issue.
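
The interaction between the XML declaration, the MIME charset, and the transfer encoding can be sketched with the Python standard library `email` package. The example below is illustrative only: the addresses, the `application/xml` media type, the choice of base64, and the helper names are assumptions, and the declaration check is deliberately naive (it does not handle byte order marks or UTF-16).

```python
import re
from email import message_from_bytes, policy
from email.message import EmailMessage

# Naive match of the encoding pseudo-attribute in an XML declaration.
_DECL = re.compile(rb'^<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']')

def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    m = _DECL.match(xml_bytes)
    return m.group(1).decode("ascii").lower() if m else None

def build_message(xml_bytes, charset="utf-8"):
    """Wrap XML in a MIME entity that survives a 7bit-only message path."""
    declared = declared_encoding(xml_bytes)
    if declared is not None and declared != charset.lower():
        raise ValueError(f"XML declares {declared!r}, MIME charset is {charset!r}")
    msg = EmailMessage()
    msg["From"] = "service@example.org"      # hypothetical addresses
    msg["To"] = "subscribers@example.org"
    msg["Subject"] = "SOAP message"
    # base64 keeps every octet on the wire within ASCII.
    msg.set_content(xml_bytes, maintype="application", subtype="xml",
                    cte="base64", params={"charset": charset})
    return msg

payload = '<?xml version="1.0" encoding="UTF-8"?><note>caf\u00e9</note>'.encode("utf-8")
wire = build_message(payload).as_bytes()
received = message_from_bytes(wire, policy=policy.default)
print(wire.isascii(), received.get_content() == payload)
```

Rejecting a mismatch at construction time is only one possible policy; a service might instead rewrite the declaration or omit it.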

5.6 Return Paths and Alternate Routing

The problem of addressing in email has already been discussed. The issue of return paths is a part of that problem. However, the issue of alternate routing of error messages is also worth raising. It is very common, in mailing lists, that errors are reported to an address other than the distribution (and often other than the administration) address.

This sort of requirement, together with problems of reporting errors at all in one-way and notification message exchange patterns, strongly motivates the development of a separate feature to identify the preferred destination of error messages (at the SOAP level in this case). The failure destination feature is therefore recommended.

Services making use of the email binding may wish to consider additional issues of routing, and possibly adopt features that support extended routing properties.

6. Security Considerations

Principle: do not create an SMTP server as part of a SOAP application, unless you fully understand how to protect it from abuse.

The more general principle: do not create a messaging server or router as part of a SOAP application. It is perfectly possible to create a SOAP service which interacts with messaging routers as a client, rather than replicating the functionality of a router. This principle needs to be stated up front, as creation of a server for HTTP is a fairly common design for SOAP applications. The fundamental difference is that an HTTP server only responds to the initiator. SMTP, and messaging routers in general, provide little response to the initiator, but then produce potentially significant amounts of material to send to target addresses. Failure to take into account any of hundreds of nuances in server design can easily lead to mail loops, painfully prolonged and repetitive attempts to deliver the undeliverable, and network floods. It is therefore strongly recommended that SOAP applications binding to internet email never expose a public SMTP server to the internet. The logs of the sendmail, postfix, and qmail MTA mailing lists should provide adequate support for this position.

If this principle is followed, then the primary security considerations for a SOAP application bound to internet email must take cognizance of issues in SMTP, POP, IMAP, etc., but need not revisit the entire corpus of security alerts related to internet email. Instead, the focus in the points that follow is on security issues that are of particular applicability to SOAP applications. Developers who wish to implement general SMTP servers as part of a SOAP application are directed to the raft of security alerts and issues associated with SMTP as well.

6.1 SMTP Forgery

SMTP does not provide an authentication model. As a rule, SMTP servers will always accept mail for a "local" address. Some may have authentication routines to permit local addresses to send mail out, but this is by no means universal. As a rule, SMTP servers are perfectly willing to accept forged "From" headers (which are typically associated with the origination address).

Virtually every service delivered via internet email must address this issue to some degree. The service may resolve the issue as simply as stating that it is an application responsibility, or it may require certain SMTP features to be implemented by all servers used in mail transmission (an unlikely requirement, unless the system is controlled end to end). Origination addresses in the SMTP/Internet Message Format headers cannot be relied upon. Envelope addresses in SMTP cannot be relied upon. Email forgery is trivial, and the protocols do not supply broadly deployed solutions. The service or application must address the issue, if the questions of authentication and repudiation are significant in service context.
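
As one example of treating the problem as an application responsibility, a service and its subscribers could share a secret and attach a keyed digest of the SOAP payload. This is a sketch under that assumption only; it is not something this binding specifies, the function names are hypothetical, and a shared secret provides authentication between the parties but not non-repudiation.

```python
import hashlib
import hmac

def sign(secret, body):
    """Compute a keyed digest over the exact payload bytes."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret, body, tag):
    """Check the digest in constant time; the From header is never consulted."""
    return hmac.compare_digest(sign(secret, body), tag)

secret = b"shared-out-of-band"            # assumed to be distributed securely
body = b"<Envelope><Body>bid</Body></Envelope>"
tag = sign(secret, body)
print(verify(secret, body, tag))
print(verify(secret, body + b" ", tag))
```

Where and how the digest travels with the message (a SOAP header block, for instance) is left open here.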

6.2 Traffic Analysis

Internet email traffic is vulnerable to traffic analysis. There are a few methods for disguising the fact that two nodes are communicating, but they are not very effective. Services should not be created which rely upon resistance to traffic analysis, because internet email hasn't any. Services may be built which can compensate for this, but such enhancements are well out of scope for discussion here.

6.3 MRA Authentication Weaknesses

Mail retrieval agents (POP, IMAP, mbox, Maildir, mh, and proprietary clients) have a long and inglorious history of abysmal security. Most authentication protocols are clear-text (username and password passed in clear text over the network). Many existing servers (POP, IMAP, or machine login) do not implement strong authentication protocols, or when they attempt to do so, fail.

Confidentiality may then be breached at the point of retrieval. Moreover, with some protocols, bogus messages may be inserted at this point. As with message origination address forgery, the greater part of the responsibility for verification of received/retrieved messages is left to the SOAP application.

6.4 Storage Vulnerabilities

Depending upon implementations, messages transiting a service implemented over internet email may be vulnerable to snooping, tampering, destruction, replacement, or injection. This may happen, as previously noted, at the origination point and at the point at which the mail retrieval agent acquires the message from its delivery point, but as SMTP is a store-and-forward technology, it may also occur at multiple additional points along the route. This is not, from the attacker's point of view, an ideal opportunity, but if no others exist, it may nonetheless be attractive.

At any point at which a message is stored along the route from sender (or malicious injector) to receiver, it may be vulnerable to modification, and is almost certainly vulnerable to snooping. Certain forms of snooping (target address, for instance) are nearly impossible to defend against (the address has to be available in order for delivery to take place). For the remaining issues, the internet email system provides very little help to the application. It is up to the service to utilize external features (such as SAML, WS-Security, and the like) in order to achieve the level of security, authentication, and non-repudiation required by the service.

6.5 UCE, Viruses, and Other Issues of Malicious Abuse

Unsolicited Commercial Email (more generally, Unsolicited Bulk Email (UBE), or colloquially "spam") is a relatively significant problem in the current internet email environment. SOAP services that implement the publish/subscribe model thereby create lists of valid email addresses (albeit addresses that may not be monitored by live humans). Exposure of such a list is likely to result in a flood of spam to the subscribers.

This presents a number of security issues for consideration. First, a SOAP service or subscribed client MUST be able to handle (by discard, if nothing else) messages which do not conform to the SOAP specification. Second, such services and clients MUST be able to extract administrative control messages, and route them to an appropriate authority for resolution (recognizing root, MAILER-DAEMON, and Postmaster addresses is a minimum requirement). A service SHOULD NOT remail blindly (see the first principle of binding to internet email, above). Finally, services and subscribed clients SHOULD have recovery plans to deal with what amount to denial-of-service attacks performed by clueless bozos determined to MAKE MONEY FAST.
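
The first two requirements can be sketched as a triage step applied to each retrieved message. The Python below is an illustration under stated assumptions: the function name and return values are hypothetical, multipart messages are simply discarded, any root element with the local name `Envelope` is accepted regardless of namespace, and the XML parser is not hardened against hostile input.

```python
import xml.etree.ElementTree as ET
from email import message_from_bytes, policy
from email.utils import parseaddr

# Minimum set of administrative senders named in the text above.
ADMIN_LOCAL_PARTS = {"root", "mailer-daemon", "postmaster"}

def triage(raw):
    """Return 'admin', 'soap', or 'discard' for one retrieved message."""
    msg = message_from_bytes(raw, policy=policy.default)
    # The From header can be forged; it is used for routing, not for trust.
    _, addr = parseaddr(str(msg.get("From", "")))
    if addr.split("@")[0].lower() in ADMIN_LOCAL_PARTS:
        return "admin"
    if msg.is_multipart():
        return "discard"   # simplification for this sketch
    try:
        root = ET.fromstring(msg.get_payload(decode=True))
    except (ET.ParseError, ValueError):
        return "discard"   # not XML at all, e.g. spam
    local_name = root.tag.rsplit("}", 1)[-1]
    return "soap" if local_name == "Envelope" else "discard"

bounce = b"From: MAILER-DAEMON@mx.example.org\r\nSubject: Undelivered\r\n\r\nfailed\r\n"
spam = b"From: x@example.com\r\nSubject: MAKE MONEY FAST\r\n\r\nbuy now\r\n"
soap = (b"From: svc@example.org\r\nContent-Type: application/xml\r\n\r\n"
        b'<env:Envelope xmlns:env="urn:example:soap"><env:Body/></env:Envelope>\r\n')
print(triage(bounce), triage(spam), triage(soap))
```

Messages classified as "admin" would be routed to an appropriate authority, and nothing is remailed as a side effect of classification.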

Apart from the greed that drives marketers to use customers' money, time, and equipment to reduce their own costs, absolute malice seems to drive other abusers. This typically takes the form of worms, trojans, and viruses; email is possibly the most common distribution mechanism. While most SOAP clients are unlikely to be the security problems that some user-facing clients are, SOAP clients (and servers) are expected to execute programs based on received content. Developers should be careful that this does not create paths for delivery of any form of malware, either to themselves or to correspondents.

Appendix. List-based Market Service Example (non-normative)

[Author's note: a discursive description is to be placed here. The service sends out RFPs, and has an associated administrative service to acquire and remove subscribers. Each RFP is a solicit-response exchange; termination of bidding is indicated by a notification. Fulfillment is currently out of scope and is to be added later.]

A.1 Requirements

A.2 Service Design

A.3 Service Description

A.4 Service Usage Example


Amelia A Lewis
Last modified: Wed Oct 23 14:12:26 EDT 2002