Robustness Considerations for XML Protocol Applications

Phillip Hallam-Baker (VeriSign Inc.)

Abstract

We consider two issues that arise in the design of XKMS: the avoidance of deadlock and livelock, and security. These issues have general applicability to most XML Protocol applications.

Avoidance of Deadlock and Livelock

[See Roscoe & Dathi, The Pursuit of Deadlock Freedom, Oxford Programming Research Group]

Problem

Deadlock and livelock are both global conditions that may arise in a network and that either prevent the network from doing any further work or significantly degrade its performance. These conditions are defined formally in the manner of Roscoe and Dathi [Roscoe] as follows:

Deadlock
The condition deadlock occurs in a network if and only if there exists a blocked cycle of un-granted requests.
Livelock
The condition livelock occurs in a network if and only if there exists an unbounded cycle of un-granted requests.

For example, Web Service A receives a request which it forwards to B, which in turn forwards (i.e. chains) the request to C, which forwards the request back to A. If the Web Service at A can only accept one request at a time, the cycle of requests will deadlock. If A can handle multiple requests at once it will forward the new request on to B, which will forward it to C, and so on until the resources of one service in the cycle are exhausted and the service fails.

Livelock and deadlock conditions may also occur across Web Service specifications. For example, Web Service A may receive an XKMS request that results in a directory lookup to Web Service B, which in turn results in a service mapping request to Web Service C, which makes an XKMS request of A. It is important therefore that the issues of deadlock and livelock be considered at the XML Protocol level and not just at the level of individual service specifications.

The term 'chaining' is thus preferred to 'forwarding' in the context of this problem, since a forwarded request is implicitly identical to the original, while a chained request may be any request made in order to answer another.

Solutions

Deadlock and livelock are both global conditions and cannot be detected by inspection of the state of a single node. There are however well known approaches that may be used to avoid deadlock and livelock conditions.

Time Out

The simplest approach (and the least desirable) to addressing these conditions is to use a time-out mechanism to ensure that, if a deadlock or livelock does occur, its duration is limited. Each service defines a time-out value, being the maximum amount of time that the service will allow for servicing a query.

Time-outs nevertheless introduce many undesirable service properties: a deadlocked cycle continues to hold resources until the time-out expires, and any work already performed on the request is wasted.
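The following sketch illustrates the time-out approach, assuming each query is serviced on a worker thread; the names issue_chained_request and SERVICE_TIMEOUT_SECONDS are illustrative rather than drawn from any specification.

  import concurrent.futures

  SERVICE_TIMEOUT_SECONDS = 30
  _pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)

  def service_query(request, issue_chained_request):
      """Service a query, returning a failure if it exceeds the time-out."""
      future = _pool.submit(issue_chained_request, request)
      try:
          return future.result(timeout=SERVICE_TIMEOUT_SECONDS)
      except concurrent.futures.TimeoutError:
          # The cycle of un-granted requests is broken only when the time-out
          # expires, and the work already performed on the request is wasted.
          future.cancel()
          return {"status": "Failure", "reason": "TimeLimitExceeded"}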

Never Block

Deadlock may be avoided by coding Web Service implementations such that the Web Service never blocks. Whenever a Web Service request is received the Web Service MUST determine whether there are sufficient resources to satisfy the request. If the resources are insufficient to satisfy the request the Web Service MUST issue a failure response.
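A minimal sketch of the never-block rule follows, assuming a fixed pool of request slots; MAX_CONCURRENT_REQUESTS and handle_request are illustrative names.

  import threading

  MAX_CONCURRENT_REQUESTS = 16
  _slots = threading.BoundedSemaphore(MAX_CONCURRENT_REQUESTS)

  def accept_request(request, handle_request):
      """Either service the request immediately or fail; never wait for resources."""
      if not _slots.acquire(blocking=False):
          # Insufficient resources: issue a failure response rather than block.
          return {"status": "Failure", "reason": "InsufficientResources"}
      try:
          return handle_request(request)
      finally:
          _slots.release()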

Time To Live / Hop Count

Internet Protocol avoids livelock conditions through the use of a Time To Live (TTL) counter. The initial sender sets the TTL counter to a positive value. The TTL counter is decreased each time a packet is transferred by a router. Packets with a TTL count of zero are discarded. Although the TTL count in IP is nominally a 'time out' limit for the packet, it is the bound on the maximum hop count that avoids the occurrence of livelock conditions.

Web Service applications that issue chained requests may implement a TTL counter as follows:

  1. If a Web Service request is received that does not specify a TTL counter, the Web Service specifies a positive TTL counter in all chained requests.
  2. If a Web Service request is received that specifies a positive TTL counter, the Web Service specifies a lower valued TTL counter in all chained requests.
  3. If a Web Service request is received that specifies a TTL counter of zero, the Web Service MUST NOT make any chained requests. If a chained request would be necessary to satisfy the request, the Web Service MUST return an error indicating that the TTL limit has been exceeded.
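The following sketch applies the three rules above; DEFAULT_TTL and the representation of the TTL counter are assumptions made for illustration only.

  DEFAULT_TTL = 8

  class TTLExceeded(Exception):
      """Raised when a chained request is needed but the TTL is exhausted."""

  def ttl_for_chained_request(incoming_ttl):
      """Compute the TTL value to place in any chained request, per rules 1-3."""
      if incoming_ttl is None:      # Rule 1: sender specified no TTL counter.
          return DEFAULT_TTL
      if incoming_ttl > 0:          # Rule 2: pass on a strictly lower value.
          return incoming_ttl - 1
      # Rule 3: TTL of zero, chained requests are not permitted.
      raise TTLExceeded("TTL limit exceeded; no chained request may be made")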

Message Path

One disadvantage of the TTL method is that it requires upstream Web Services to anticipate the depth of chained requests that downstream Web Services may need to issue. Use of a message path mechanism analogous to the SMTP mail forwarding headers permits a more adaptive approach.

Using the message path approach, each Web Service that issues a chained request inserts an element into a message path descriptor in the request header. If a Web Service receives a request that includes itself in the path, the request is rejected.
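As a sketch of this mechanism, assume the message path is carried as a list of service URIs in the request header; the field name MessagePath is illustrative.

  class LoopDetected(Exception):
      """Raised when this service already appears in the message path."""

  def prepare_chained_request(request, own_service_uri):
      """Reject looping requests and append this service to the path."""
      path = request.setdefault("MessagePath", [])
      if own_service_uri in path:
          raise LoopDetected(own_service_uri)
      path.append(own_service_uri)
      return request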

The principal difficulty with the message path approach is that Web Service requests are not semantically equivalent in the same manner as SMTP email messages. If a mail agent finds that it is receiving a mail message that it previously forwarded, an error has occurred. This is not the case with Web Service requests: a cycle in a communication graph may in some circumstances be desirable; it is only if the content of the requests is semantically equivalent that the possibility of livelock is introduced.

Security

[There is more to security than signing and encrypting at random; the mechanisms applied must be matched to the specific threats they are intended to counter]

Threats

[The principal threats to be considered are as follows]

Disclosure

Request Substitution

Protocol Substitution

Request Replay [No clear solution for this yet]

Response Replay

[Note, however, that in certain contexts a response served from cached data may be acceptable]

Repudiation

Controls

XML Encryption and XML Signature

Service Identifier

The <ServiceIdentifier> element provides a means of defeating Protocol Substitution attacks.

The <ServiceIdentifier> element is included within the authenticated part of both the request and response and has the value of the service URI of the Web Service. Responders MUST reject any request that has a <ServiceIdentifier> value that is other than that of the port selected. Initiators MUST reject any response that has a <ServiceIdentifier> value that is different to that specified in the request.
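A sketch of the two checks follows, assuming the <ServiceIdentifier> value is exposed to the application as a simple field of the parsed message; the accessor names are illustrative.

  def responder_check(request, selected_port_uri):
      """Responder rule: reject requests addressed to a different service URI."""
      if request["ServiceIdentifier"] != selected_port_uri:
          raise ValueError("ServiceIdentifier does not match the selected port")

  def initiator_check(request, response):
      """Initiator rule: reject responses whose identifier differs from the request."""
      if response["ServiceIdentifier"] != request["ServiceIdentifier"]:
          raise ValueError("ServiceIdentifier in response does not match request")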

Transaction Identifier

The <TransactionIdentifier> element provides a means of defeating Response Replay attacks. It is included in a request if the initiator requires that the response be fresh and not from cached data.

The <TransactionIdentifier> element is included within the authenticated part of both the request and response and has a nonce value with the property that the probability of two requests containing the same nonce value by chance is negligible. Initiators MUST reject any response that has a <TransactionIdentifier> value that is different to that specified in the request.
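A sketch of nonce generation and the freshness check follows; the 128-bit nonce length is an assumption, chosen so that the probability of an accidental collision is negligible.

  import secrets

  def new_transaction_identifier():
      """Generate a fresh nonce to include in the authenticated request."""
      return secrets.token_hex(16)      # 128 bits of randomness

  def check_response_freshness(request_nonce, response_nonce):
      """Initiator rule: the response must echo the request's nonce exactly."""
      if response_nonce != request_nonce:
          raise ValueError("TransactionIdentifier mismatch: possible replayed response")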

Authenticated Requests

In many cases requests are authenticated as a matter of course. In particular any Web Service that requires some form of payment or other authorization of requests will require full authentication of the request.

Non-Authenticated Requests

In a limited set of circumstances it is possible to dispense with authentication of the service request. For this to be possible it is necessary that:

For example the following Web Services are not suitable for use with non-authenticated requests:

The principal difficulty with using non-authenticated requests is that it is difficult to ensure that implementations do not have implicit dependencies on the information sent in a response. The use of this form is therefore deprecated, except for the case in which an authenticated response from cached data is to be issued.

Deferred Authentication of Requests

Deferred authentication allows a digital signature on the response to be used to authenticate a request. For this to be possible it is necessary that:

The deferred authentication protocol is as follows:

  1. The Initiator authenticates the request using a message digest function and includes the value in the request.
  2. The Responder verifies that the message digest value of the request is correct.
  3. The Responder includes the message digest value of the request in the signed response.
  4. The Initiator verifies that the message digest value specified in the response matches that of the request.
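A sketch of the four steps follows, using SHA-256 as the message digest function (an assumption; any suitable digest could be used) and leaving the signature mechanism abstract; the field names are illustrative.

  import hashlib

  def request_digest(request_bytes):
      """Step 1: the Initiator computes a digest over the request it sends."""
      return hashlib.sha256(request_bytes).hexdigest()

  def responder_build_response(request_bytes, claimed_digest, sign, body):
      """Steps 2-3: verify the request digest, then echo it in the signed response."""
      if hashlib.sha256(request_bytes).hexdigest() != claimed_digest:
          raise ValueError("Request digest does not match the received request")
      response = {"Body": body, "RequestDigest": claimed_digest}
      # The signature covers the response body and the echoed request digest.
      response["Signature"] = sign(response["Body"], response["RequestDigest"])
      return response

  def initiator_verify(request_bytes, response):
      """Step 4: confirm the signed response refers to the request actually sent."""
      if response["RequestDigest"] != hashlib.sha256(request_bytes).hexdigest():
          raise ValueError("Signed response does not correspond to this request")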

An implementation of the deferred authentication protocol requires only marginally greater resources than non-authenticated requests but entirely avoids the problem of implementation errors introducing subtle dependencies on unauthenticated data.

Summary

The choice of security mechanism is largely determined by the need to support non-repudiable responses and/or caching. If non-repudiable responses are required it will be necessary to digitally sign the responses, which in turn makes it likely that cached responses will also need to be supported. The countermeasures appropriate to each threat are summarized below.

  Request Substitution (Always)
    Do not support caching:       Authenticated Request or Deferred Authentication
    Support caching / forwarding: Authenticated Request using Digital Signature or HMAC

  Request Replay (Sometimes)
    Do not support caching:       Authenticated Request or Deferred Authentication
    Support caching / forwarding: Authenticated Request

  Response Replay (Sometimes)
    Do not support caching:       Use <TransactionIdentifier> element
    Support caching / forwarding: Not Applicable

  Request Repudiation (Rare)
    Do not support caching:       Authenticated Request using Digital Signature or HMAC
    Support caching / forwarding: Authenticated Request using Digital Signature or HMAC

  Response Repudiation (Sometimes)
    Do not support caching:       Authenticate Response with Digital Signature or with HMAC
    Support caching / forwarding: Authenticate Response with Digital Signature and Archive Response

  Protocol Substitution (Always)
    Do not support caching:       Use <ServiceIdentifier> element
    Support caching / forwarding: Use <ServiceIdentifier> element

References

[Roscoe]    Roscoe, A. W. and Dathi, N., "The Pursuit of Deadlock Freedom", Oxford Programming Research Group.