Noah Mendelsohn - IBM Research, Cambridge, MA
For the W3C Technical Architecture Group (TAG)
4 January 2006
Use case 1: A printer controller with a Web server
Use case 2: A printer controller with Web Services
Use case 3: The same resource supported using the Web and Web Services
The TAG very much appreciates the opportunity to participate in this workshop. Although a few TAG members have direct experience building and supporting enterprise-grade networking systems, most of us have far deeper knowledge of the World Wide Web and of the technologies that have been used to build it. Accordingly, our primary interest in attending the workshop is to learn from the many participants who have greater experience in building and deploying enterprise-grade networking systems. We also hope it will be useful to contribute some of the insights we've gained in designing and guiding the Web itself, and to participate in a dialog regarding the tradeoffs to be made in coordinating Web Services (WS) technologies with core World Wide Web technologies such as URIs and HTTP.
This white paper is intended to set out a few of the issues as we understand them, and to share some ideas about architectural tradeoffs. We do not attempt here to suggest what "the right answers" should be, but rather to offer some ideas that we hope will promote useful discussion. In keeping with the overall style of the workshop, we focus mainly on analyses motivated by use cases, and conclude with some discussion of the implications. Specifically, we ask the question: should WS and the Web be disjoint systems that share some technology, or should the two be more tightly integrated?
To explore that question, we present as use cases three variations on the same theme: providing Web and/or WS-based control and query of an Internet-connected printer. The first use case discusses a traditional Web-based control interface; the second explores the characteristics of a pure WS-based approach; the third presents a printer that supports both interfaces simultaneously.
Our first use case involves an Internet-connected printer that can be queried and controlled using standard Web protocols. The printer is identified by a URI, and it includes a simple Web server that supports HTTP. The following capabilities are supported:
http://example.org/printers/printer1
seems to be hung, can you fix it?".
When the link is clicked, e-mail readers automatically launch browsers or other user agents, which in turn use HTTP GET to access the printer.
http://example.org/printers/printer1?parameter=tonerLevel+tonerColor=blue
Some of the lessons we can learn include:
These benefits arise because the printer is identified by the same URI regardless of the reason it's referenced, and because the same widely-deployed HTTP protocol (and in particular the HTTP GET operation using content negotiation) provides uniform access for a wide variety of purposes.
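As a minimal sketch of that uniform access, the following Python builds (without sending) two HTTP GET requests for the same printer URI that differ only in their Accept header. The helper name `build_status_request` and the choice of `application/xml` as the machine-readable media type are assumptions made for illustration; the paper does not prescribe them.

```python
from urllib.request import Request

# The printer URI used throughout use case 1.
PRINTER_URI = "http://example.org/printers/printer1"


def build_status_request(uri: str, accept: str) -> Request:
    """Build (but do not send) an HTTP GET that asks for one representation."""
    return Request(uri, headers={"Accept": accept}, method="GET")


# A browser-style request for a human-readable status page.
html_request = build_status_request(PRINTER_URI, "text/html")

# A program-style request for a machine-readable form
# (the media type here is an assumption, not taken from the paper).
xml_request = build_status_request(PRINTER_URI, "application/xml")

for req in (html_request, xml_request):
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Both requests name the same resource and use the same safe operation; only the requested representation changes.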
What are some disadvantages of the Web approach? The Web architecture is somewhat less helpful in establishing a standard means of documenting and controlling printer-specific operations and parameters. For example, there is today no standard means by which a program development tool will discover that some particular HTTP POST will cause the print queue to be purged (though such descriptions could be built using RDF). As we'll see, the Web also does not provide some of the other richer qualities of service that WS enables, and it provides fewer facilities for exploiting enterprise protocols other than HTTP.
Our second use case employs only Web Services to control and query the printer. The printer is part of a cluster, and a single URI http://example.org/PrinterCluster1 identifies the entire cluster. All printers on the network can be queried and manipulated using SOAP-based Web service operations, each of which is invoked by a SOAP envelope sent using HTTP POST.
Individual printers are named using WSA End Point References (EPRs) such as:
<wsa:To>
  <wsa:address>http://example.org/PrinterCluster1</wsa:address>
  <prt:printerBuilding>Building1</prt:printerBuilding>
  <prt:printerNumber>3</prt:printerNumber>
</wsa:To>
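The following Python sketch builds and reads back an address of this shape, to make concrete that the URI alone names only the cluster and that the extra elements are needed to pick out one printer. The element names mirror the fragment above rather than the exact WS-Addressing schema, the `prt` namespace URI is hypothetical, and the function names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

WSA_NS = "http://www.w3.org/2005/08/addressing"  # WS-Addressing 1.0 namespace
PRT_NS = "http://example.org/ns/printer"         # hypothetical printer namespace


def build_printer_address(cluster_uri: str, building: str, number: int) -> ET.Element:
    """Build an address element shaped like the fragment in the text."""
    to = ET.Element(f"{{{WSA_NS}}}To")
    ET.SubElement(to, f"{{{WSA_NS}}}address").text = cluster_uri
    ET.SubElement(to, f"{{{PRT_NS}}}printerBuilding").text = building
    ET.SubElement(to, f"{{{PRT_NS}}}printerNumber").text = str(number)
    return to


def identify_printer(to: ET.Element) -> tuple:
    """Return (cluster URI, building, printer number) from the address."""
    return (
        to.findtext(f"{{{WSA_NS}}}address"),
        to.findtext(f"{{{PRT_NS}}}printerBuilding"),
        int(to.findtext(f"{{{PRT_NS}}}printerNumber")),
    )


address = build_printer_address("http://example.org/PrinterCluster1", "Building1", 3)
print(ET.tostring(address, encoding="unicode"))
print(identify_printer(address))
```

A Web user agent that is handed only the URI from this structure would reach the cluster, not printer 3.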
Each printer supports a standard set of operations, such as "PurgePrintJob", and the associated WSDL defines the names and types of the parameters required for each. Programming tools use the WSDL to provide lists of available printer operations, to prompt for the necessary control parameters, and to facilitate interpretation of response messages.
Using higher level WS* constructions, a printer can reliably implement operations that may take days to complete. For example, a signal might be sent using SOAP indicating that the printer should respond when a human maintenance operator has performed preventive maintenance on the printer. The response can be addressed using <wsa:replyTo>, and both the request and response can be reliably delivered even if intermittent network outages are encountered.
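A rough sketch of such a request in Python is shown below. It builds a SOAP 1.2 envelope whose header carries a reply address and wraps it in an HTTP POST that is constructed but never sent. The operation name `NotifyWhenMaintained`, the `prt` namespace, and the reply URI are hypothetical; the addressing elements use the WS-Addressing 1.0 spellings (`ReplyTo`, `Address`); reliable delivery itself is not modeled here.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
WSA_NS = "http://www.w3.org/2005/08/addressing"      # WS-Addressing 1.0 namespace
PRT_NS = "http://example.org/ns/printer"             # hypothetical printer namespace

CLUSTER_URI = "http://example.org/PrinterCluster1"
REPLY_URI = "http://example.org/maintenance/replies"  # hypothetical reply endpoint


def build_maintenance_request(cluster_uri: str, reply_uri: str) -> bytes:
    """Serialize a SOAP envelope asking for a reply once maintenance is done."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = cluster_uri
    reply_to = ET.SubElement(header, f"{{{WSA_NS}}}ReplyTo")
    ET.SubElement(reply_to, f"{{{WSA_NS}}}Address").text = reply_uri
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{PRT_NS}}}NotifyWhenMaintained")  # hypothetical operation
    return ET.tostring(envelope, encoding="utf-8")


payload = build_maintenance_request(CLUSTER_URI, REPLY_URI)

# The envelope travels in an HTTP POST; the request is built here, not sent.
request = Request(
    CLUSTER_URI,
    data=payload,
    headers={"Content-Type": "application/soap+xml; charset=utf-8"},
    method="POST",
)

# urllib stores header names in capitalized form, hence "Content-type".
print(request.get_method(), request.get_header("Content-type"))
```

Because the reply address travels inside the message, the eventual response need not return on the same HTTP connection that carried the request.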
If desired, WS Security can be used not just to authenticate and secure the connection to the printer, but to sign the "purge" operation with the private key of the particular system manager who issued the request.
Printer control operations can also be transmitted using non-HTTP networks, such as the Java Message Service (JMS) or proprietary reliable queuing systems (IBM WebSphere® MQ, Microsoft Message Queuing, SonicMQ®, etc.).
Among the disadvantages of WS, when used in this style, are the inability to fully leverage deployed HTTP infrastructure, Web user agents, databases, etc. Printers named with EPRs cannot directly be accessed from a Web browser, and an EPR sent in an e-mail message is unlikely to be as useful as a URI would be. Even if the printer is named with a URI, it is HTTP GET as opposed to POSTing of a SOAP envelope that is the default safe query operation performed by Web user agents when following a link. Although WS Transfer might provide a degree of uniformity in performing status queries for WS, it would do so in a manner that is parallel to rather than integrated with the Web's HTTP GET mechanism. How will browsers know which resources to access using WS Transfer GET, and which to access using HTTP?
The tradeoffs exposed above should not be surprising. As has been repeatedly discussed under the banner "Network Effects" or sometimes "Metcalfe's Law", the value of each resource in a network tends to grow very rapidly as the number of interconnected resources and users increases. The printer in use case 1 was more valuable because browsers could control it, because users could e-mail links to it, and so on. The value of a user's Web page might be enhanced because it could contain a link to his or her favorite printer. So, two or more separate networks, one the World Wide Web and the others built from WS, are likely to be of lower value than a single network integrating the same resources.
So, it's very tempting to ask: can we, at least for some resources, preserve the benefits of WS while fully integrating those resources into the Web? This third use case explores one approach that might work for some resources. We again focus on the network-enabled printer, this time supporting it simultaneously using Web and Web Services technologies:
http://example.org/printers/printer1
), as in use case 1.
application/soap+xml, the printer responds with a SOAP envelope containing device status. The envelope can contain SOAP headers with, for example, digital signatures to confirm that the status was really sourced from the printer, encryption to protect sensitive information, etc. Note that the SOAP Recommendation provides for such use of HTTP GET, though support for it is not widely deployed today.
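One way to picture a single URI serving both audiences is the small Python sketch below, which selects between an HTML page and a SOAP envelope for the same printer. It assumes the media type is requested through the HTTP Accept header, ignores wildcard ranges such as `*/*`, and uses placeholder representation bodies; the function names are illustrative, and none of this is taken from an actual printer implementation.

```python
# Placeholder representations of the same printer resource.
REPRESENTATIONS = {
    "text/html": "<html><body>Printer status: idle</body></html>",
    "application/soap+xml": (
        '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
        "<env:Body><status>idle</status></env:Body></env:Envelope>"
    ),
}


def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equally weighted types keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header: str, default: str = "text/html") -> str:
    """Pick the best supported media type, falling back to the default."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in REPRESENTATIONS:
            return media_type
    return default


def respond(accept_header: str) -> tuple:
    """Return (media type, body) for a GET on the printer URI."""
    media_type = choose_representation(accept_header)
    return media_type, REPRESENTATIONS[media_type]


print(respond("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8")[0])
print(respond("application/soap+xml")[0])
```

A browser following an e-mailed link and a SOAP-aware management tool can then dereference the same URI and each receive a form it understands.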
Although the approach outlined in use case 3 has many desirable characteristics, it involves a number of tradeoffs and is probably not appropriate for every WS resource. First of all, Web Services are designed to be usable with network infrastructures such as reliable queuing systems (e.g. JMS, WebSphere MQ, MSMQ), hardware or software publish/subscribe buses, etc., none of which are today well supported by the Web. Some resources will require those non-HTTP interfaces.
Tooling support for complex resource-specific interfaces also seems to be further advanced in WS. Although some of the recent work on Web description languages (e.g. WADL or the WSDL 1.2 HTTP Binding) is promising, WSDL and SOAP together currently provide a much richer design-time framework to enable convenient tooling, especially for interacting with services that involve large numbers of operations and parameters.
Indeed, use case 2 makes the point that the RESTful approach of assigning a URI to every resource can be inconvenient, particularly when large numbers of properties with parameterized names would each have to be represented in its own URI. It's relatively easy to motivate giving the printer its own URI; it's somewhat harder to decide whether to also assign a separate URI to each paper drawer in the printer, to each part of the printer that can jam, etc. Web Services does provide alternatives for addressing such very fine-grained resources. These approaches sacrifice deep integration with the Web in favor of (potentially) increased user convenience. Specifically, typed WS operations such as getDrawerStatus(drawerNumber) can be used, or the resource (in this case the drawer) can be identified using an EPR reference parameter as shown in use case 2.
End Point References also have some potentially useful characteristics relating to ease of processing using SOAP software, to the fact that reference parameters are position independent (position is significant in URI query strings), and to the fact that parameters carry fully qualified names (QNames).
Notwithstanding these desirable properties of Web Services, network effects (Metcalfe's Law) remain a powerful incentive to integration with the World Wide Web. Specifically, use case 3 illustrates important advantages that come from naming even WS resources with URIs, from supporting access using both HTTP GET and POST, from enabling HTTP GET support in SOAP, and from using content negotiation where appropriate. Note that the assigned URIs can be encapsulated in EPRs for use with WS Addressing; what's essential is that any accompanying reference parameters be used only for metadata (e.g. a signature to ensure the integrity of the URI) and not to further identify the resource (e.g. the name of the printer in the cluster). Although this hybrid approach will not be the right answer for all WS resources, it does appear to be an important idiom for many. Certainly we believe that this approach deserves serious consideration.
In fact, the TAG has worked over several years with other W3C workgroups to enable just this sort of exploitation of Web technologies with WS. The TAG issue whenToUseGet-7 tracks the TAG's efforts to ensure that HTTP GET is appropriately used. Under the banner of that issue, the TAG worked in 2002 with the W3C XML Protocol WG to ensure that SOAP would indeed support both HTTP GET and the appropriate use of URIs to identify resources. Issue endPointRefs-47 focuses on the coexistence of WSA Endpoint References with URIs as identification mechanisms in WS*. The TAG's work with the WSA workgroup led to inclusion in the WS Addressing 1.0 - Core of an admonition to avoid the use of reference parameters to identify resources.
The W3C recently accepted a WS-related submission titled: Web Services Transfer (WS-Transfer). Although no workgroup activity on WS Transfer is currently planned, the W3C team commented in accepting the submission on several technical issues raised by WS Transfer, and suggested that the TAG should investigate them. The issues mentioned include the overlap in the services provided by WS Transfer and by HTTP, and also the use of EPRs for resource identification in WS Transfer. A detailed discussion of WS Transfer is beyond the scope of this note, and the TAG is in any case just beginning serious discussion of it. We hope to learn more about use cases for WS Transfer at the workshop.
Ten years ago, it would have been unusual to find a Web server embedded in a printer. Indeed, if one had asked a printer manufacturer about including such a capability, one might have gotten a quite puzzled response: "We know how to control printers. The Web is for getting stock quotes and reading news reports. They're different." Yet today, it's common to find Web servers embedded in printers. Web Services resources, however, are often not enabled for Web access. In this note we ask whether those resources too might benefit from better Web support. Given that much of the WS stack already uses Web technologies such as URIs and HTTP, we also ask a related question: are those Web technologies being used by WS in a way that maximizes value?
Although we approach this workshop with open minds and a desire to learn from many perspectives, the TAG has a particular responsibility to promote the appropriate use of the Web itself. The value of network effects is extraordinary when hundreds of millions of resources are interconnected on a global scale. There is also great value in exploiting the nearly ubiquitous deployment of Web proxies, user agents, Web-enabled databases, and other tools, all of which depend on appropriate use of Web technologies such as URIs, HTTP GET, etc. So, we think that integration of Web and WS technologies is worth at least very careful thought. The best answer may well be, in some or even all cases, not to pursue such integration, but it's an important decision. At the very least, we hope this note has been useful in setting out some of the tradeoffs to be considered.