W3C

Comparison of EPRs and URIs

Draft 29 November 2004

This version:
http://www.w3.org/2004/11/epr_uris
Editor:
HH, marginally contributing to DaveO's work

Abstract

This document provides a comparison of EPRs and URIs.

Status of this Document

This is a first draft. It is heavily based on work done by Dave Orchard and on the discussion thread that followed.


Short Table of Contents

1. Introduction
2. Motivation for EPRs
3. Comparison
4. Conclusion


Table of Contents

1. Introduction
2. Motivation for EPRs
3. Comparison
    3.1 Performance
        3.1.1 Advantages of URIs
    3.2 Scalability
    3.3 Simplicity
        3.3.1 Advantages of URIs
        3.3.2 Advantages of EPRs
    3.4 Evolvability
        3.4.1 Advantages of EPRs
    3.5 Visibility
        3.5.1 Advantages of URIs
        3.5.2 Advantages of EPRs
    3.6 Security
        3.6.1 Advantages of EPRs
    3.7 Reusability
        3.7.1 Advantages of URIs
    3.8 Real-World
        3.8.1 Advantages of EPRs
        3.8.2 Advantages of URIs
4. Conclusion


1. Introduction

Web Services Addressing introduces the concept of an endpoint reference (EPR) to refer to Web service endpoints. EPRs use XML constructs in addition to a URI to identify the destination of messages, while the architecture of the World Wide Web uses URIs alone to identify resources.

The W3C Web architecture [1] states "To benefit from and increase the value of the World Wide Web, agents should provide URIs as identifiers for resources. Other resource identification systems (see the section on future directions for identifiers) may expand the Web as we know it today. However, there are substantial costs to creating a new identification system that has the same properties as URIs."

This document provides an example justifying the need for extending the identification mechanism on the Web as well as a comparative description of what is gained and lost by using identifiers not purely based on URIs.

In the rest of this document, "EPR identification" means identifying components with a URI plus reference properties, and "URI identification" means using EPRs that contain only a URI and no reference properties.

Note:

There are discussions about the role of reference parameters in the identification of endpoints. To me, reference parameters are only about state, similar to HTTP cookies (used to carry state only) and therefore do not serve to identify things. Dave's example may clarify this.

2. Motivation for EPRs

Editorial note 
Waiting for example from Dave

3. Comparison

This comparison uses the properties from the "Architectural Properties of Key Interest" section of Dr. Fielding's thesis as the criteria for evaluating these two styles. These criteria are based on the network characteristics of the architectures. Note that the thesis specifically excludes those properties that are of interest to software architectures.

3.1 Performance

3.1.1 Advantages of URIs

Comparing URIs is cheaper than comparing EPRs, because comparing EPRs may require canonicalizing them first, and the cost of the XML C14N algorithm can be significant.
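As a rough illustration of this difference, the following Python sketch contrasts the two comparisons. It assumes plain string comparison for URIs and uses the standard library's C14N 2.0 implementation (xml.etree.ElementTree.canonicalize, Python 3.8 or later) as a stand-in for whatever canonicalization an implementation would actually apply. The ex:CustomerKey reference property and the example.org names are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same hypothetical EPR: they differ only in
# whitespace and in the order of the namespace declarations.
EPR_A = """<wsa:EndpointReference
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:ex="http://example.org/ns">
  <wsa:Address>http://example.org/orders</wsa:Address>
  <wsa:ReferenceProperties>
    <ex:CustomerKey>123</ex:CustomerKey>
  </wsa:ReferenceProperties>
</wsa:EndpointReference>"""

EPR_B = ('<wsa:EndpointReference xmlns:ex="http://example.org/ns" '
         'xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">'
         '<wsa:Address>http://example.org/orders</wsa:Address>'
         '<wsa:ReferenceProperties><ex:CustomerKey>123</ex:CustomerKey>'
         '</wsa:ReferenceProperties></wsa:EndpointReference>')


def uris_equal(a: str, b: str) -> bool:
    """URI identification: a single string comparison."""
    return a == b


def eprs_equal(a: str, b: str) -> bool:
    """EPR identification: parse and canonicalize both documents first."""
    return canonicalize(a, strip_text=True) == canonicalize(b, strip_text=True)


print(uris_equal("http://example.org/orders", "http://example.org/orders"))
print(EPR_A == EPR_B)            # the raw serializations are different strings
print(eprs_equal(EPR_A, EPR_B))  # the canonical forms are compared instead
```

Each EPR comparison here involves two full XML parses and re-serializations, whereas the URI comparison is a single string test.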

3.2 Scalability

No differences.

3.3 Simplicity

3.3.1 Advantages of URIs

It is obvious that a URI is simpler than a URI plus some XML as an identifier.

3.3.2 Advantages of EPRs

Editorial note 
This presupposes that XML is needed as an identifier for resources, and therefore that a mapping to URIs is required

Web services are based upon SOAP and XML. Many applications use XML as the mechanism for identifying their components. The binding of XML into URIs is not standardized and is potentially problematic. Some of the issues are:

  • XML uses QNames as element names, attribute names, and content. QNames are based upon absolute URIs, and embedding URIs in URIs is not simple.

  • XML elements can have multiple children at all levels, whereas a URI has a single path hierarchy that allows multiple children only at the end, in its query parameters.

  • The XML information model is complex with attributes, elements, PIs, comments, entity references and whitespace. These do not match well to URIs.

  • Character encodings are different between XML and URIs.

  • URIs have potential length restrictions.

  • URIs have different security properties than SOAP header blocks, such as level of encryption and signing.

XML applications that use XML for identification will probably be simpler to write with EPRs than with URI-only identifiers. This includes SOAP tools and XML tools.
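To make the first issue in the list above concrete, here is a minimal Python sketch of what carrying a QName-named reference property in a URI query string could look like. The "{namespace}local" spelling of the QName, the example.org names, and the choice of the query component are all assumptions made for illustration; no standard mapping exists.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

ADDRESS = "http://example.org/orders"         # hypothetical endpoint address
QNAME = "{http://example.org/ns}CustomerKey"  # hypothetical reference property name


def embed(address: str, qname: str, value: str) -> str:
    """Append one reference property to the address as a query parameter."""
    return address + "?" + urlencode({qname: value})


def extract(uri: str) -> dict:
    """Recover the percent-decoded name/value pairs from the query string."""
    return dict(parse_qsl(urlsplit(uri).query))


uri = embed(ADDRESS, QNAME, "123")
print(uri)           # the namespace URI is percent-encoded inside the outer URI
print(extract(uri))  # decoding recovers the original QName spelling
```

Even this single flat property requires escaping a URI inside a URI; nested elements, attributes, or multiple properties would need further conventions that this sketch does not attempt.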

3.4 Evolvability

Editorial note 
There is no consensus around this argument; I believe that this is an application design choice, since one can dispatch based on URIs, as all Web servers do today, and achieve the same separation of concerns and evolvability.

3.4.1 Advantages of EPRs

Separating the reference properties from the URI may make it easier for service components to evolve. A service component that works from the reference properties may know nothing about the deployment address of the service. This effectively separates the concerns of identifiers into an externally visible and evolvable part and an internally visible and evolvable part. For example, a dispatcher could evolve the format it uses for reference properties without concern for the URI-related software. The use of SOAP tools (for parsing the SOAP header for the reference properties) or XML tools (such as an XPath expression on the message) allows components to evolve separately.
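A minimal Python sketch of this separation follows, assuming a SOAP 1.2 envelope in which the reference property travels as its own header block next to wsa:To. The ex:CustomerKey property, the example.org names, and the two function names are hypothetical; the point is only that the URI-related code reads the address while the dispatcher reads the reference property.

```python
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2003/05/soap-envelope"
WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"
EX = "http://example.org/ns"  # hypothetical application namespace

MESSAGE = f"""<soap:Envelope xmlns:soap="{SOAP}" xmlns:wsa="{WSA}" xmlns:ex="{EX}">
  <soap:Header>
    <wsa:To>http://example.org/orders</wsa:To>
    <ex:CustomerKey>123</ex:CustomerKey>
  </soap:Header>
  <soap:Body/>
</soap:Envelope>"""


def deployment_address(envelope: ET.Element) -> str:
    """URI-related code: only looks at the destination address."""
    return envelope.findtext(f"{{{SOAP}}}Header/{{{WSA}}}To")


def dispatch_key(envelope: ET.Element) -> str:
    """Dispatcher: only looks at the reference property header block."""
    return envelope.findtext(f"{{{SOAP}}}Header/{{{EX}}}CustomerKey")


envelope = ET.fromstring(MESSAGE)
print(deployment_address(envelope))
print(dispatch_key(envelope))
```

In this arrangement the format of the ex:CustomerKey block could change without touching deployment_address, and vice versa.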

Note:

Some claim that reference properties provide a protocol-independent identifier, separate from the protocol-dependent URI. The specification, however, says that reference properties are protocol-dependent.

3.5 Visibility

3.5.1 Advantages of URIs

URI-only EPRs clearly offer any intermediary higher visibility into the message.

3.5.2 Advantages of EPRs

Editorial note 
This seemed to be more a security discussion than a visibility discussion; so I moved it down.

3.6 Security

3.6.1 Advantages of EPRs

URIs provide visibility into the interaction between two components, and inserting the reference property into the URI may hinder that visibility. The security desired may be at the address level, and inserting the URI serialization of the reference property may harm the ability to apply security appropriately. For example, the Address may already have query parameters that are part of the service identifier; adding the reference property as another query parameter may make parsing difficult, since the order of query parameters is not necessarily preserved. Having multiple reference properties compounds the problem.
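The parsing difficulty can be sketched in Python as follows. The service and CustomerKey parameter names and the example.org address are hypothetical; the sketch assumes the reference property has been serialized as an ordinary query parameter alongside one that is already part of the service identifier.

```python
from urllib.parse import parse_qsl, urlsplit

# Same parameters, different order: "service" is assumed to belong to the
# service identifier, "CustomerKey" to be a serialized reference property.
URI_A = "http://example.org/orders?service=billing&CustomerKey=123"
URI_B = "http://example.org/orders?CustomerKey=123&service=billing"


def query_pairs(uri: str) -> list:
    """Return the query parameters as a sorted list of (name, value) pairs."""
    return sorted(parse_qsl(urlsplit(uri).query))


print(URI_A == URI_B)                            # compares the raw strings
print(query_pairs(URI_A) == query_pairs(URI_B))  # compares the parameter sets
```

Nothing in either URI marks which parameter is the reference property, so a component applying security at the address level would need out-of-band knowledge to separate the two.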

Additionally, a service provider may not want the reference property to be visible as part of the URI. Presumably they could encrypt the reference property and then insert it into the Address field, but this brings us back to the simplicity argument and to inserting XML into URIs.

However, the kind of information that one would want to secure seems to fall into the reference parameters category rather than the reference properties one. Moreover, one could use two identifiers rather than splitting one identifier in two, and a portion of a URI could also be encrypted.

3.7 Reusability

3.7.1 Advantages of URIs

URIs are ubiquitous on the Web and in Web technologies.

Using URIs to identify the destination endpoint of a message has the following advantages:

  1. It gives access to a wide range of Web technologies (e.g. XForms, RDF, etc.) without having to redefine anything.

  2. An endpoint destination can still be pasted easily in an email or put on a billboard.

In other words, it allows reuse outside the set of specifications that were built on top of Web Services Addressing.

3.8 Real-World

It is useful to examine not only theoretical architectural properties but also real-world deployed architectures.

3.8.1 Advantages of EPRs

Editorial note 
I believe that this is talking about cookies as reference properties, which is not the most common use of cookies IMO.

A significant portion of the Web is deployed with stateful web components that use HTTP Cookies to contain session or state identifying information. For a variety of reasons, typically those detailed previously, application developers have chosen to use HTTP Cookies to contain identifying information in addition to URIs.

Note:

It is worth noting that this real-world practice goes against the Web architecture as published by the TAG, since the TAG calls for URIs to identify resources.

3.8.2 Advantages of URIs

The subset of the Web that is "on the Web", that is, whose resources have a dereferenceable URI, is clearly widely scalable, deployed, etc.

4. Conclusion

Not using URIs takes you out of the Web. The advantages of reference properties do not seem to outweigh the cost of introducing them, and there are other ways to accomplish their goal.