Re: i0001: EPRs as identifiers





I appreciate very much Dave's proposal and the advantage of having his Web
architecture expertise to help us navigate these murky waters. However,
before we approach the TAG or do anything drastic like that, I think I need
to explain my qualms about the way we seem to be approaching this issue; I
have the feeling I am in a minority in this, but I fear an important
perspective may be lost in this discussion.

First, we need to admit that the specification contains specific language
to prompt this concern and the current resolution approach. In particular,
Section 2.1 in the core document draft talks about using reference
properties to "identify" the entity being conveyed. However, I claim that
this language is misleading and not consistent with the spirit of the spec;
moreover, if we take it to its final conclusion important use cases will
not be appropriately supported.

My central point is that the idea that EPRs identify service endpoints is
wrong and potentially dangerous. There is a key difference between
identifiers and addresses. The Web architecture document correctly states
that different identifiers (URIs) should not be associated with the same
resource; while aliases are not prohibited, they do undermine the
usefulness of URIs as identifiers and impose unnecessary cost on URI
consumer applications. The notion of identifiers is essentially at odds
with having multiple identifiers for the same entity.

However, in the case of EPRs the ability to issue multiple different ones
for the same endpoint is a fundamental requirement. There are situations
where multiple access channels to the endpoint are provided but they need
to be selectively exposed to different clients; the hosting infrastructure
may issue different EPRs to different clients such that a client
application would be unable to tell whether they correspond to the
same resource or not. To abuse the snail mail metaphor a bit more, the
address that is encoded in a letter sent to me cannot be used to identify
me in the way my social security number can, but is useful if you want to
get your letters to me. I have several mail addresses but a single social
security number. The notion of address assumes that the same entity may
have many, in contrast with the notion of identity (as the Web Arch.
document recognizes).

A "wire-centric" interpretation of EPRs, I think, is much more consistent
with the spirit of Web services than this identity based approach. An EPR
encapsulates the information that must be conveyed in a message envelope to
ensure that it can be properly delivered to the endpoint. This is clearly
not the same as providing an identifier for the resource, but is essential
to ensure interoperable access to endpoints.  Regardless of how
inconsistent the language in the current spec is today, we need to make it
clear that WS-Addressing is not about identifying resources but only about
providing the means to interoperably direct (address) messages to them.
Moreover, I think we should state that the identification of endpoints is
not within the scope of this WG. We just need to make these points clear in
the spec and let the TAG rest at ease knowing we are not trying to break
their carefully constructed Web architecture.

Paco







"David Orchard" <dorchard@bea.com>
Sent by: public-ws-addressing-request@w3.org
11/13/2004 01:13 AM

To:       <public-ws-addressing@w3.org>
cc:
Subject:  i0001: EPRs as identifiers




I offer a proposal for what the issue is, and a solution.

Issue:

WS-Addressing EPRs specify a resource identification mechanism, called
reference properties and reference parameters, in addition to URIs for
identification purposes.  There is not a clear justification of the
benefits of such an additional resource identification mechanism.

The W3C Web architecture [1] states “To benefit from and increase the value
of the World Wide Web, agents should provide URIs as identifiers for
resources.  Other resource identification systems (see the section on
future directions for identifiers) may expand the Web as we know it today.
However, there are substantial costs to creating a new identification
system that has the same properties as URIs.”

The W3C TAG was asked the question about when to use GET for retrieving
resource representations and indirectly about when URIs should be provided
for resources in Issue #7 [2] and produced a finding [3].  Some of the
finding material is included in the Web arch document.  The Web
architecture is clear that there are substantial costs associated with
resource identification systems other than URIs and the implication is that
the benefits to such additional systems should be substantial.

The URI specification [4] provides a definition of a resource “A resource
can be anything that has identity.”  Thus we do not need to determine
whether an EPR identifies a resource or not, but whether an EPR is used as
an identifier.

The WS-Addressing member submission [5] is fairly clear that EPRs are used
for identification purposes.  Some sample quotes used in the document:
“Dynamic generation and customization of service endpoint descriptions.”
“Identification and description of specific service instances”
“we define a lightweight and extensible mechanism to dynamically identify
and describe service endpoints and instances”
“Specific instances of a stateful service need to be identified”
“A reference may contain a number of individual properties that are
required to identify the entity or resource being conveyed”

A tell-tale sign of identifiers is comparisons of identifiers.  The URI
specification provides rules for URI comparison.  The WS-Addressing
submission provides rules for EPR comparison.

Note that the Web architecture does not discuss stateful versus stateless
services or interactions, nor does it discuss the use of HTTP Cookies to
contain state or state identifier information.  To a certain degree, the
comparison of URIs vs EPRs may be thought of as a comparison of URIs + HTTP
cookies vs EPRs.  However, this comparison will describe EPRs with
reference properties without discussion of HTTP cookies, but the similarity
of the benefits of HTTP cookies and EPR reference properties is captured in
the EPR benefits.

[1] http://www.w3.org/TR/webarch/#uri-benefits

[2] http://www.w3.org/2001/tag/issues.html#whenToUseGet-7

[3] http://www.w3.org/2001/tag/doc/whenToUseGet.html

[4] http://www.ietf.org/rfc/rfc2396

[5] http://www.w3.org/Submission/2004/SUBM-ws-addressing-20040810/



Resolution:
The WS-Addressing WG will provide material, TBD format such as standalone
or primer or …, that shows the benefits to be gained from WS-Addressing
reference properties and parameters.  It includes a comparison with URI
only solutions.

Sample applications and benefits
A sample application is introduced with a skeletal display of use of URIs
and EPRs.

Sample application #1: Stateful Web service client.

A stateful service acting as a client makes a request to another service.
The request contains a ReplyTo header carrying an EPR.  The invoked service
responds with the requested information, following the WS-Addressing EPR
processing model.

Variation #1: Reference properties
Client->Service:request
<s:Header>
  <wsa:ReplyTo>
    <wsa:EndpointReference>
      <wsa:Address>http://www.fabrikam123.example/acct</wsa:Address>
      <wsa:ReferenceProperties>
        <fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>
      </wsa:ReferenceProperties>
    </wsa:EndpointReference>
  </wsa:ReplyTo>
</s:Header>

Service->Client Callback

<S:Header>
    <wsa:To>http://www.fabrikam123.example/acct</wsa:To>
    <fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>
</S:Header>
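
As an illustration only, the mapping from the ReplyTo EPR to the callback
headers in Variation #1 could be sketched in Python as below. The wsa
namespace is that of the August 2004 member submission; the fabrikam
namespace URI and the function name callback_headers are assumptions made
for this sketch, since the sample above declares neither.

```python
import copy
import xml.etree.ElementTree as ET

# Namespace of the August 2004 WS-Addressing member submission.
WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"
# Hypothetical namespace for the fabrikam prefix (not given in the sample).
FABRIKAM = "http://www.fabrikam123.example/ns"

EPR_XML = f"""
<wsa:EndpointReference xmlns:wsa="{WSA}" xmlns:fabrikam="{FABRIKAM}">
  <wsa:Address>http://www.fabrikam123.example/acct</wsa:Address>
  <wsa:ReferenceProperties>
    <fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>
  </wsa:ReferenceProperties>
</wsa:EndpointReference>
"""


def callback_headers(epr):
    """Build the header blocks for a message addressed to the given EPR.

    The Address becomes wsa:To, and each child of ReferenceProperties is
    copied unchanged as a header block of its own.
    """
    to = ET.Element(f"{{{WSA}}}To")
    to.text = epr.findtext(f"{{{WSA}}}Address")
    headers = [to]
    props = epr.find(f"{{{WSA}}}ReferenceProperties")
    if props is not None:
        headers.extend(copy.deepcopy(child) for child in props)
    return headers


if __name__ == "__main__":
    for header in callback_headers(ET.fromstring(EPR_XML.strip())):
        print(ET.tostring(header, encoding="unicode"))
```

Note that the sender never interprets CustomerKey; it only copies it.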

Variation #2a: Address only with full featured QName to URI mapping

This takes the fabrikam:CustomerKey QName and content and incorporates it
into the Address using an extension of QName/URI binding style #10 in [1].
Client->Service:request
<s:Header>
  <wsa:ReplyTo>
    <wsa:EndpointReference>
      <wsa:Address>http://www.fabrikam123.example/acct?(fabrikamns)CustomerKey=123456789</wsa:Address>
    </wsa:EndpointReference>
  </wsa:ReplyTo>
</s:Header>

Service->Client Callback

<S:Header>
  <wsa:To>http://www.fabrikam123.example/acct?(fabrikamns)CustomerKey=123456789</wsa:To>
</S:Header>
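
A rough Python sketch of how a reference property might be folded into the
Address as a query parameter follows. It only mirrors the shape of the
address shown above and is not claimed to reproduce binding style #10; the
function name bind_ref_property, the percent-encoding choices and the
fabrikam namespace URI are all assumptions.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def bind_ref_property(address, namespace, local_name, value):
    """Append a "(namespace)localName=value" query parameter to address.

    Parentheses are kept literal; everything else that is not URI-safe,
    including the ":" and "/" of a namespace URI, is percent-encoded.
    """
    key = quote(f"({namespace}){local_name}", safe="()")
    separator = "&" if urlsplit(address).query else "?"
    return f"{address}{separator}{key}={quote(value, safe='')}"


if __name__ == "__main__":
    # Hypothetical namespace URI standing in for "fabrikamns".
    uri = bind_ref_property(
        "http://www.fabrikam123.example/acct",
        "http://www.fabrikam123.example/ns",
        "CustomerKey",
        "123456789",
    )
    print(uri)
    # The receiver can recover the QName and the value from the query.
    print(parse_qsl(urlsplit(uri).query))
```

With a real namespace URI in place of the fabrikamns placeholder, the URI
inside the URI has to be escaped, which is part of why this binding is not
as simple as it first looks.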

Variation #2b: Simple Address
The Address may be significantly simpler, such as
  <wsa:Address>http://www.fabrikam123.example/acct/123456789</wsa:Address>


Comparison of Variations.

This comparison uses the "Architectural Properties of Key Interest" section
of Dr. Fielding’s thesis [2] as the criteria for evaluating these two
styles.  The criteria are based upon the network characteristics of the
architectures.  Note that the thesis specifically excludes those properties
that are of interest only to software architectures.

Performance
Advantage of URIs
Comparing URIs is simpler than comparing EPRs because the cost of
canonicalizing EPRs can be significant given the XML C14N algorithm.
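
To make the cost difference concrete, here is a small Python sketch
(Python 3.8 or later, for xml.etree.ElementTree.canonicalize). It is a
simplification: it canonicalizes the whole EPR with C14N 2.0 and trims
whitespace, which is not claimed to be the submission's exact comparison
rule. The fabrikam namespace URI and the function names are assumptions.

```python
import xml.etree.ElementTree as ET

WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"
FABRIKAM = "http://www.fabrikam123.example/ns"  # hypothetical namespace


def same_uri(a, b):
    """URI-only identifiers: a plain string comparison (no normalization)."""
    return a == b


def same_epr(a, b):
    """EPRs: both documents must be parsed and canonicalized first."""
    return (ET.canonicalize(a, strip_text=True)
            == ET.canonicalize(b, strip_text=True))


# Same EPR, serialized with different whitespace, quoting and
# namespace declaration order.
EPR_A = f"""<wsa:EndpointReference xmlns:wsa="{WSA}" xmlns:fabrikam="{FABRIKAM}">
  <wsa:Address>http://www.fabrikam123.example/acct</wsa:Address>
  <wsa:ReferenceProperties>
    <fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>
  </wsa:ReferenceProperties>
</wsa:EndpointReference>"""

EPR_B = (
    f"<wsa:EndpointReference xmlns:fabrikam='{FABRIKAM}' xmlns:wsa='{WSA}'>"
    "<wsa:Address>http://www.fabrikam123.example/acct</wsa:Address>"
    "<wsa:ReferenceProperties>"
    "<fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>"
    "</wsa:ReferenceProperties>"
    "</wsa:EndpointReference>"
)

# A different customer key, otherwise identical to EPR_B.
EPR_C = EPR_B.replace("123456789", "987654321")

if __name__ == "__main__":
    print(same_uri("http://www.fabrikam123.example/acct/123456789",
                   "http://www.fabrikam123.example/acct/123456789"))
    print(EPR_A == EPR_B, same_epr(EPR_A, EPR_B), same_epr(EPR_B, EPR_C))
```
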


Scalability
No difference.  Both styles support stateful and stateless interactions.


Simplicity

Advantages of EPRs
Web services are based upon XML.  Many applications use XML as the
mechanism for identifying their components.  The binding of XML into URIs
is not standardized and potentially problematic, some of the issues being:
- XML contains QNames as element names, attribute names, and content.
QNames are based upon absolute URIs.  Embedding URIs in URIs is not simple.
- XML elements can have multiple children at all levels, whereas URIs have
a path hierarchy that allows multiple children only in the trailing query
parameters.
- The XML information model is complex with attributes, elements, PIs,
comments, entity references and whitespace.  These do not match well to
URIs.
- Character encodings are different between XML and URIs.
- URIs have potential length restrictions.
- URIs have different security properties than SOAP header blocks, such as
level of encryption and signing.

XML applications that use XML for identification will probably be simpler
to write with EPRs than with URI only identifiers.  This includes SOAP
tools and XML tools.

Advantages of URIs
In many cases, the complexity of XML is not needed for the identifier, as
shown in example 2b.  This enables the web service to be "on the Web" as an
HTTP GET can be used to retrieve a representation of the state of the
resource.  This also enables much of the web infrastructure to operate,
such as caching intermediaries, security firewalls, etc.  This can lead to
systems that are easier to debug (a web browser can retrieve the state for
a human).


Evolvability
Advantage of EPRs
Separating the reference property from the URI may make it easier for
service components to evolve.  A service component that works only from
the reference properties need know nothing about the deployment address of
the service.  This effectively separates the concerns of identifiers into
those that are externally visible and evolvable and those that are
internally visible and evolvable.  For example, a dispatcher could evolve
the format it uses for reference properties without concern for the URI
related software.  The use of SOAP tools - for parsing the SOAP header for
the reference properties - or XML tools - such as an XPath expression on
the message - allows separate evolvability of components.
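
As a hedged illustration of the XML-tools route, the Python sketch below
pulls the reference property out of the callback message with a path
expression and never looks at wsa:To. The SOAP 1.2 envelope namespace, the
fabrikam namespace URI and the function name customer_key are assumptions;
the samples in this message do not declare their prefixes.

```python
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2003/05/soap-envelope"  # assumed: SOAP 1.2
WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"
FABRIKAM = "http://www.fabrikam123.example/ns"  # hypothetical namespace

# Prefix map used by the path expression below.
NS = {"S": SOAP, "wsa": WSA, "fabrikam": FABRIKAM}

ENVELOPE = f"""
<S:Envelope xmlns:S="{SOAP}" xmlns:wsa="{WSA}" xmlns:fabrikam="{FABRIKAM}">
  <S:Header>
    <wsa:To>http://www.fabrikam123.example/acct</wsa:To>
    <fabrikam:CustomerKey>123456789</fabrikam:CustomerKey>
  </S:Header>
  <S:Body/>
</S:Envelope>
"""


def customer_key(envelope_xml):
    """Return the CustomerKey header value, or None if it is absent.

    The dispatcher depends only on the reference property, so the
    deployment address in wsa:To can change without touching this code.
    """
    root = ET.fromstring(envelope_xml.strip())
    return root.findtext("S:Header/fabrikam:CustomerKey", namespaces=NS)


if __name__ == "__main__":
    print(customer_key(ENVELOPE))
```
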

Advantage of URIs
No advantage to URIs for evolvability.

Visibility
Advantage of EPRs
URIs provide for visibility into the interaction between two components.
There are scenarios that indicate visibility into the reference property is
not necessary.  Inserting the reference property may hinder visibility.
The security desired may be at the address level, and inserting the URI
serialization of the ref property may harm the ability to appropriately
apply security.  For example, the Address may already have query parameters
that are part of the service identifier, and adding the reference property
as a query parameter may make parsing difficult, as the order of query
parameters is not necessarily preserved.  Multiple reference properties
potentially compound the problem.

Additionally, a service provider may not want the reference property to be
visible as part of the URI.  Presumably they could encrypt the reference
property and then insert it into the Address field, but this leads us back
to the simplicity argument and inserting XML into URIs.

Advantage of URIs
URI only EPRs offer clearly higher visibility into the message for any
intermediary.

Security
Advantage of EPRs
Dr. Fielding’s thesis does not directly address security.  One potential
aspect of security is “guessing” at endpoints.  Encrypting the reference
property does not cover signing a reference property.  A reference property
might be encrypted and signed by a service provider using the OASIS
WS-Security standard.


Real-World
Advantage of EPRs
It is useful to examine not only theoretical architectural properties but
real-world deployed architectures.  A significant portion of the Web is
deployed with stateful web components that use HTTP Cookies to contain
session or state identifying information.  For a variety of reasons,
typically those detailed previously, application developers have chosen to
use HTTP Cookies to contain identifying information in addition to URIs.

Additionally, a variety of efforts have been undertaken to facilitate
mapping of XML and QNames to URIs, such as WSDL 2.0 HTTP Binding.  There
does not appear to be substantial product group interest in these
technologies.

Advantage of URIs
The subset of the Web that is "on the Web", that is, has a URI that is
dereferenceable, is clearly widely scalable, deployed, etc.

Conclusion
This has shown that the choice of EPRs with Reference Properties versus
EPRs without reference properties is a complex choice best left to the
application developer.  As they have the choice with a Web of HTTP URIs and
HTTP Cookies today, WS-Addressing gives Web service application developers
the choice of their identifier architecture.  They can use URI only EPRs
and they can use URIs + XML based reference properties and parameters.

[1] http://www.pacificspirit.com/blog/2004/04/29/binding_qnames_to_uris

[2] http://www.ics.uci.edu/~fielding/pubs/dissertation/net_app_arch.htm#sec_2_3

Received on Saturday, 13 November 2004 17:35:04 UTC