Re: i001: EPRs as identifiers: update to EPR pros and cons by adding ref prop scenario from David Booth on 2004-12-07 (public-ws-addressing@w3.org from December 2004)

From: David Booth <dbooth@w3.org>
Date: Tue, 07 Dec 2004 04:21:59 -0500
To: David Orchard <dorchard@bea.com>
Cc: public-ws-addressing@w3.org
Message-Id: <1102411288.3465.294.camel@nc6000.w3.org>

DaveO,

Thanks very much for the new example involving Reference *Properties*. 
This is very helpful.  First, some overall comments.

1. It is interesting to note that the example you added does not benefit
from *any* of the advantages that you attribute to Reference
Properties.  I imagine this is either because you deliberately kept it
simple, or because you deliberately selected an example that could be
demonstrated equally well both ways -- using URIs+RefProps or using URIs
alone.  In any case, I still have not seen any realistic examples that
demonstrate actual advantages of using URI+RefProps instead of URIs
alone as Web resource addresses.  I don't expect you to come up with
such examples, because I think you've already gone beyond the call of
duty in championing this issue, but it sure would be nice if someone
would.

2. A general editorial suggestion: The comparison often mentions EPRs,
but since the issue really is about the use of URI+RefProps -- not about
EPRs as a whole -- I think it would be clearer to replace "EPR" with
"URI+RefProps" (or something similar) in most instances.

Other detailed comments on your comparison are inline below.  (I
previously commented on your first example, but avoided making detailed
comments about your analysis, because I assumed that the analysis would
change when you updated the example.  But I guess now is the time for me
to make these detailed comments.)

On Fri, 2004-12-03 at 12:11, David Orchard wrote: 
> I offer a proposal for what the issue is, and a solution . . . .

The first few parts look great to me, up until the following paragraph.

> Note that the Web architecture does not discuss stateful versus
> stateless services or interactions, nor does it discuss the use of
> HTTP Cookies to contain state or state identifier information.  To a
> certain degree, the comparison of URIs vs EPRs may be thought of as a
> comparison of URIs + HTTP cookies vs EPRs.  However, this comparison
> will describe EPRs with reference properties without discussion of
> HTTP cookies, but the similarity of the benefits of HTTP cookies and
> EPR reference properties is captured in the EPR benefits.  

I am now guessing that you are intending to refer to a use of cookies in
which an application proceeds through several steps and the cookie
indicates the current step.  In essence, it is as though the cookie
indicates what page the user is on, i.e., the cookie is analogous to the
path part of a URI.  This use would indeed correspond to Reference
Properties.  However, since the most common use of cookies -- to
indicate someone's CustomerID or insert their name in a generic page --
is much more comparable to Reference *Parameters*, I think it would be
helpful to either delete this comparison with cookies, or more fully
explain the use of cookies that you intended.

> . . .
> Sample applications and benefits
> 
> Two sample applications are introduced with a skeletal display of use of
> URIs and EPRs.  
>  
> 
> Sample application #1: Stateful Web service client
> . . . .

I think this first example can now be deleted, since the new example is
much clearer.  (The old one does not adequately differentiate between
RefProps and RefParams.)

> Scenario #2: Reference Properties + Parameters

Great!  This new example is very helpful.  However, one little issue . . .

> . . .
> <wsa:EndpointReference>
>  <wsa:Address>http://example.org/B/tx:Id/1234</wsa:Address>
> </wsa:EndpointReference>
> . . .

I think it would be a better, more direct comparison to move only the
Reference *Properties* into the URI (as Hugo suggested[1]), and leave
the Reference Parameters where they were (as Reference Parameters).  In
other words, it is easier to understand the difference between the two
approaches if only one variable is changed at a time.
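
To illustrate changing only one variable, here is a minimal Python sketch
(not part of DaveO's example).  It assumes that tx:Id is the Reference
Property in the scenario; the ex:Priority Reference Parameter and the
helper name fold_ref_props_into_uri are hypothetical, invented only for
illustration.

```python
from urllib.parse import quote


def fold_ref_props_into_uri(address, ref_props):
    """Append each (name, value) Reference Property as two path segments."""
    segments = []
    for name, value in ref_props:
        segments.append(quote(name, safe=":"))  # keep the QName colon readable
        segments.append(quote(value, safe=""))
    if not segments:
        return address
    return address.rstrip("/") + "/" + "/".join(segments)


# URI+RefProps form (simplified to a dict instead of XML).
epr = {
    "address": "http://example.org/B",
    "reference_properties": [("tx:Id", "1234")],         # assumed RefProp
    "reference_parameters": [("ex:Priority", "high")],   # hypothetical RefParam
}

# URI-only form: only the Reference Properties move into the address;
# the Reference Parameters stay where they were.
uri_only = {
    "address": fold_ref_props_into_uri(epr["address"], epr["reference_properties"]),
    "reference_properties": [],
    "reference_parameters": epr["reference_parameters"],
}
```

Under these assumptions the resulting address is the same URI shown in
the quoted example, and the Reference Parameters are untouched.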

Now on to your analysis . . .

> Comparison of Variations.
> 
> This comparison uses the Architecture properties of Key interest
> section from Dr. Fielding’s thesis [2] as the criteria for evaluating
> these 2 styles.  This is based upon the network characteristics of the
> architectures.  Note that the thesis specifically excludes those
> properties that are of interest to software architectures.

This seems very reasonable.  However, as Hugo noted, you omitted the
"Reusability" criterion.  I think it would be good to include that one
also, especially since that's a fairly important differentiator of URIs
over URI+RefProps: URIs are far more reusable.

> 
> Performance
> 
> Advantage of URIs
> 
> Comparing URIs is simpler than comparing EPRs because the cost of
> canonicalizing EPRs can be significant given the XML C14N algorithm.

Good.
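
A rough sketch of the difference in comparison cost, in Python.  Note
the assumptions: the namespace URIs and the attributes on tx:Id are
placeholders, and Python's standard library implements C14N 2.0, which
may not be the exact canonicalization variant the comparison has in
mind.  URI normalization is ignored here.

```python
import xml.etree.ElementTree as ET


def uris_equal(a, b):
    """URI-only identifiers: a plain string comparison."""
    return a == b


def eprs_equal(xml_a, xml_b):
    """URI+RefProps identifiers: canonicalize both XML forms, then compare."""
    canon_a = ET.canonicalize(xml_a, strip_text=True)
    canon_b = ET.canonicalize(xml_b, strip_text=True)
    return canon_a == canon_b


# Two serializations of the same (hypothetical) EPR that differ only in
# whitespace, attribute order and quote style.
xml_a = (
    '<wsa:EndpointReference xmlns:wsa="urn:example:wsa" xmlns:tx="urn:example:tx">'
    '<wsa:Address>http://example.org/B</wsa:Address>'
    '<wsa:ReferenceProperties>'
    '<tx:Id scope="global" kind="tx">1234</tx:Id>'
    '</wsa:ReferenceProperties>'
    '</wsa:EndpointReference>'
)
xml_b = """<wsa:EndpointReference xmlns:wsa="urn:example:wsa" xmlns:tx="urn:example:tx">
  <wsa:Address>http://example.org/B</wsa:Address>
  <wsa:ReferenceProperties>
    <tx:Id kind='tx' scope='global'>1234</tx:Id>
  </wsa:ReferenceProperties>
</wsa:EndpointReference>"""
```

The two XML strings are not byte-equal, so the EPR comparison has to
parse and canonicalize both before it can say they match, whereas the
URI comparison needs no XML processing at all.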

> Scalability
> 
> No difference.  Both styles support stateful and stateless
> interactions.

If you limit your view to Web services that wish to use WS-Addressing,
then I would agree.  But if you look at scalability in a larger sense --
spanning Web technologies -- then the greater simplicity and
universality of URIs as Web resource identifiers clearly gives them a
scalability advantage, because URIs can be used not only as Web resource
identifiers *within* WS applications that use WS-Addressing, but also
*across* Web technologies.   Otherwise an application that spans Web
technologies would have to understand not only URIs but URI+RefProps. 
(And if a third kind of identifier is later invented, it would have to
understand that one also, and so on.)

> Simplicity
>
> Advantages of EPRs
> 
> Web services are based upon XML.  Many applications use XML as the
> mechanisms for identifying their components.  The binding of XML into
> URIs is not standardized and potentially problematic, . . . .

That's a red herring, because it is very unlikely that a mapping from
XML to URIs would be needed.  Applications that use URIs as Web resource
identifiers generally choose their application identifiers to facilitate
reasonable transcription as URIs.  They don't make up arbitrary XML
identifiers and then try to convert them to equivalent URIs using hairy
mapping algorithms.

Therefore, I suggest changing the above to:
[[
Advantages of URI+RefProps

Since Web services are based on XML, some applications may find the use
of XML structures marginally more convenient than URI paths as
identifiers.
]]

>  
> Advantages of URIs
> 
> In many cases, the complexity of XML is not needed for the identifier,
> as shown in example 2b.  

True, but I don't think this needs to be stated here, because this
section is supposed to be about the simplicity advantages of URIs.

> This enables the web service to be "on the Web" as an HTTP GET can be
> used to retrieve a representation of the state of the resource.  This
> also enables much of the web infrastructure to operate, such as
> caching intermediaries, security firewalls, etc.  This can lead to
> easier to debug systems (a web browser can retrieve the state for a
> human).  

True, but that benefit is derived from using the same kind of Web
resource identifier as the rest of the Web uses -- not from the greater
simplicity of URIs.

I suggest changing the above comparison to simply:
[[
Advantages of URIs

URIs are clearly simpler than URIs+RefProps.
]]

> Visibility
> 
> Advantage of EPRs
> 
> EPRs provide 2 different visibility points, URIs and soap headers.  It
> may be advantageous to separate the visibility into 2 different types
> of software.  Looking at the RefP scenarios, the scope of visibility
> may be just a single URI for the transaction coordinator, and then
> there is another layer of security that is transaction specific.  This
> is embodied in the transaction software.  Providing two layers of
> security and separating these into 2 different extension points may be
> simpler and more appropriate than using a single extension point.  

This is not an advantage of EPRs, since the same thing can be achieved
by partitioning the URI path, as pointed out previously in this debate.
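
As a minimal sketch of what I mean by partitioning the URI path (in
Python; the function name partition_uri and the choice of one leading
segment for the coordinator are assumptions made for illustration):

```python
from urllib.parse import urlsplit


def partition_uri(uri, service_segments=1):
    """Split a URI into an outer service URI and the inner path segments.

    The outer part is all that a generic layer needs to look at; the
    inner segments are left for the transaction-specific software.
    """
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    service_path = "/" + "/".join(segments[:service_segments])
    service_uri = f"{parts.scheme}://{parts.netloc}{service_path}"
    inner_segments = segments[service_segments:]
    return service_uri, inner_segments


service_uri, inner_segments = partition_uri("http://example.org/B/tx:Id/1234")
# service_uri    -> "http://example.org/B"   (seen by the outer layer)
# inner_segments -> ["tx:Id", "1234"]        (seen by the transaction layer)
```

The two layers can then apply their own checks to their own part of the
path, which gives the same two visibility points without RefProps.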

> Additionally, a service provider may not want for the reference
> property to be visible as part of the URI.  

That's not a visibility advantage, that's a security argument.  

> Presumably they could encrypt the reference property and then insert
> into the Address field, 

Again, the same thing can be done by partitioning the URI path.

> but this leaves us back to the simplicity argument and inserting XML
> into URIs.

As pointed out above, arbitrary XML would not be inserted into URIs if
URIs were used as Web resource identifiers, so I don't see the
applicability of this statement.

Therefore, to account for these points, I suggest changing the above
comparison to simply:
[[
Visibility

Advantage of URI+RefProps

None.
]]
 
> Advantage of URIs
> 
> URI only EPRs offer higher visibility for URI-only software into the
> message for any intermediary.

The phrase "for URI-only software" makes this statement sound biased. 
URIs offer higher visibility *because* they do not require specialized
knowledge of WS-Addressing in order to understand what Web resource is
being accessed.  Thus, I think a more neutral phrasing would be:
[[
Advantage of URIs

URIs offer higher visibility into the message for any intermediary.
]]

>  
> Evolvability
> 
> Advantage of EPRs
> 
> Separating the reference property from the URI may make it easier for
> service components to evolve.  A service component may know nothing
> about the deployment address of the service from the reference
> properties.  This effectively separates the concerns of identifiers
> into externally visible and evolvable from the internally visible and
> evolvable.  For example, a dispatcher could evolve the format it uses
> for reference properties without concern of the URI related software. 
> The use of SOAP tools - for parsing the soap header for the reference
> properties - or xml tools - such as an xpath expression on the message
> - allow separate evolvability of components.  

As pointed out in previous posts, the same thing can be done with URIs
by partitioning the URI path into two parts, so I don't think this is a
valid advantage.  Therefore I suggest changing this to:
[[
Evolvability

Advantages of URI+RefProps

None.
]]

> 
> Advantage of URIs
> 
> No advantage to URIs for evolvability.  

Actually, since URI+RefProps depends both on URIs and on XML,
applications that use URI+RefProps as Web resource identifiers will be
locked into the idiosyncrasies of *both* URIs *and* XML, thus making
them marginally less evolvable than if URIs alone were used.  This
probably is not a substantial difference but it is worth acknowledging
in passing.  Thus, I suggest rewording this comparison as:

[[
Advantage of URIs

No significant advantage to URIs for evolvability.
]]

> 
> Security
> 
> Advantage of EPRs
> 
> Dr. Fielding’s thesis does not directly address security.  One
> potential aspect of security is “guessing” at endpoints.  Encrypting
> the reference property does not cover signing a reference property.  A
> reference property might be encrypted and signed by a service provider
> using the OASIS WS-Security standard 

First, it is worth noting that this is an unusual application of
security that involves hiding part of the identity of the actual
recipient.  Far more common is to hide the *content* of the message,
while permitting the recipient to be known.

Second, if you *do* want to hide the identity of the recipient, a far
more natural and secure way to do it would be to encrypt and wrap the
entire message inside the body of another message.  (In other words, if
I want to send a message to Barry (via Alice) without revealing Barry's
identity or anything else about him, I send a sealed envelope to Alice. 
Alice unseals it, and inside is another envelope addressed to Barry.) 
This approach will hide the entire recipient identity (rather than just
a portion of it), in addition to hiding any other headers that were
targeted to the final recipient.  

Finally, even if you only wanted to secure a part of the identity (as
encrypting and signing the RefProps would do), the exact same thing
could be achieved by encrypting and signing a portion of the URI path. 
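
Here is a rough Python sketch of the signing half of that idea, using an
HMAC over one path segment.  Everything in it is an assumption made for
illustration: the key, the "value.tag" segment format and the function
names are hypothetical, and this is not WS-Security.  Encrypting the
segment as well would need a cryptography library that is not in the
Python standard library, so it is left out.

```python
import hashlib
import hmac

# Hypothetical shared secret held by the service provider.
SECRET = b"demo-key-not-for-production"


def sign_segment(value):
    """Return 'value.tag', where tag is an HMAC-SHA256 over the value."""
    tag = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{tag}"


def verify_segment(signed):
    """Return the original value if the tag checks out, otherwise None."""
    value, _, tag = signed.rpartition(".")
    expected = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return value
    return None


# Only the last path segment is protected; the rest of the URI is as before.
signed_uri = "http://example.org/B/tx:Id/" + sign_segment("1234")
```

A receiver that holds the key can check the final segment and reject a
URI whose identifier was guessed or altered.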

Thus, I don't think this is a valid advantage of URI+RefProps.

> 
> Real-World
> 
> Advantage of EPRs
> 
> It is useful to examine not only theoretical architectures properties
> but real-world deployed architectures.  A significant portion of the
> Web is deployed with stateful web components that use HTTP Cookies to
> contain session or state identifying information.  For a variety of
> reasons, typically those detailed previously, application developers
> have chosen to use HTTP Cookies to contain identifying information in
> addition to URIs.  

I don't buy this argument.  This argument essentially boils down to:
"It's common practice, therefore we should sanction it".  Spam is common
practice too, but that doesn't mean we should sanction spam.

> 
> Additionally, a variety of efforts have been undertaken to facilitate
> mapping of XML and QNames to URIs, such as WSDL 2.0 HTTP Binding. 
> There does not appear to be substantial product group interest in
> these technologies.  

As explained above, applications that use URIs as Web resource
identifiers are not likely to need such a generalized mapping, so this
point is moot.

> 
> Advantage of URIs
> 
> The subset of the Web that is "on the Web", that is has a URI that is
> dereferenceable, is clearly widely scalable, deployed, etc.

Okay.

> 
> Conclusion
> 
> This has shown that the choice of EPRs with Reference Properties
> versus EPRs without reference properties is a complex choice and there
> are pros and cons to both styles.  As they have the choice with a Web
> of HTTP URIs and HTTP Cookies today, WS-Addressing with ReferencePs
> gives Web service application developers the choice of their
> identifier architecture.  They can use URI only EPRs and they can use
> URIs + XML based reference properties and parameters.  

Obviously if you believe the points I made above, this conclusion does
not follow from the evidence presented.  Since any conclusion is
inherently a value judgement, I imagine the WG will have to debate what
it should say.  However, to throw in my two cents, I think something
like the following would be more in line with the evidence that I have
seen so far:
[[
Conclusion

This comparison has shown that although there may be some marginal
advantage to using URI+RefProps in some cases, it is clearly possible to
achieve a substantially equivalent result using URIs alone. 
Furthermore, there are substantial scalability, visibility and
reusability advantages in using URIs alone.
]]


References
[1] http://lists.w3.org/Archives/Public/public-ws-addressing/2004Dec/0051.html


-- 

David Booth
W3C Fellow / Hewlett-Packard
Received on Tuesday, 7 December 2004 09:22:14 UTC